
Power Spectral Density computation (Spectral Analysis)
Page 4 of PD001A/B User Guide

One way to compare S' with S is to compare the outline, usually called the envelope, of S' with the envelope of S. Figure 2-29 shows part of S in Figure 2-19 and its upper-side envelope. The expression for this upper-side envelope follows from equation (7).

Figure 2-30 shows the upper-side envelopes of S (red solid line), S' by Method 1 (blue solid line) and S' by Method 2 (blue dashed line). The upper three graphs are the results for the noiseless data S, and the lower three graphs are the results for the noise-added data. Only a blue solid line is visible in the uppermost graph, but this is because S and S' are almost identical in this case.

These graphs suggest that (a) adding noise of this magnitude does not affect the accuracy of the reproduced signal (S') very much, (b) the accuracy of the reproduced signal is best in the middle of the data and degrades near both ends, and (c) a mismatch between the actual signal frequencies and the frequencies of the signal bins causes a relatively large error in signal reproduction.

The reason for (b) is as follows. f1 and f2 in equation (11) of S' differ from those of S if the frequencies of the signal bins differ from the actual signal frequencies. Then, if the envelope of S' matches the envelope of S fairly well at position X, the discrepancies between them tend to grow as we move away from X. In our case, X is located at the midpoint of the data used for the amplitude spectrum computation (the portion of the time series to the left of the vertical black lines). Noise is not the direct cause of this phenomenon.

Figure 2-32 shows how the amplitude and phase of the signal bins vary as we add more data. If we compare these graphs with Figure 2-20, we can see that the amplitudes of the signal bins deviate much more from their respective signal amplitudes this time.
These graphs suggest that a large amount of data is necessary to obtain fairly accurate signal amplitudes. Although we have chosen the same data lengths for the detailed inspection shown below for the sake of consistency, those data lengths are likely too short to give reasonably accurate signal amplitudes and phases this time.

Figure 2-33 shows the amplitude spectra. The data lengths for these computations are the same as before and are indicated in Figure 2-32. One important difference between these spectra and those shown in Figure 2-24 is that the peaks generated by noise are as large as the signal peaks. In other words, the signal peaks are no longer easily identifiable. Without prior knowledge of the signal frequencies, we would miss most of them in this example. As for the accuracy of the amplitudes relative to the previous example, the results are mixed.

Figure 2-34 shows the upper-side envelopes of the reproduced signal (S') as before. We omit S' computed from noiseless data in this graph because it is the same as before (Figure 2-30). Considering that the amplitude of the noise relative to the signal is very large in this example, we feel that these methods do a fairly good job of reproducing the signal, but this is just our subjective opinion. Probably the real problem in this example is that it is very difficult to know which peaks in the amplitude spectrum are signal and which are noise (Figure 2-33).

(2-1-3-6) Alternative method of amplitude estimation

To apply this method we need to know the exact signal frequencies, but we do not need to care about frequency match. Also, this method does not require a constant sampling interval. The results shown below suggest that this method is more convenient than PSD computation for amplitude estimation. However, it has several disadvantages. Its accuracy depends critically on how accurately we know the frequencies to look at, and computing the amplitudes of thousands of frequencies would take too much time.
The method we show here tries to approximate the data as a sum of signals and an error, where the error does not correlate with any of the signals. This can be expressed as equation (12). In this equation, "Data" and the signal frequencies (Fm) are given, and we evaluate the amplitudes (Hm) and phases (Qm). "Error" is a by-product that we usually do not use.

The important thing to note here is that Um in this equation is NOT equal to Vm in equation (10), and 'Error' in this equation is NOT equal to 'Noise' in equation (10). Equation (12) shows the relation between 'Data' and the result of a statistical computation applied to the data (Hm and Qm are statistically derived values), while equation (10) shows how 'Data' was constructed. These two equations are fundamentally different. We generated 'Noise' in equation (10) using a random number generator. The correlation between this 'Noise' and V0, V1, V2, V3 or V4 is very small, and we would ordinarily say that 'Noise' is not statistically correlated with any of the Vm, but that does not guarantee that the correlation coefficients between them are all zero. The method we describe here makes the correlation coefficients between 'Error' and U0, U1, U2, U3 or U4 in equation (12) zero within computational accuracy.

Figure 2-35 shows how the computed amplitudes of U0, U1, U2, U3 and U4 vary as we add more data when we use the data shown in Figure 2-19. The horizontal black solid lines in these graphs show the actual signal amplitudes. This graph shows that some of the computed amplitudes have a consistent bias but do not oscillate like those shown in Figure 2-20. This is what we mean by "do not need to care about frequency match". If we ignore the oscillations in Figure 2-20, we can probably see patterns common to both Figure 2-20 and Figure 2-35. This graph might make us think that PSD computation is far inferior to this method, but we should not forget that the primary purpose of computing PSD is usually to find signal frequencies.
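The idea described above can be illustrated with an ordinary least-squares fit, which is one common way to obtain a residual that is uncorrelated with every fitted sinusoid. The sketch below is not taken from the package; it assumes that each fitted signal has the form H_m cos(2π F_m t + Q_m) and that a constant term is fitted as well, neither of which the text spells out. The function name `fit_sinusoids` and all numbers in the usage lines are hypothetical.

```python
import numpy as np


def fit_sinusoids(t, data, freqs):
    """Least-squares fit of data ~ const + sum_m H_m * cos(2*pi*F_m*t + Q_m).

    Assumed model form (not confirmed by the guide). The sampling times t
    do not need to be evenly spaced.
    Returns (amplitudes, phases, components, error).
    """
    t = np.asarray(t, dtype=float)
    data = np.asarray(data, dtype=float)

    # Design matrix: a constant column, then a cos/sin pair per frequency.
    cols = [np.ones_like(t)]
    for f in freqs:
        cols.append(np.cos(2.0 * np.pi * f * t))
        cols.append(np.sin(2.0 * np.pi * f * t))
    A = np.column_stack(cols)

    coef, *_ = np.linalg.lstsq(A, data, rcond=None)
    a = coef[1::2]  # cosine coefficients, a = H * cos(Q)
    b = coef[2::2]  # sine coefficients,   b = -H * sin(Q)

    amplitudes = np.hypot(a, b)
    phases = np.arctan2(-b, a)

    # Each fitted component U_m, evaluated at the sampling times.
    components = np.array(
        [a[m] * A[:, 1 + 2 * m] + b[m] * A[:, 2 + 2 * m] for m in range(len(freqs))]
    )
    error = data - A @ coef
    return amplitudes, phases, components, error


# Usage with made-up, unevenly sampled data (all values are illustrative).
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 200.0, 800))
data = (
    3.0 * np.cos(2 * np.pi * 0.05 * t + 0.4)
    + 1.0 * np.cos(2 * np.pi * 0.21 * t - 1.2)
    + rng.normal(0.0, 0.5, t.size)
)
amps, phases, comps, err = fit_sinusoids(t, data, [0.05, 0.21])
print("amplitudes:", amps)
print("corr(error, U_m):", [np.corrcoef(err, c)[0, 1] for c in comps])
```

Because the least-squares residual is orthogonal to every column of the design matrix, including the constant column, its correlation coefficient with each fitted component is zero up to rounding error. The frequencies are passed in rather than searched for, which reflects the requirement that they be known in advance.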
We can use this method only if we know the signal frequencies beforehand.

When we use the data shown in Figure 2-31, the computed amplitudes differ more, as shown in Figure 2-36. If we compare this graph with Figure 2-32, we can see similar patterns as before.

(2-2) Graph

It is possible to use period instead of frequency for the horizontal axis. In that case a linearly scaled horizontal axis, Figure 2-38(a), produces a rather useless graph, unlike Figure 2-37(a). The longest period in this example is 1250 days if we exclude the bin whose period is infinite (frequency is zero). While there are 14988 bins whose period is shorter than 100 days, there are only 13 bins whose period is longer than 100 days. This is because PSD is computed at a constant frequency interval and period is the inverse of frequency. Since the longest finite period is 1250 days, the frequency interval is 1/1250 per day and bin k has a period of 1250/k days, which falls below 100 days from k = 13 onward. In Figure 2-37(b) we used logarithmic scaling for the horizontal axis. If there is strong demand we may make axis scaling selectable in the future, but at this time we make graphs with logarithmic scaling for the vertical axis and linear scaling and frequency for the horizontal axis, similar to Figure 2-37(a).

(3) How to prepare data for this package

Step 1: Open the target Excel file.
Step 2: Select the cells you wish us to process by dragging the cursor all the way to the bottom-right cell of the data segment while holding down the mouse button. Please make sure your selection contains only numbers. The selected area in the example contains Var B (you wish us to compute the PSD of Var A and Var C but not Var B), but that will not be a problem.
Step 3: Copy the data.
Step 4: Open a new worksheet.
Step 5: Paste what you have copied into the new worksheet.
Step 6: Save this new worksheet as a "CSV (Comma delimited)" file.
Step 7: Make a note of which column(s) contain the data you wish us to process.

(Optional) If you open the saved file in Microsoft's Notepad application, you may notice that the commas are not aligned vertically, but that is fine.
The only important things are that the number of values is the same in every row (in this example there are 3 values per row) and that the values are separated by a single comma.
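For readers who want to check a saved file against these two conditions before sending it, the following is a minimal sketch. It is not part of the package; the function name `check_csv` and the sample values are hypothetical, and it only verifies that every non-blank row has the same number of comma-separated fields and that each field parses as a number.

```python
import csv
import io


def check_csv(text):
    """Return (row_count, column_count) if the CSV text is consistent.

    Raises ValueError if a row has a different number of fields than the
    first row, or if any field is not a number (an empty field, such as
    the one produced by two consecutive commas, is also rejected).
    """
    n_cols = None
    n_rows = 0
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue  # ignore completely blank lines
        if n_cols is None:
            n_cols = len(row)
        elif len(row) != n_cols:
            raise ValueError(
                f"line {line_no}: expected {n_cols} values, found {len(row)}"
            )
        for field in row:
            try:
                float(field)
            except ValueError:
                raise ValueError(
                    f"line {line_no}: {field!r} is not a number"
                ) from None
        n_rows += 1
    return n_rows, n_cols


# Made-up sample: three values per row, commas not vertically aligned.
sample = "1.5,20,300\n12.25,4,5\n"
print(check_csv(sample))
```

To check a file on disk, its contents can be read with `open(path, newline="").read()` and passed to the function.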