Power Spectral Density computation (Spectral Analysis)
Page 4 of PD001A/B User Guide
Table of Contents
Appendix 1: About numerical errors generated by computers.
One way to compare S' with S is to compare the outline, usually called the envelope, of S' with the envelope of S. Figure 2-29 shows part of S from Figure 2-19 and its upper side envelope. This upper side envelope is equal to (please see equation (7))
Since the computed amplitude of V1 is not equal to that of V2, and the computed amplitude of V3 is not equal to that of V4, as shown in Figure 2-24, we cannot, strictly speaking, apply the AM radio signal analogy to our computational result. Nevertheless, it provides an easier way to see how much our reconstructed signal S' differs from the actual signal S.
Figure 2-30 shows the upper side envelopes of S (red solid line), S' by Method 1 (blue solid line) and S' by Method 2 (blue dashed line). The upper three graphs are the results when we used the noiseless data S, and the lower three graphs are the results when we used the noise-added data. We can see only a blue solid line in Figure 2-30(a), but this is because S and S' are practically identical in this case. These graphs suggest that (a) adding noise of this magnitude does not affect the accuracy of the reproduced signal (S') very much, (b) the accuracy of the reproduced signal is best in the middle of the data and degrades near both ends of the data, and (c) a mismatch between the actual signal frequencies and the frequencies of the signal bins does cause a relatively large error in signal reproduction. The reason for (b) is as follows. f1 and f2 in equation (11) of S' differ from those of S if the frequencies of the signal bins differ from the actual signal frequencies. Then, if the envelope of S' matches the envelope of S fairly well at position X, the discrepancy between them tends to grow as we move away from X. In our case, X is located at the midpoint of the data used for the amplitude spectrum computation (the portion of the time series to the left of the vertical black lines). Noise is not the direct cause of this phenomenon.
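The growth of the discrepancy away from X can be illustrated with a simplified sketch. It uses a single sinusoid rather than the two-frequency envelope of equation (11), and the frequencies, the record length and the position of X are hypothetical values chosen only for illustration. The reconstructed signal uses a slightly wrong frequency but agrees in phase with the actual signal at X, so the phase difference between them is proportional to the distance from X.

```python
import numpy as np

# Hypothetical values, not taken from the guide.
f_true = 0.1000   # actual signal frequency (cycles per sample)
f_bin = 0.1005    # frequency of the nearest bin, slightly off
t = np.arange(1001.0)
x_mid = 500.0     # position X where the two phases agree

s_true = np.cos(2.0 * np.pi * f_true * t)
# Same phase as s_true at t = x_mid, but advancing at the bin frequency.
s_rec = np.cos(2.0 * np.pi * f_bin * (t - x_mid) + 2.0 * np.pi * f_true * x_mid)

err = np.abs(s_rec - s_true)
near = err[np.abs(t - x_mid) <= 50].max()    # largest error close to X
far = err[np.abs(t - x_mid) >= 450].max()    # largest error near both ends
print(near, far)
```

The error near the ends is several times larger than the error near X even though no noise was added, which is consistent with point (b) above.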
Now, let us make the noise three times larger than in the previous example. Figure 2-31(a) shows the time series of signal (S; red), noise (N; purple) and data (S+N; blue) in this case. The signal is the same as before; S in Figure 2-31 looks smaller than S in Figure 2-19(a) only because we changed the vertical scale of the graphs. Figure 2-31(b) shows an expanded view of S and the data. The standard deviation of the noise is about 3.00, and it is difficult to see S in the data in Figure 2-31(b), unlike in the previous example (Figure 2-19(b)).
Figure 2-32 shows how the amplitudes and phases of the signal bins vary as we add more data. If we compare these graphs with Figure 2-20, we can see that the amplitudes of the signal bins deviate much more from their respective actual signal amplitudes (horizontal black lines in Figure 2-32 (a) and (b)) this time. These graphs suggest that a large amount of data is necessary to get fairly accurate signal amplitudes. For the sake of consistency, we have chosen the same data lengths (A, B and C in Figure 2-32 (a) and (c)) for the detailed inspections described below, but those data lengths are likely too short to give relatively accurate signal amplitudes and phases this time.
Figure 2-33 shows the amplitude spectra. The data lengths for these computations are the same as before and are indicated in Figure 2-32(a) and (c) (vertical solid blue lines labeled A, B and C). One important difference between these spectra and those shown in Figure 2-24 is that the peaks generated by noise are as large as the signal peaks. In other words, the signal peaks are no longer easily identifiable. If we do not have prior knowledge of the signal frequencies, we will miss most of them in this example. As for the accuracy of the amplitudes relative to the previous example, the results are mixed.
Figure 2-34 shows the upper side envelopes of the reproduced signals (S') as before. We omit S' computed from noiseless data in this graph because they are the same as before (Figure 2-30). Considering that the amplitude of the noise relative to the signal is very large in this example, we feel that these methods do a fairly good job of reproducing the signal, but this is just our subjective opinion. Probably the real problem in this example is that it is very difficult to know which peaks in the amplitude spectrum are due to signal and which are due to noise (Figure 2-33).
(2-1-3-6) Alternative method of amplitude estimation
To apply this method, we need to know the exact signal frequencies, but we do not need to care about frequency match. Also, this method does not require a constant sampling interval. The results we show below make us feel that this method is more convenient than PSD computation for amplitude estimation. However, this method has several disadvantages. The accuracy of its result depends critically on how accurately we know the signal frequencies. Also, it would take too much time to compute the amplitudes of thousands of frequencies.
In this equation, "Data" and the frequencies of the signals (Fm) are given, and we evaluate the amplitudes (Hm) and phases (Qm). "Error" is initially unknown and is computed by this method, but it is a sort of by-product and we usually do not use it. The important point here is that Um in this equation is NOT equal to Vm in equation (10), and "Error" in this equation is NOT equal to "Noise" in equation (10). Equation (12) shows the relationship between "Data" and the result of a statistical computation applied to "Data", while equation (10) shows how "Data" was constructed. These two equations are fundamentally different. We generated "Noise" in equation (10) using a random number generator. The correlation between this "Noise" and V0, V1, V2, V3 or V4 is very low, and we ordinarily say that "Noise" is not statistically correlated with any of the Vm, but that does not mean the correlation coefficients between them are all exactly zero. The method we describe here makes the correlation coefficients between "Error" and U0, U1, U2, U3 or U4 in equation (12) zero within computational accuracy.
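One common way to obtain this zero-correlation property is an ordinary least-squares fit of sinusoids at the given frequencies. The sketch below assumes that this is what the method amounts to, and that each Um has the form Hm*cos(2*pi*Fm*t + Qm); neither is stated explicitly in the text above. The function name `fit_known_frequencies`, the extra constant term, and the test frequencies, amplitudes, phases and noise level are all hypothetical. The sampling times are deliberately irregular, since the method does not require a constant sampling interval.

```python
import numpy as np

def fit_known_frequencies(t, data, freqs):
    """Least-squares fit of a constant plus one sinusoid per known frequency.

    Assumption: each component is H*cos(2*pi*F*t + Q). Returns the constant,
    the amplitudes H, the phases Q, the fitted components and the residual.
    """
    t = np.asarray(t, dtype=float)
    data = np.asarray(data, dtype=float)
    columns = [np.ones_like(t)]                 # constant term (assumption)
    for f in freqs:
        columns.append(np.cos(2.0 * np.pi * f * t))
        columns.append(np.sin(2.0 * np.pi * f * t))
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, data, rcond=None)
    a, b = coef[1::2], coef[2::2]               # cosine and sine coefficients
    amps = np.hypot(a, b)
    # a*cos(wt) + b*sin(wt) = H*cos(wt + Q) with a = H*cos(Q), b = -H*sin(Q)
    phases = np.arctan2(-b, a)
    components = [a[m] * design[:, 1 + 2 * m] + b[m] * design[:, 2 + 2 * m]
                  for m in range(len(freqs))]
    error = data - design @ coef                # the "Error" by-product
    return coef[0], amps, phases, components, error

# Hypothetical test data: two sinusoids plus random noise, irregular sampling.
rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 200.0, 2000))
freqs = [0.05, 0.0731]
true_amps = [1.0, 0.5]
true_phases = [0.3, -1.0]
data = sum(h * np.cos(2.0 * np.pi * f * t + q)
           for h, f, q in zip(true_amps, freqs, true_phases))
data = data + rng.normal(0.0, 1.0, t.size)

offset, amps, phases, components, error = fit_known_frequencies(t, data, freqs)
corr = [np.corrcoef(error, u)[0, 1] for u in components]
print(amps, phases, corr)
```

In this sketch the correlation coefficients between the residual and each fitted component are zero to within rounding error, because a least-squares residual is orthogonal to every column used in the fit, including the constant column.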
Figure 2-35 shows how the computed amplitudes of U0, U1, U2, U3 and U4 vary as we add more data when we use the data shown in Figure 2-19. The horizontal black solid lines in these graphs show the actual signal amplitudes. These graphs show that the computed amplitudes do not oscillate like those shown in Figure 2-20. This is what we mean by "we do not need to care about frequency match". If we ignore the oscillations in Figure 2-20, we can probably see patterns common to both Figure 2-20 and Figure 2-35. These graphs might make us think that PSD computation is far inferior to this method, but we should not forget that the primary purpose of computing the PSD is usually to find signal frequencies. We can use this method only if we know the signal frequencies beforehand.
When we use the data shown in Figure 2-31, the computed amplitudes differ more from the actual signal amplitudes, as shown in Figure 2-36.
It is possible to use period instead of frequency for the horizontal axis. In that case, a linearly scaled horizontal axis, as in Figure 2-38(a), produces a rather useless graph, unlike Figure 2-37(a). The longest period in this example is 1250 days if we exclude the bin whose period is infinite (frequency is zero). While there are 14988 bins whose periods are shorter than 100 days, there are only 13 bins whose periods are longer than 100 days. This is because period is the inverse of frequency. In Figure 2-37(b) we used logarithmic scaling for the horizontal axis.
If there is strong demand, we may make the axis scaling selectable in the future. At this time, we make graphs with frequency on a linearly scaled horizontal axis and a logarithmically scaled vertical axis, similar to Figure 2-37(a).
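The bin counts quoted above can be reproduced with a short calculation. The sketch below assumes 30000 samples taken at one-hour intervals, which gives a 1250-day record and 15001 frequency bins; these sampling details are our inference from the quoted numbers and are not stated in the text. Under this assumption, the count of 13 long-period bins includes the zero-frequency bin.

```python
import numpy as np

# Assumed sampling, inferred from the quoted bin counts (not stated in the guide).
n_samples = 30000
dt_days = 1.0 / 24.0                   # one sample per hour
record_days = 1250.0                   # n_samples * dt_days

k = np.arange(n_samples // 2 + 1)      # bin indices 0 .. 15000
freq = k / record_days                 # cycles per day
with np.errstate(divide="ignore"):
    period = 1.0 / freq                # days; infinite for the zero-frequency bin

n_short = int(np.count_nonzero(period < 100.0))
n_long = int(np.count_nonzero(period > 100.0))   # includes the infinite period
longest_finite = period[1]

print(n_short, n_long, longest_finite)
```

Because the bins are equally spaced in frequency, almost all of them crowd into the short-period end of a linear period axis, which is why a linearly scaled period axis is of little use.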
(3) How to prepare data for this package
Step 1: Open the target Excel file.
Step 2: Select the cells you wish us to process.
Then drag the cursor all the way to the bottom right cell of the data segment (column F, row 30) while holding down the left button of your mouse.
Please make sure your selection contains only numbers. The selected area above contains Var B (you wish us to compute the PSD of Var A and Var C but not Var B), but that will not be a problem.
Step 3: Copy the data.
Step 4: Open a new worksheet.
Step 5: Paste what you have copied into the new worksheet.
Step 6: Save this new worksheet as a "CSV (Comma delimited)" file.
Step 7: Make a note of which column(s) contain the data you wish us to process. (Optional)
If you open the saved file with Microsoft's Notepad application, it looks like the example below.
You may have noticed that the commas are not aligned vertically, but that is fine. The only important things are that every row/line contains the same number of values (in this example there are 3 values per row) and that the values are separated by a single comma.
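If you would like to check a saved file before sending it, the two conditions above can be tested with a short script. This is only a sketch and not part of this package: the function name `check_csv_text` and the sample values are hypothetical, and it assumes the file contains numbers only, with no header row.

```python
import csv
import io

def check_csv_text(text):
    """Return the number of columns if every row has the same number of
    numeric values; raise ValueError otherwise."""
    n_cols = None
    for line_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue                       # ignore completely blank lines
        for field in row:
            try:
                float(field)               # an empty field (",,") fails here
            except ValueError:
                raise ValueError(f"line {line_no}: {field!r} is not a number")
        if n_cols is None:
            n_cols = len(row)
        elif len(row) != n_cols:
            raise ValueError(
                f"line {line_no}: expected {n_cols} values, found {len(row)}")
    if n_cols is None:
        raise ValueError("no data found")
    return n_cols

# Hypothetical sample with 3 values per row; commas need not line up.
sample = "1.5,2,3\n10,20.25,30\n100,200,300.125\n"
print(check_csv_text(sample))
```

To check a real file, read it first, for example with `open("data.csv", newline="").read()`, where `data.csv` stands for your own file name.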