Power Spectral Density computation (Spectral Analysis)
Page 1 of PD001A/B User Guide
Appendix 1: About numerical error generated by computers.

Power Spectral Density (PSD) is commonly used to find the frequencies and amplitudes of periodic variations in data. Figure 1-1(a) shows a time series of ocean current data and Figure 1-1(b) shows the PSD of that data. In Figure 1-1(b) it is easy to identify the two major tidal components labeled D (diurnal; period of about 1 day) and SD (semi-diurnal; period of about half a day), although in a time series plot such as Figure 1-1(a) they are buried under various other variations. Figure 1-1(c) shows a time series of the initial 240 data, in which there are supposed to be 24 cycles of D and 48 cycles of SD variations, but it is very hard to find those variations.

We adopted a widely used method to compute PSD. This method decomposes the input data into a series of sinusoidal curves of different frequencies and then evaluates their amplitudes. Figure 1-2 shows the shape of a sinusoidal curve. If a time series plot of the data shows sharp corners, sudden jumps, spikes and/or step-like shapes, the PSD of that data might show somewhat confusing results.

We designed this package deal service for customers who wish to take a quick look at the PSD of their data without spending too much time determining the proper computational parameters shown in Table 1 below. It is usually very difficult to know the best choices of these computational parameters without actually computing and checking the result first. Therefore, we provide the results of 9 different PSD computations for a single order of PD001A. (PD001B contains only one result.)

In this document we give a summary of this package deal service, including price information, in section (1). In section (2) we describe the products (outputs) of this package deal service and their possible applications.
In section (3) we describe how to prepare data for this package deal service using Microsoft Excel. In section (4) we provide some information to help our customers choose the adjustable parameters.

We use the words "frequency" and "period" interchangeably in this document. Period is the inverse of frequency: period = 1/frequency (for example, the period of a 2 cycle/second variation is 1/2 = 0.5 second). A higher frequency is equivalent to a shorter period and a lower frequency is equivalent to a longer period.

We also use the terms 'time domain' and 'frequency domain'. Figure 1-1(a) is a simple presentation of data in the time domain, and Figure 1-1(b) is its counterpart in the frequency domain.

In this document we treat time series data. If your data is a one-dimensional spatial distribution of something, such as the brightness of a material scanned by a moving optical sensor, please read the word "time" as "space".

We have tried to avoid technical terms and mathematical equations as much as possible in order to accommodate a wide range of customers. We do not describe the theoretical basis of PSD computation in detail. Instead, we focus on the practical aspects of PSD computation, such as what the results of actual computations look like and how accurate they are. Certain expressions we use are not mathematically and/or statistically precise.

(1) Summary of PD001A/B

Detrending is the procedure of removing a straight-line least-squares approximation of the data from the data; we describe detrending in (4-1). For the percentage of the confidence interval of PSD, please see (2-1-2-4) and (2-1-3-2). The bin-width of Frequency Domain Smoothing (FDS) is the width (number of bins) of the un-weighted moving average we explicitly apply to the PSD. This procedure is very much like applying a simple moving average to time series data to smooth a jagged line. For more detail, please see (4-3).
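As a rough illustration of the two operations described above, the following Python sketch removes a least-squares straight line from a series and applies an un-weighted moving average with an odd bin-width. The function names `detrend_linear` and `smooth_psd` are hypothetical, and the treatment of the edge bins (left unsmoothed here) is an assumption; it is not necessarily how PD001A/B handles them.

```python
def detrend_linear(x):
    """Subtract the least-squares straight line from a sequence of values."""
    n = len(x)
    if n == 0:
        return []
    t_mean = (n - 1) / 2.0
    x_mean = sum(x) / n
    # Sums needed for the slope of the least-squares line.
    s_tt = sum((t - t_mean) ** 2 for t in range(n))
    s_tx = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x))
    slope = s_tx / s_tt if s_tt else 0.0
    return [v - (x_mean + slope * (t - t_mean)) for t, v in enumerate(x)]


def smooth_psd(psd, bin_width):
    """Un-weighted moving average over `bin_width` bins (odd integer).

    Assumption: bins closer than bin_width // 2 to either end are left
    unsmoothed, because a full window does not fit there.
    """
    if bin_width < 1 or bin_width % 2 == 0:
        raise ValueError("bin_width must be a positive odd integer")
    half = bin_width // 2
    out = list(psd)
    for m in range(half, len(psd) - half):
        out[m] = sum(psd[m - half:m + half + 1]) / bin_width
    return out
```

A bin-width of 1 leaves the values unchanged, and an even bin-width is rejected because the window could not be centered on a bin.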
Our customers can specify two different bin-widths for each window function, but they must be odd integers (such as 3, 5, 7, 9, 11, ...). If no bin-widths are specified, we apply the default values shown in Table 1. Please note that these default values might be too small to be useful if the number of data is large. Since a Hanning window function has an implicit effect similar to a 3-point weighted moving average, the actual smoothing bin-width, shown as the numbers in parentheses, becomes wider than the explicit smoothing bin-width (the number our customers can specify) when a Hanning window function is applied. A similar implicit FDS may occur if a large taper ratio is chosen for a Tukey window function. With the default taper ratio, the effect of implicit FDS is very small. We describe how the taper ratio affects the characteristics of a Tukey window function in (4-2-10).

We compute PSD and other variables at frequencies between 0 and 1/(sampling interval multiplied by 2). Here, the sampling interval is the time between consecutive data points. The frequency interval of these values is constant and equal to 1/(number of data multiplied by sampling interval) = 1/data length. Please note that the data length is NOT the number of data. For example, if the sampling interval is 5 seconds and the number of data is 200, the data length is 5x200=1000 seconds, the maximum frequency is 1/(5x2)=1/10=0.1 cycle/second (or Hz), the frequency interval is 1/1000=0.001 cycle/second, and the number of frequencies where PSD and other variables are computed is (0.1-0)/0.001+1=101. (The final +1 is there because we compute the value at zero frequency as well.)

(1-2) What PD001B does

(1-3) Summary of selectable computational parameters

(1-4) Computational procedure

(1-5) Your Data (Input data)

Please note that we will not check the validity of your data for this package deal service. For example, even if all of the values in your data file are exactly zero, we will still compute PSD.
For that reason we strongly recommend that you check your data by making a simple time series plot similar to Figure 1-1(a) and inspecting it visually before sending us your data file. If your data contains erroneous values, we can treat them as missing data and fill in the gaps by interpolation for an additional cost. If you can specify the interpolation method and the data to be interpolated, and we do not need to check the result of the interpolation, the additional cost could be as low as a few US dollars. If your data is a binary file, or an ASCII text file in a complicated format, we still might be able to process it by writing a program. However, a procedure like that might cost a lot (more than a few hundred US dollars). Please contact us before ordering this package deal service if the sampling interval of your data is not constant, your data contains erroneous values and/or your data file is not a simple ASCII text file. We will estimate the additional cost for free.

If you provide us with the unit of your data and the sampling interval, we will use that information in the graphs. Otherwise, none of the graph labels will show a unit. In the case of Figure 1-1(b), the unit of the data is meter/second and the unit of the sampling interval is hour.

(1-6) Products

(1-7) Price and ordering procedure

The ordering procedure is as follows:

Step (1) Send us an email to notify us that you intend to place an order.

Please note that we do NOT keep your files as described below.

(1-8) Confidentiality

(2-1) Product files

(2-1-1) Format of Product file

(2-1-2) Explanation of contents of product files

It is important to note that we cannot choose arbitrary frequencies. The data length and sampling interval automatically determine all the frequencies where PSD and other variables are computed. The frequencies we write in the product files are these frequencies. One can regard the PSD we compute as the PSD of bins whose centers are the frequencies described here.
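The way the data length and sampling interval fix the frequencies can be illustrated with a short Python sketch. The function name `frequency_grid` is hypothetical, and the sketch assumes an even number of data, so that the highest frequency is exactly 1/(sampling interval multiplied by 2).

```python
def frequency_grid(n_data, sampling_interval):
    """Return (frequency_interval, list of frequencies) for a time series.

    Assumption: n_data is even, so the last frequency equals
    1 / (2 * sampling_interval).
    """
    if n_data < 2 or sampling_interval <= 0:
        raise ValueError("need at least 2 data and a positive sampling interval")
    data_length = n_data * sampling_interval      # NOT the number of data
    frequency_interval = 1.0 / data_length        # constant spacing of frequencies
    n_frequencies = n_data // 2 + 1               # +1 for the zero frequency
    frequencies = [m * frequency_interval for m in range(n_frequencies)]
    return frequency_interval, frequencies


# Example with a sampling interval of 5 seconds and 200 data.
df, freqs = frequency_grid(200, 5.0)
print(len(freqs), df, freqs[-1])
```

The unit of the returned frequencies follows the unit of the sampling interval passed in (cycle/second when the interval is given in seconds).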
The frequency bandwidth of these bins is constant and equal to the frequency resolution. In other words, a specific bin covers the range from its central frequency (as written in our product file) minus half of the frequency resolution to its central frequency plus half of the frequency resolution. Using the example above (sampling interval of 5 seconds and 200 data), frequency bandwidth=frequency resolution=0.001 cycle/second for all the bins. For the third bin, the central frequency is 0+((3-1)xfrequency interval)=2x0.001=0.002 cycle/second, the lower frequency limit is central frequency-(frequency bandwidth/2)=0.002-(0.001/2)=0.002-0.0005=0.0015 cycle/second, and the upper frequency limit is central frequency+(frequency bandwidth/2)=0.002+(0.001/2)=0.002+0.0005=0.0025 cycle/second. We use this concept of a bin frequently in this document. Please note that the PSD and amplitude written in our product files are the values computed at the central frequency of each bin; they are not average values within each bin.

The unit of the sampling interval determines the unit of frequency. For example, if our customer tells us that the sampling interval is 4 ms, the unit of frequency will be cycle/ms. We do not convert units unless our customer specifically requests it.

(2-1-2-2) Period

(2-1-2-3) PSD

When we apply a window function, we correct the PSD. We describe the details of this correction in section (4-2-8). The unit of PSD is the square of the unit of the input data divided by the unit of frequency.

(2-1-2-4) Confidence interval of PSD

(2-1-2-5) Percent of variance

The PSD of a specific bin multiplied by the frequency bandwidth is equal to the variance contained in that bin, and the sum of the variances of all the bins except the zero-frequency bin is equal to the variance of the time series itself. Here we assume that the average has been removed from the time series.
Using the fact that the bandwidth is constant, this can be expressed as equation (1). It follows that the ratio of the PSD of a specific bin multiplied by the frequency bandwidth to the variance of the time series shows how much of the total energy is contained in that specific bin. This is what we call 'Percent of variance'. Please note that the PSD of the first bin (m=0) is the PSD of zero frequency (the time-invariant term) and is excluded from equation (1) because it is not related to the variations of the data at all. If the data is not detrended, the PSD of zero frequency is the square of the average of the data multiplied by the data length. If the data is detrended, the PSD of zero frequency becomes zero.

Another thing to note is that the highest frequency bin covers only the range from the frequency of that bin minus half of the frequency resolution to the frequency of that bin. This means that the bandwidth of the highest frequency bin is half that of the others. This is because PSDs at frequencies beyond the highest frequency are actually PSDs at negative frequencies. For this reason, we do not multiply PSD

(2-1-2-6) Amplitude (Amplitude spectrum)

The amplitude spectrum is the value Am in equation (3). This amplitude is half of the peak-to-peak amplitude, as shown in Figure 1-2. Please note that Am does not change in time. This fact becomes important later. Our customers can use this variable to estimate the amplitude of the variation at a specific frequency. We correct the values of amplitude when we apply a window function. This correction is slightly different from the correction applied to PSD, and we describe its details in section (4-2-8). The unit of amplitude is the same as the unit of the data. If a window function is not applied, the amplitude spectrum is the square root of (PSD multiplied by bandwidth and by 2). When a window function is applied, this relationship does not hold because the corrections differ. We describe more about the amplitude spectrum in (2-1-3-3).
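The relationships among PSD, percent of variance and amplitude stated above can be checked numerically with a minimal Python sketch. The normalization used here is an assumption chosen to agree with those statements (PSD multiplied by bandwidth gives the variance in a bin, the zero-frequency PSD is the square of the average multiplied by the data length, and amplitude is the square root of 2 x PSD x bandwidth when no window function is applied); it is not necessarily the exact procedure of PD001A/B. The function names are hypothetical, and a direct Fourier sum is used instead of a fast transform to keep the sketch short.

```python
import cmath
import math


def one_sided_psd(x, dt):
    """Return (bandwidth, psd) for data x sampled every dt, with no window."""
    n = len(x)
    bandwidth = 1.0 / (n * dt)                 # 1 / data length
    psd = []
    for m in range(n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * m * k / n)
                    for k, v in enumerate(x))
        density = abs(coeff) ** 2 * dt / n
        # Fold negative frequencies into positive ones, except at zero
        # frequency and at the highest frequency (even n).
        is_highest = (n % 2 == 0 and m == n // 2)
        if m != 0 and not is_highest:
            density *= 2.0
        psd.append(density)
    return bandwidth, psd


def percent_of_variance(psd, bandwidth):
    """Percent of variance per bin; the zero-frequency bin is excluded (set to 0)."""
    total = sum(psd[1:]) * bandwidth           # variance of the time series
    if total == 0.0:
        return [0.0] * len(psd)
    return [0.0] + [100.0 * p * bandwidth / total for p in psd[1:]]


def amplitude_spectrum(psd, bandwidth):
    """Amplitude for each bin, assuming no window function was applied.

    The relation is meaningful for bins between zero and the highest frequency.
    """
    return [math.sqrt(2.0 * p * bandwidth) for p in psd]
```

For instance, for data with an average of 2 and a sinusoid of amplitude 3 at 0.02 cycle/second (sampling interval 5 seconds, 200 data), the bin at 0.02 cycle/second should carry an amplitude close to 3 and nearly all of the variance.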
We do not provide the amplitude spectrum for the cases in which we applied FDS, because FDS would produce a meaningless result for the amplitude spectrum. We describe this issue in (4-3-2).

(2-1-2-7) Phase