| About Us(CRI) |
This page explains what EOF is and how customers can use it. It describes one part of the data analysis services we offer at CRI. Please click the "Data Analysis" button above to see the other types of data analysis we offer. We have prepared explanatory pages with examples for the words underlined in blue; click those words below to see the pages. What is EOF analysis? Why should I bother computing EOF? |
|||||||||||||||||||||||||||||||
| We compute EOF for you.
Estimates are free. For more information, |
||||||||||||||||||||||||||||||||
![]() |
The zero level for each record is shown as a horizontal black line. We downloaded these data from NOAA, U.S.A. (http://www.pmel.noaa.gov/tao) and applied a band-pass filter to remove variations with periods longer than 150 days or shorter than 20 days. If you are not interested in ocean currents, you can think of these records as any group of time series, for example sales records from different provinces, or several different kinds of time series measured at the same or different locations. | |||||||||||||||||||||||||||||||
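The band-pass step described above can be sketched as follows. The page does not say which kind of filter was used, so this example substitutes a simple FFT mask; the function name `bandpass_fft`, the daily sampling interval and the synthetic record are all assumptions made for illustration.

```python
import numpy as np

def bandpass_fft(x, dt_days, short_period=20.0, long_period=150.0):
    """Keep only variations with periods between short_period and long_period (days).

    This is a plain FFT mask, used here as a stand-in for whatever
    band-pass filter was actually applied to the data.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.rfft(x - x.mean())
    freqs = np.fft.rfftfreq(n, d=dt_days)  # cycles per day
    keep = (freqs >= 1.0 / long_period) & (freqs <= 1.0 / short_period)
    spectrum[~keep] = 0.0                  # zero everything outside the pass band
    return np.fft.irfft(spectrum, n=n)

# Synthetic daily record (made up): a 60-day signal we want to keep,
# plus a 5-day and a 360-day signal that fall outside the pass band.
t = np.arange(720.0)
wanted = np.sin(2 * np.pi * t / 60.0)
raw = wanted + 0.5 * np.sin(2 * np.pi * t / 5.0) + 2.0 * np.sin(2 * np.pi * t / 360.0)
filtered = bandpass_fft(raw, dt_days=1.0)
print("largest difference from the 60-day signal:", np.abs(filtered - wanted).max())
```

An FFT mask is easy to read but has sharp edges in frequency; a tapered filter may behave better on real records with gaps or trends.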
| This figure suggests that the variations of the current at 100m are inversely correlated with those at 50m, while those at 200m might be inversely correlated with those at 100m. After discarding bad data we have records at 23 depths, but studying the relations among those 23 time series pair by pair in this way is an overwhelming task, at least for us. So we applied EOF to extract the common variations. Figure 2a shows time series plots of the two major components extracted from the original time series. Let us call these components mode 1 (blue line) and mode 2 (red line), following oceanographic tradition. Although they may look somewhat correlated, their correlation coefficient is actually zero (yes, we computed it). |
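To make the idea concrete, here is a minimal sketch of an EOF computation, assuming the common approach of taking the eigenvectors of the covariance matrix of the input series. The function name `eof`, the five synthetic "depths" and the noise level are illustrative assumptions, not the data or code behind Figure 2.

```python
import numpy as np

def eof(data):
    """EOF of data with shape (n_times, n_series).

    Returns eigenvalues (largest first), eigenvectors (one column per mode)
    and the time series of each mode (one column per mode).
    """
    anomalies = data - data.mean(axis=0)           # remove the average of each series
    cov = np.cov(anomalies, rowvar=False)          # covariance among the series
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]          # eigh returns ascending order
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    mode_series = anomalies @ eigenvectors         # project data onto each eigenvector
    return eigenvalues, eigenvectors, mode_series

# Made-up data: two underlying signals seen with different amplitudes at 5 "depths".
rng = np.random.default_rng(0)
t = np.arange(400.0)
signal_a = np.sin(2 * np.pi * t / 50.0)
signal_b = np.cos(2 * np.pi * t / 33.0)
pattern_a = np.array([1.0, 0.8, -0.4, -0.6, 0.2])
pattern_b = np.array([0.3, -0.2, 0.5, -0.1, 0.4])
data = (np.outer(signal_a, pattern_a)
        + 0.5 * np.outer(signal_b, pattern_b)
        + 0.1 * rng.standard_normal((t.size, 5)))

eigenvalues, eigenvectors, mode_series = eof(data)
r12 = np.corrcoef(mode_series[:, 0], mode_series[:, 1])[0, 1]
print(f"correlation between mode 1 and mode 2: {r12:.1e}")
```

The correlation printed at the end is zero up to rounding error, which is the property mentioned above: the mode time series are uncorrelated with each other at zero lag.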
||||||||||||||||||||||||||||||||
![]() |
Figure 2b shows how much of the variation (variance, in statistical terminology) in the original time series (all 23 of them) each component explains. The vertical axis is the percentage and the horizontal axis is the mode number (there are 23 modes). Mode 1 explains 53.5% of the variance and mode 2 explains 14.3% of the variance in the group of original time series. This information comes from the eigenvalues. The variance explained decreases progressively as the mode number increases. Thus, we can concentrate our analysis on a few lower modes and ignore the many higher modes without losing too much information. | |||||||||||||||||||||||||||||||
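The percentages in a plot like Figure 2b follow directly from the eigenvalues: each eigenvalue divided by the sum of all eigenvalues. A small sketch, with made-up eigenvalues (not the ones from our data) and hypothetical function names:

```python
import numpy as np

def explained_variance_percent(eigenvalues):
    """Percentage of total variance explained by each mode."""
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    return 100.0 * eigenvalues / eigenvalues.sum()

def modes_needed(eigenvalues, target_percent):
    """Smallest number of leading modes whose cumulative percentage reaches the target."""
    cumulative = np.cumsum(explained_variance_percent(eigenvalues))
    return int(np.searchsorted(cumulative, target_percent) + 1)

# Made-up eigenvalues, sorted from largest to smallest.
example_eigenvalues = [8.0, 4.0, 2.0, 1.0, 1.0]
percent = explained_variance_percent(example_eigenvalues)
print("percent per mode:", percent)
print("modes needed for 70%:", modes_needed(example_eigenvalues, 70.0))
```

The cumulative sum is what tells us how many modes are worth analyzing before the remaining ones add little.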
| In this case, if we can successfully analyze mode 1 and mode 2, we have in practice analyzed 67.8% of the variance in the original 23 time series. Analyzing only two time series instead of 23 considerably reduces the effort and cost, and this is why EOF is a useful tool.
Figure 2c, the eigenvectors of modes 1 and 2, shows how the amplitude of the variations plotted in Figure 2a varies with depth. Mode 1 is positive at 50m, negative at 100m and positive at 200m. This means that the variations at 100m look like mirror images of those at 50m and at 200m, except that their amplitudes differ. This pattern matches our earlier description of Figure 1. The amplitudes of the mode 1 variations at 100m and at 200m are about a half (0.44 to be precise) and about a quarter (0.22) of that at 50m, respectively. Here is another advantage of EOF: we now know quantitatively how the amplitude shown in Figure 2a varies with depth. Information like this helps us work out what caused these variations. Even if we do not need to know the cause, knowing the amplitudes at different depths quantitatively rather than qualitatively is useful. |
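The relation between an eigenvector and the amplitude at each depth can be sketched as follows: the part of the data explained by one mode is the mode's time series multiplied by the eigenvector element for each depth. The eigenvector values, depths and function names here are made up for illustration and are not the values plotted in Figure 2c.

```python
import numpy as np

def mode_contribution(eigenvector, mode_series):
    """Part of every input series explained by one mode, shape (n_times, n_depths)."""
    return np.outer(mode_series, eigenvector)

def relative_amplitude(eigenvector, reference_index):
    """Amplitude at each depth relative to a reference depth (sign included)."""
    eigenvector = np.asarray(eigenvector, dtype=float)
    return eigenvector / eigenvector[reference_index]

depths_m = np.array([50, 100, 200])            # hypothetical depths
eigenvector = np.array([0.8, -0.4, 0.2])       # made-up eigenvector elements
mode_series = np.sin(2 * np.pi * np.arange(120) / 40.0)

ratios = relative_amplitude(eigenvector, reference_index=0)
contribution = mode_contribution(eigenvector, mode_series)
for depth, ratio in zip(depths_m, ratios):
    print(f"{depth}m: {ratio:+.2f} times the amplitude at {depths_m[0]}m")
```

A negative ratio means the variation at that depth is a mirror image of the variation at the reference depth, scaled by the absolute value of the ratio.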
||||||||||||||||||||||||||||||||
![]() |
||||||||||||||||||||||||||||||||
| The correlation coefficient ranges from -1 to 1. If the value is -1, the two variables are perfectly correlated but in opposite directions (like a mirror image). If the value is zero, they are not correlated at all. If the value is 1, they are perfectly correlated and vary in the same direction. | This figure suggests that the mode 1 variations of the ocean current at this location are related to variations in wind speed. The variations of the ocean current lag behind those of the wind. The correlation coefficient reaches its maximum, 0.68, when we shift the wind data to the right by 9 days. This correlation is statistically significant (we computed the 95% confidence interval assuming an effective sampling interval of 10 days, since the cut-off period of the band-pass filter we applied is 20 days). | |||||||||||||||||||||||||||||||
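Finding the shift that maximizes the correlation can be sketched like this. Shifting the wind record to the right by a given number of days is the same as comparing the wind at time t with the current at time t plus that lag. The function name `lagged_correlation`, the daily sampling and the synthetic wind and current records are assumptions for illustration only.

```python
import numpy as np

def lagged_correlation(driver, response, max_lag):
    """Correlation between driver[t] and response[t + lag] for lag = 0..max_lag."""
    driver = np.asarray(driver, dtype=float)
    response = np.asarray(response, dtype=float)
    lags = np.arange(max_lag + 1)
    corr = np.empty(lags.size)
    for i, lag in enumerate(lags):
        a = driver[: driver.size - lag]   # drop the last `lag` samples of the driver
        b = response[lag:]                # drop the first `lag` samples of the response
        corr[i] = np.corrcoef(a, b)[0, 1]
    return lags, corr

# Made-up records: a smooth random "wind" and a "current" that repeats it 9 days later.
rng = np.random.default_rng(1)
n, true_lag = 500, 9
white = rng.standard_normal(n + true_lag + 9)
base = np.convolve(white, np.ones(10) / 10.0, mode="valid")  # length n + true_lag
wind = base[true_lag:]
current = base[:n] + 0.05 * rng.standard_normal(n)

lags, corr = lagged_correlation(wind, current, max_lag=30)
best_lag = int(lags[np.argmax(corr)])
print("lag with the highest correlation:", best_lag, "days")
```

When judging significance, neighbouring samples of filtered data are not independent, which is why the text above uses an effective sampling interval rather than the raw one.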
| Figure 2c shows that the wind accelerates the ocean current in the downwind direction near the surface and at depths below 160m, but accelerates it against the wind direction at depths between about 80 and 160m.
We will stop our analysis here since this is not a scientific paper, but we would like to mention that we have published a more detailed analysis of older data from the same location in a scientific journal.
|
||||||||||||||||||||||||||||||||
![]() |
Figure 4b shows time series plots of mode 1 and mode 2. Figure 4c shows the eigenvalues of this experiment. Mode 1 contains only about 60% of the variance of the input data set and mode 2 contains about 34%. Figure 4d shows the eigenvectors of mode 1 and mode 2. Evidently mode 2 is no longer negligible, and mode 1 has amplitude variations at different "depths", although all the time series are exactly the same except for the time shift among them. You might also notice that the time series plots of modes 1 and 2 look suspiciously similar.
Again, the correlation coefficient between them is zero by construction, but it becomes 0.81 if we shift the mode 2 time series to the left by 13.3. The zero correlation is guaranteed only if we do not shift the resulting time series at all. To avoid a result like this, if we know that the variations in the time series lag one another, we have to adjust the data set before computing EOF, as described before. Alternatively, if we are not sure how far we need to shift the original data, we might apply complex EOF in the time domain or frequency-domain EOF instead. We will describe these methods later on this page. |
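The effect described above can be reproduced with a small synthetic experiment. This is not the data set behind Figure 4; it is a sketch that assumes one sinusoid copied to ten "depths" with a fixed time shift between neighbours, and the `eof` helper, period and shift size are illustrative choices.

```python
import numpy as np

def eof(data):
    """Covariance-matrix EOF: eigenvalues (largest first), eigenvectors, mode time series."""
    anomalies = data - data.mean(axis=0)
    eigenvalues, eigenvectors = np.linalg.eigh(np.cov(anomalies, rowvar=False))
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], eigenvectors[:, order], anomalies @ eigenvectors[:, order]

# Identical sinusoids, each "depth" delayed by 2 time steps relative to the one above it.
period = 60.0
t = np.arange(600.0)
shifts = np.arange(10) * 2.0
data = np.column_stack([np.sin(2 * np.pi * (t - s) / period) for s in shifts])

eigenvalues, eigenvectors, mode_series = eof(data)
fraction = eigenvalues / eigenvalues.sum()

# Zero-lag correlation between mode 1 and mode 2, then the same after
# shifting one of them by a quarter of the period.
r_zero = np.corrcoef(mode_series[:, 0], mode_series[:, 1])[0, 1]
quarter = int(period / 4)
r_shifted = np.corrcoef(mode_series[:-quarter, 0], mode_series[quarter:, 1])[0, 1]

print("variance fraction of modes 1 and 2:", fraction[:2])
print("zero-lag correlation:", r_zero)
print("correlation after a quarter-period shift:", r_shifted)
```

Although every input series is the same signal, EOF splits it across two modes whose time series are uncorrelated at zero lag but almost identical (up to sign) once one of them is shifted, which is the symptom to watch for when the data contain time lags.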
|||||||||||||||||||||||||||||||
![]() |
||||||||||||||||||||||||||||||||
| (2) Eigenvectors and eigenvalues are supposed to be reasonably constant. EOF produces only one set of eigenvalues (Figure 2b) and eigenvectors (Figure 2c). If these are not constant in time, the result of EOF may become hard to interpret or, at worst, meaningless. In fact, there is a good physical reason to believe that the eigenvectors might not be constant in time in our example. So we created a time series of the mode 1 eigenvector in the following manner. First, we computed EOF for the initial 91.25-day (1/4-year) segment of the data. Then we computed another EOF for another 91.25-day segment whose start date is shifted forward in time by half of 91.25 days. We repeated this procedure until we reached the end of the data. Figure 5, the result of this computation, shows how the mode 1 eigenvector changes in time. At the beginning of the data, the eigenvector has a two-layer structure, negative near the surface and positive below. |
||||||||||||||||||||||||||||||||
![]() |
Then it changes to a three-layer structure at the beginning of 2003, and this three-layer structure continues throughout the rest of the record. For this reason, in our example (Figures 2 and 3) we started the computation from February 2003 and discarded the data prior to that month. |
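The segment-by-segment procedure described above can be sketched as a sliding-window EOF. This is an illustration under assumptions: daily sampling, a window of 91 samples advanced by 45 samples (the nearest whole numbers to 91.25 days and half of it), and made-up data whose vertical structure changes halfway through. Because the sign of an eigenvector is arbitrary, the sketch also flips each eigenvector to agree with the previous window; that convention is our choice for the example.

```python
import numpy as np

def leading_eigenvector(segment):
    """Mode 1 eigenvector of the covariance matrix of one data segment."""
    anomalies = segment - segment.mean(axis=0)
    eigenvalues, eigenvectors = np.linalg.eigh(np.cov(anomalies, rowvar=False))
    return eigenvectors[:, -1]            # eigh sorts ascending, so the last column is mode 1

def sliding_eof(data, window, step):
    """Mode 1 eigenvector for each window; returns window start indices and eigenvectors."""
    starts = np.arange(0, data.shape[0] - window + 1, step)
    vectors = []
    for start in starts:
        v = leading_eigenvector(data[start:start + window])
        if vectors and np.dot(v, vectors[-1]) < 0:
            v = -v                        # keep the sign consistent with the previous window
        vectors.append(v)
    return starts, np.array(vectors)

# Made-up data at 6 "depths": two-layer structure in the first year, three-layer in the second.
rng = np.random.default_rng(2)
t = np.arange(730.0)
signal = np.sin(2 * np.pi * t / 40.0)
two_layer = np.array([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0])
three_layer = np.array([1.0, 1.0, -1.0, -1.0, 1.0, 1.0])
data = np.where(t[:, None] < 365, np.outer(signal, two_layer), np.outer(signal, three_layer))
data = data + 0.1 * rng.standard_normal(data.shape)

starts, vectors = sliding_eof(data, window=91, step=45)
print("number of windows:", len(starts))
```

Plotting `vectors` against `starts` gives a picture of the same kind as Figure 5: each row is the mode 1 eigenvector for one segment.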
|||||||||||||||||||||||||||||||
| (3) We might need to do some pre-processing before computing EOF. It may be better to pre-process the data before computing EOF. In our example, we know that there are strong tidal signals in the data. We also know that there are variations with periods of half a year and one year. We are not interested in these variations. In addition, coherency function analysis gives us an idea of the frequencies at which the wind strongly influences ocean currents. Based on this prior knowledge, we applied a band-pass filter to our data before computing EOF. Another important point is that a single external factor might influence our data through several different mechanisms (or routes). The responses caused by the same factor through these different mechanisms might not be proportional to one another. For example, certain mechanisms might damp variations of shorter period while others might amplify them. As a result, the time series produced by EOF may become difficult to interpret. In our example, wind affects ocean currents on the equator in several different ways. We have a theoretical reason to believe that the mechanism by which wind affects the ocean current where the eigenvector is positive near the surface differs from the mechanism at work where the eigenvector is negative. One method we can try in a case like this is to remove some data from the data set. So we re-calculated EOF using only the data between 40m and 80m (5 time series). Here, we might say we "filtered" our data, allowing only those at depths between 40 and 80m. Figure 3b, the result of this re-calculation, shows shorter-period variations such as the "dual-bump" features more clearly than Figure 3a does. If we mix time series with different units, we usually need to adjust their amplitudes unless we use a correlation matrix to compute EOF. This process is called weighting and is often done by multiplying each time series by a different constant. 
We usually remove the average, and often the trend, from each time series before computing EOF. Using a correlation matrix to compute EOF is equivalent to dividing each input time series by the square root of its variance (its standard deviation) and then computing EOF with a covariance matrix. By doing so, all the time series have equal importance (weight) in the EOF computation. |
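The equivalence stated above can be checked numerically. The sketch below removes the average and a straight-line trend from each series, divides by the standard deviation, and confirms that the covariance matrix of the result equals the correlation matrix of the detrended data. The two synthetic series (a "velocity" and a "temperature" with very different units) and the function names are made up for the example.

```python
import numpy as np

def remove_mean_and_trend(data):
    """Subtract a least-squares straight line (which includes the average) from each series."""
    n_times = data.shape[0]
    design = np.column_stack([np.ones(n_times), np.arange(n_times, dtype=float)])
    coeffs, *_ = np.linalg.lstsq(design, data, rcond=None)
    return data - design @ coeffs

def standardize(data):
    """Divide each series by the square root of its variance."""
    return data / data.std(axis=0, ddof=1)

# Made-up series with different units and very different amplitudes.
rng = np.random.default_rng(3)
t = np.arange(300.0)
velocity = 50.0 * np.sin(2 * np.pi * t / 45.0) + 0.05 * t + 5.0 * rng.standard_normal(t.size)
temperature = 20.0 + 0.5 * np.sin(2 * np.pi * t / 45.0 + 1.0) + 0.1 * rng.standard_normal(t.size)
data = np.column_stack([velocity, temperature])

detrended = remove_mean_and_trend(data)
cov_of_standardized = np.cov(standardize(detrended), rowvar=False)
corr_of_detrended = np.corrcoef(detrended, rowvar=False)
print("matrices agree:", np.allclose(cov_of_standardized, corr_of_detrended))
```

Without this step the series with the larger numerical amplitude would dominate mode 1 simply because of its units.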
||||||||||||||||||||||||||||||||