Surrogate Production Methods
Introduction
The related literature describes several non-parameterized methods for generating surrogate data. Each method preserves the characteristics of an observed time series that are needed to test it against a predefined null hypothesis, in the so-called Method of Surrogates. The null hypotheses favored in the literature for such tests are:
(1) The observed time series is a realization of a random process.
(2) The observed time series is characterized by linearly autocorrelated Gaussian noise.
(3) The observed time series is the result of a nonlinear, monotonic and invertible instantaneous filter acting on a linear stochastic time series.
To test null hypothesis (1), random shuffles of the original time series suffice. Under this hypothesis the series has no temporal structure (i.e. the autocorrelation function does not display significant values), and shuffling preserves the relevant statistics: the first two moments (the mean and the variance) and the empirical distribution.
If the observed time series is characterized only by linear correlations, null hypothesis (2), the surrogates should preserve the autocorrelation function (equivalently, the power spectrum) of the original time series. This is accomplished by independently and uniformly randomizing the frequency-domain phases of the original time series while preserving frequency symmetry, so that the inverse transform yields a real-valued surrogate series.
For null hypothesis (3), the surrogates should preserve not only the autocorrelation function of the original time series but also its empirical distribution, including any discretization present in the original series. For this purpose, a normally distributed series is realized and reordered to agree in rank with the original series, and a surrogate of it is created by the phase randomization procedure explained above. The rank order of that surrogate then controls a shuffle of the original time series.
A second algorithm for testing null hypothesis (3) has been proposed, drawing on the idea of controlled shuffling and on the so-called iterated and corrected amplitude adjusted Fourier transformed surrogates. Here a surrogate is generated by the frequency-domain phase randomization procedure applied directly to the original time series, and only its rank order is used to control a shuffle of the original time series, which produces the desired surrogate.
Finally, one more algorithm, the Iterated Amplitude Adjusted Fourier Transform, has been proposed to correct the shortcomings of the Amplitude Adjusted Fourier Transform algorithm. It is briefly described on this page.
Random Shuffle Surrogates
These surrogate series are generated by randomly shuffling the time order of the original time series.
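A random shuffle can be sketched in a few lines of NumPy. The function name `random_shuffle_surrogate` and the optional `rng` argument are illustrative assumptions, not part of any particular package.

```python
import numpy as np

def random_shuffle_surrogate(x, rng=None):
    """Return a random time-order shuffle of x (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    # permutation() returns a shuffled copy and leaves x untouched
    return rng.permutation(np.asarray(x))

# Example: the surrogate holds the same values in a different order
x = np.sin(np.linspace(0.0, 20.0, 256))
s = random_shuffle_surrogate(x, np.random.default_rng(0))
```

Because the surrogate is a permutation of the original values, the mean, the variance and the empirical distribution are preserved, while any temporal ordering is lost.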
Fourier Transformed (FT) Surrogates (also known as Phase-randomized Surrogates)
These surrogate series are generated by randomizing the frequency-domain phases of the original time series. The original series is Fourier-transformed into the frequency domain, and the complex amplitude at each frequency $f$ is multiplied by $e^{i\phi(f)}$, where $\phi(f)$ is a uniformly distributed random number in the range $[0, 2\pi)$, chosen in such a way as to preserve symmetry, $\phi(-f) = -\phi(f)$. The inverse transforms of these realizations are the desired surrogates.
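The phase randomization step might look as follows in NumPy. This is a minimal sketch: the name `ft_surrogate` is hypothetical, and it relies on `np.fft.rfft` / `np.fft.irfft`, which work on the non-negative frequencies only and therefore impose the conjugate symmetry implicitly.

```python
import numpy as np

def ft_surrogate(x, rng=None):
    """Phase-randomized surrogate of a real series (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.rfft(x)                 # non-negative frequencies only
    phases = rng.uniform(0.0, 2.0 * np.pi, spectrum.size)
    phases[0] = 0.0                           # keep the zero-frequency term (the mean)
    if n % 2 == 0:
        phases[-1] = 0.0                      # the Nyquist term must stay real
    # irfft assumes Hermitian symmetry, so the result is real-valued
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)

x = np.sin(np.linspace(0.0, 20.0, 256)) + 0.1 * np.arange(256) / 256
s = ft_surrogate(x, np.random.default_rng(0))
```

The Fourier amplitudes of `s` match those of `x`, so the power spectrum (and hence the sample autocorrelation) is unchanged, while the values themselves are no longer those of the original series.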
Amplitude Adjusted Fourier Transformed (AAFT) Surrogates (also known as Gaussian-scaled Surrogates)
These surrogate series are generated by a controlled shuffle of the original series, guided by the phase-randomized surrogate of a rank-ordered Gaussian realization. First, a Gaussian realization is reordered so that its rank order agrees with that of the original time series. Then an FT surrogate is generated from the rank-ordered Gaussian realization. Finally, the original time series is shuffled so that it follows the rank order of that FT surrogate. The original time series thus shuffled is the desired surrogate.
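The three steps above can be sketched as follows. All function names are illustrative assumptions; `phase_randomize` is a compact FT surrogate included only to keep the example self-contained.

```python
import numpy as np

def phase_randomize(x, rng):
    """FT surrogate: keep Fourier amplitudes, add uniform random phases."""
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, spectrum.size)
    phases[0] = 0.0                      # leave the zero-frequency term alone
    if x.size % 2 == 0:
        phases[-1] = 0.0                 # Nyquist term must stay real
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=x.size)

def rank_order(x):
    """Rank of each element (0 = smallest); ties are broken arbitrarily."""
    return np.argsort(np.argsort(x))

def aaft_surrogate(x, rng=None):
    """Amplitude adjusted FT surrogate (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    # 1. Gaussian realization reordered to follow the rank order of x
    gaussian = np.sort(rng.standard_normal(x.size))[rank_order(x)]
    # 2. FT surrogate of the rank-ordered Gaussian series
    gaussian_surrogate = phase_randomize(gaussian, rng)
    # 3. Shuffle x so that it follows the rank order of that surrogate
    return np.sort(x)[rank_order(gaussian_surrogate)]

x = np.exp(np.sin(np.linspace(0.0, 30.0, 512)))
s = aaft_surrogate(x, np.random.default_rng(0))
```

Since the result is a reordering of the original values, the empirical distribution is preserved exactly, whereas the power spectrum is matched only approximately.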
Fourier Shuffled Surrogates
Unlike the Gaussian-scaled surrogate method described above, these surrogates are generated by reordering the original time series to follow the rank order of its own FT surrogate.
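A rough sketch of this controlled shuffle, with the Gaussian step omitted, is shown below. The name `fourier_shuffle_surrogate` is an assumption, and the phase randomization is written inline so that the block runs on its own.

```python
import numpy as np

def fourier_shuffle_surrogate(x, rng=None):
    """Shuffle x to follow the rank order of its FT surrogate (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = x.size
    # FT surrogate taken directly from the original series
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, spectrum.size)
    phases[0] = 0.0                      # keep the zero-frequency term
    if n % 2 == 0:
        phases[-1] = 0.0                 # Nyquist term must stay real
    ft = np.fft.irfft(spectrum * np.exp(1j * phases), n=n)
    # Only the rank order of the FT surrogate is used
    ranks = np.argsort(np.argsort(ft))
    return np.sort(x)[ranks]

x = np.exp(np.sin(np.linspace(0.0, 30.0, 512)))
s = fourier_shuffle_surrogate(x, np.random.default_rng(0))
```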
Windowed Fourier Transformed (WFT) Surrogates
This method is similar to the FT surrogate method in most respects, except that the original time series is first multiplied by an endpoint-vanishing window function. The rationale behind this filter is that generating FT surrogates may introduce artificial high frequencies, caused by the discontinuity jump between the endpoints of the original time series. Windowing suppresses them by making the series vanish at both endpoints. A side effect, though, is the introduction of a spurious low frequency component that comes from the power spectrum of the window function itself. Finally, the WFT algorithm may be used as a replacement for FT in the AAFT and Fourier Shuffled productions.
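The windowing step can be illustrated as below. The specific window is an assumption: a half-period sine window is used here only because it vanishes at both endpoints, and the text above does not say which window the method prescribes. The name `wft_surrogate` is likewise hypothetical.

```python
import numpy as np

def wft_surrogate(x, rng=None):
    """FT surrogate of the windowed series (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = x.size
    # Assumed window: half a sine period, zero at the first and last sample
    window = np.sin(np.linspace(0.0, np.pi, n))
    windowed = x * window
    # Ordinary phase randomization applied to the windowed series
    spectrum = np.fft.rfft(windowed)
    phases = rng.uniform(0.0, 2.0 * np.pi, spectrum.size)
    phases[0] = 0.0                      # keep the zero-frequency term
    if n % 2 == 0:
        phases[-1] = 0.0                 # Nyquist term must stay real
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)

x = np.linspace(0.0, 5.0, 256) + np.sin(np.linspace(0.0, 40.0, 256))
s = wft_surrogate(x, np.random.default_rng(0))
```

The surrogate reproduces the spectrum of the windowed series rather than of the raw series, which is where the spurious low frequency component mentioned above comes from.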
Iterated Amplitude Adjusted Fourier Transformed (IAAFT) Surrogates
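To generate IAAFT surrogates, we first store a sorted list of the values $\{s_n\}$ of the time series in question and the squared amplitudes of its Fourier transform:
$S_k^2 = \left| \sum_{n=0}^{N-1} s_n e^{i 2\pi k n / N} \right|^2$
We then start from a random shuffle of the data, $\{s_n^{(0)}\}$, and iterate. Each iteration consists of two steps:
(1) Bring $\{s_n^{(i)}\}$ to the desired power spectrum: take its Fourier transform, replace the squared amplitudes with $\{S_k^2\}$ while keeping the phases, and transform back.
(2) Rank-order the resulting series of step (1) so that it takes exactly the values of $\{s_n\}$.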
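The two-step iteration can be sketched as follows. The name `iaaft_surrogate`, the `n_iter` cap and the stopping rule (stop once the rank-ordered series no longer changes) are assumptions made for this illustration.

```python
import numpy as np

def iaaft_surrogate(x, n_iter=100, rng=None):
    """Iterated amplitude adjusted FT surrogate (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = x.size
    sorted_x = np.sort(x)                      # stored sorted values
    target_amp = np.abs(np.fft.rfft(x))        # stored Fourier amplitudes
    s = rng.permutation(x)                     # start from a random shuffle
    for _ in range(n_iter):
        # Step 1: impose the target amplitudes, keep the current phases
        phases = np.angle(np.fft.rfft(s))
        spectral = np.fft.irfft(target_amp * np.exp(1j * phases), n=n)
        # Step 2: rank-order so the series takes exactly the original values
        s_new = sorted_x[np.argsort(np.argsort(spectral))]
        if np.array_equal(s_new, s):           # assumed stopping rule
            break
        s = s_new
    return s

x = np.exp(np.sin(np.linspace(0.0, 30.0, 512)))
s = iaaft_surrogate(x, rng=np.random.default_rng(0))
```

Ending each iteration on step (2) returns a series with exactly the original values and an approximately matched spectrum. Ending on step (1) instead would give the exact spectrum and an approximate distribution.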
Summary
By necessity, the null hypothesis dictates the type of surrogate data needed for a test to be valid. For each of the null hypotheses mentioned, at least one corresponding surrogate generation technique was described, following the preferences of the related literature.
The following table shows original series statistics and how they are preserved in the surrogates according to the type of algorithm used for their generation:
| Surrogate Type | Mean | Variance | Empirical Distribution | Autocorrelation Function | Disadvantages |
|---|---|---|---|---|---|
| Random Shuffle | v | v | v | x | Destroys linear correlation |
| FT | v | v | x | v | Spurious high/low frequencies |
| AAFT | v | v | v | ≈ | Bias to flatter spectrum |
| Fourier Shuffle | v | v | v | ≈ | Bias to flatter spectrum |
| IAAFT | v | v | v | v | Computational complexity |
v = preserved, x = not preserved, ≈ = approximate preservation.
References
Refer to Theiler, Eubank, Longtin, Galdrikian & Farmer (1994), Theiler & Prichard (1996), Schreiber (1999), Schreiber & Schmitz (1996), Schreiber & Schmitz (1996a) and Schreiber & Schmitz (2000) for additional information on this subject.