Surrogate Production Methods

Several non-parametric methods for generating surrogate data appear in the related literature. Each attempts to preserve those characteristics of an observed time series that are to be tested against a predefined null hypothesis in the so-called Method of Surrogates. Throughout the literature, the favored null hypotheses for such tests are that the observed time series is:
(1) temporally uncorrelated noise, i.e. independent and identically distributed random values;
(2) linearly correlated noise, i.e. a realization of a linear stochastic process;
(3) a static, monotonic nonlinear transformation of linearly correlated noise.

To generate data suitable for testing null hypothesis (1), it is sufficient to produce random shuffles of the original time series. Under this hypothesis no temporal structure exists (i.e. the autocorrelation function shows no significant values), and shuffling preserves the statistics that would characterize such a series: the first two moments (the mean and the variance) and the empirical distribution.

If the observed time series is characterized only by linear correlations, null hypothesis (2), then the surrogates should preserve the autocorrelation function (equivalently, the power spectrum) of the original time series. This is accomplished by independently and uniformly randomizing the frequency-domain phases of the original time series, while preserving the symmetry between positive and negative frequencies so that the inverse transform yields a real-valued surrogate series.

With regard to null hypothesis (3), the surrogates should preserve not only the autocorrelation function of the original time series but also its empirical distribution, including any discretization present in the original series. For this purpose, a normally distributed series is generated and reordered to agree in rank with the original series, and a surrogate of it is created by the frequency-domain phase randomization procedure explained above. The rank order of the surrogate thus generated then controls a shuffle of the original time series.

In addition to the previous method for testing against null hypothesis (3), another algorithm for surrogate data generation has been proposed, built on the idea of controlled shuffling and on the so-called iterated and corrected amplitude adjusted Fourier transformed surrogates. In this case, a surrogate is generated by the frequency-domain phase randomization procedure applied directly to the original time series, and only its rank order is used to control a shuffle of the original time series, which produces the desired surrogate.

Finally, in an attempt to correct the shortcomings of the Amplitude Adjusted Fourier Transform (AAFT) algorithm, one more algorithm has been proposed, the Iterated Amplitude Adjusted Fourier Transform (IAAFT), which is briefly described on this page.

Random Shuffle Surrogates

These surrogate series are generated by randomly permuting the time order of the original time series.
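A minimal sketch of a random shuffle surrogate in Python with NumPy follows. The function name `random_shuffle_surrogate` and the AR(1) test series are assumptions made for illustration only; they do not come from the source.

```python
import numpy as np

def random_shuffle_surrogate(x, rng=None):
    """Return a random permutation of x (hypothetical helper name).

    The values, and hence the mean, variance and empirical distribution,
    are kept; only the time order is destroyed.
    """
    rng = np.random.default_rng() if rng is None else rng
    return rng.permutation(np.asarray(x, dtype=float))

# Illustrative input: an AR(1) series (an assumption, not from the source).
rng = np.random.default_rng(0)
noise = rng.standard_normal(512)
x = np.empty(512)
x[0] = noise[0]
for t in range(1, 512):
    x[t] = 0.8 * x[t - 1] + noise[t]

s = random_shuffle_surrogate(x, rng)
```

Because the surrogate is a permutation of the input, sorting both series gives identical arrays.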

Fourier Transformed (FT) Surrogates

These surrogate series are generated by randomizing the frequency-domain phases of the original time series. Specifically, the original series is Fourier-transformed into the frequency domain and the complex amplitude at each frequency $f$ is multiplied by $e^{i\phi(f)}$, where $\phi(f)$ is a uniformly distributed random number in the range $[0, 2\pi)$, chosen in such a way as to preserve symmetry, $\phi(-f) = -\phi(f)$. The inverse transforms of these realizations are the desired surrogates.
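The phase randomization described above can be sketched in Python with NumPy as follows. This is one possible realization under stated assumptions: the name `ft_surrogate` is hypothetical, and the real-input transform `numpy.fft.rfft` is used so that the symmetry between positive and negative frequencies is enforced implicitly rather than by hand.

```python
import numpy as np

def ft_surrogate(x, rng=None):
    """Phase-randomized surrogate of a real series (hypothetical helper name)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = x.size
    # rfft keeps only non-negative frequencies; irfft restores the negative
    # ones as complex conjugates, so the output is real by construction.
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.size)
    phases[0] = 0.0           # leave the zero frequency alone: keeps the mean
    if n % 2 == 0:
        phases[-1] = 0.0      # the Nyquist component must stay real
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)

# Illustrative input: an AR(1) series (an assumption, not from the source).
rng = np.random.default_rng(0)
noise = rng.standard_normal(512)
x = np.empty(512)
x[0] = noise[0]
for t in range(1, 512):
    x[t] = 0.8 * x[t - 1] + noise[t]

s = ft_surrogate(x, rng)
```

The Fourier amplitudes of `s` match those of `x`, so the power spectrum is unchanged, while the values themselves are generally no longer those of the original series.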

Amplitude Adjusted Fourier Transformed (AAFT) Surrogates (also known as Gaussian-scaled Surrogates)

These surrogate series are generated by a controlled shuffle of the original series, based on the phase-randomized surrogate of a rank-ordered Gaussian realization. First, a Gaussian realization is reordered so that its rank order agrees with that of the original time series. An FT surrogate is then generated from the rank-ordered Gaussian realization, and its rank order is finally used for a controlled shuffle of the original time series. The original time series thus shuffled is the desired surrogate.

Fourier Shuffled Surrogates

In contrast to the Gaussian-scaled surrogate method described previously, these surrogates are generated by reordering the original time series according to the rank order of its own FT surrogate.
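A short sketch of this variant in Python with NumPy, under the same caveats as any illustration: the names `fourier_shuffle_surrogate` and `_phase_randomize` are hypothetical, and the AR(1) input is assumed only for demonstration.

```python
import numpy as np

def _phase_randomize(x, rng):
    """FT surrogate of a real series via rfft/irfft (hypothetical helper)."""
    n = x.size
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.size)
    phases[0] = 0.0
    if n % 2 == 0:
        phases[-1] = 0.0
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)

def fourier_shuffle_surrogate(x, rng=None):
    """Shuffle x so that it follows the rank order of its own FT surrogate."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    y = _phase_randomize(x, rng)          # FT surrogate of the original
    ranks = np.argsort(np.argsort(y))     # rank of each sample of y
    return np.sort(x)[ranks]              # original values in that order

# Illustrative input (an assumption, not from the source).
rng = np.random.default_rng(0)
noise = rng.standard_normal(512)
x = np.empty(512)
x[0] = noise[0]
for t in range(1, 512):
    x[t] = 0.8 * x[t - 1] + noise[t]

s = fourier_shuffle_surrogate(x, rng)
```

Compared with the Gaussian-scaled method, no auxiliary Gaussian series is drawn; the only random ingredient is the set of phases.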

Windowed Fourier Transformed (WFT) Surrogates

This method is similar to the FT surrogate method in most respects, except that the original time series is first multiplied by an endpoint-vanishing window function. The rationale behind this filter is that a discontinuity between the endpoints of the original time series may introduce artificial high frequencies when the FT surrogates are generated; windowing suppresses them by removing the jump at the endpoints. A side effect, though, is the introduction of a spurious low-frequency component from the power spectrum of the window function itself. Finally, the WFT algorithm may be used as a replacement for FT in the AAFT and Fourier Shuffled productions.
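The windowing step can be illustrated with the sketch below in Python with NumPy. The specific endpoint-vanishing function used in the source is not recoverable here, so a Hann window (`numpy.hanning`) stands in as an assumption; the function names are hypothetical as well.

```python
import numpy as np

def _phase_randomize(x, rng):
    """FT surrogate of a real series via rfft/irfft (hypothetical helper)."""
    n = x.size
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.size)
    phases[0] = 0.0
    if n % 2 == 0:
        phases[-1] = 0.0
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)

def wft_surrogate(x, rng=None):
    """FT surrogate of the windowed series (sketch with an assumed window)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    # Assumed window: Hann, which is zero at both endpoints.
    window = np.hanning(x.size)
    return _phase_randomize(x * window, rng)

# Illustrative input: a random walk, whose endpoints generally mismatch
# (an assumption, not from the source).
rng = np.random.default_rng(0)
x = np.cumsum(rng.standard_normal(512))

s = wft_surrogate(x, rng)
```

The surrogate reproduces the spectrum of the windowed series rather than that of the raw series, which is where the spurious low-frequency contribution of the window enters.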

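Iterated Amplitude Adjusted Fourier Transformed (IAAFT) Surrogates

To generate IAAFT surrogates, we first store a sorted list of the values of the time series in question, $\{s_n\}$, $n = 0, \dots, N-1$, and the squared amplitudes of its Fourier transform:

$$S_k^2 = \left| \sum_{n=0}^{N-1} s_n \, e^{i 2\pi k n / N} \right|^2 .$$

We then continue with a random shuffle of the data, $\{s_n\}$, producing $\{s_n^{(0)}\}$, and iterate from there. Each iteration consists of two steps:

(1) Take the Fourier transform of the current series $\{s_n^{(i)}\}$, replace its squared amplitudes with $S_k^2$ while keeping its phases, and transform back.
(2) Rank-order the resulting series so that it takes exactly the values in the stored sorted list of $\{s_n\}$, which gives $\{s_n^{(i+1)}\}$.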
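A compact sketch of the iteration in Python with NumPy is given below. It alternates between imposing the stored Fourier amplitudes (keeping the current phases) and rank-ordering back onto the stored sorted values. The function name `iaaft_surrogate`, the iteration cap `n_iter`, and the stopping rule (stop when the ordering no longer changes) are assumptions for illustration, not prescriptions from the source.

```python
import numpy as np

def iaaft_surrogate(x, n_iter=100, rng=None):
    """Iterated amplitude adjusted Fourier transformed surrogate (sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    n = x.size
    sorted_x = np.sort(x)                    # stored sorted values
    target_amp = np.abs(np.fft.rfft(x))      # stored Fourier amplitudes
    s = rng.permutation(x)                   # start from a random shuffle
    for _ in range(n_iter):
        # Step 1: impose the target amplitudes, keep the current phases.
        phases = np.angle(np.fft.rfft(s))
        y = np.fft.irfft(target_amp * np.exp(1j * phases), n=n)
        # Step 2: rank-order so the series takes the original values.
        new_s = sorted_x[np.argsort(np.argsort(y))]
        if np.array_equal(new_s, s):         # ordering stopped changing
            break
        s = new_s
    return s

# Illustrative input: an AR(1) series (an assumption, not from the source).
rng = np.random.default_rng(0)
noise = rng.standard_normal(512)
x = np.empty(512)
x[0] = noise[0]
for t in range(1, 512):
    x[t] = 0.9 * x[t - 1] + noise[t]

s = iaaft_surrogate(x, rng=rng)
shuffled = rng.permutation(x)

def spectrum_error(z):
    """Squared mismatch between the Fourier amplitudes of z and of x."""
    return np.sum((np.abs(np.fft.rfft(z)) - np.abs(np.fft.rfft(x))) ** 2)
```

Ending each iteration on the rank-ordering step makes the empirical distribution exact and leaves a small residual mismatch in the spectrum; `spectrum_error` allows that mismatch to be compared with the one of a plain random shuffle.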

By necessity, the null hypothesis dictates the type of surrogate data needed for the test to be valid. For each of the null hypotheses mentioned, at least one corresponding surrogate-producing technique has been described, following the preferences of the related literature.

The following table shows which statistics of the original series are preserved in the surrogates, according to the type of algorithm used for their generation (v = preserved, x = not preserved, ≈ = approximately preserved):

Surrogate Type	Mean	Variance	Empirical Distribution	Autocorrelation Function	Disadvantages
Random Shuffle	v	v	v	x	Destroys linear correlation
FT	v	v	x	v	Spurious high/low frequencies
AAFT	v	v	v	≈	Bias to flatter spectrum
Fourier Shuffle	v	v	v	≈	Bias to flatter spectrum
IAAFT	v	v	v	v	Computational Complexity
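Some of the table's entries can be checked numerically. The sketch below, in Python with NumPy, compares a random shuffle and an FT surrogate of an AR(1) series on mean, variance, empirical distribution and lag-1 autocorrelation. The helper names, the test series and the choice of lag 1 as a stand-in for the full autocorrelation function are all assumptions made for illustration.

```python
import numpy as np

def lag1_autocorr(x):
    """Lag-1 sample autocorrelation of x."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

def ft_surrogate(x, rng):
    """Phase-randomized surrogate of a real series (hypothetical helper)."""
    n = x.size
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.size)
    phases[0] = 0.0
    if n % 2 == 0:
        phases[-1] = 0.0
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)

# Illustrative AR(1) series with strong linear correlation.
rng = np.random.default_rng(1)
n = 1024
noise = rng.standard_normal(n)
x = np.empty(n)
x[0] = noise[0]
for t in range(1, n):
    x[t] = 0.9 * x[t - 1] + noise[t]

surrogates = {
    "Random Shuffle": rng.permutation(x),
    "FT": ft_surrogate(x, rng),
}

report = {}
for name, s in surrogates.items():
    report[name] = {
        "mean": bool(np.isclose(s.mean(), x.mean())),
        "variance": bool(np.isclose(s.var(), x.var())),
        "distribution": bool(np.allclose(np.sort(s), np.sort(x))),
        "lag1": float(lag1_autocorr(s)),
    }
    print(name, report[name])
print("original lag1:", float(lag1_autocorr(x)))
```

In this setup the shuffle keeps the values but loses the correlation, and the FT surrogate keeps the correlation but not the values, in line with the first two rows of the table.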