Surrogate Production Methods

 

Introduction

 

The literature describes several non-parametric methods for generating surrogate data. Each method preserves selected characteristics of an observed time series so that the series can be tested against a predefined null hypothesis, an approach known as the Method of Surrogates. The null hypotheses most often used in such tests are:

 

(1) The observed time series is a realization of a random process.
(2) The observed time series is characterized by linearly autocorrelated Gaussian noise.
(3) The observed time series is the result of a nonlinear, monotonic, and invertible instantaneous filter acting on a linear stochastic time series.

 

To generate data suitable for testing null hypothesis (1), random shuffles of the original time series are sufficient. Under this hypothesis the series has no temporal structure (i.e. the autocorrelation function shows no significant values at nonzero lags), and shuffling preserves the statistics that characterize it: the first two moments (the mean and the variance) and the empirical distribution.

 

If the observed time series is assumed to be fully characterized by its linear correlations, as in null hypothesis (2), then the surrogates should preserve the autocorrelation function of the original time series (equivalently, its power spectrum). This is accomplished by independently and uniformly randomizing the frequency-domain phases of the original time series, while preserving the phase symmetry between positive and negative frequencies so that the inverse transform yields a real-valued surrogate series.

 

For null hypothesis (3), the surrogates should preserve not only the autocorrelation function of the original time series but also its empirical distribution, including any discretization present in the original series. For this purpose, a normally distributed series is generated and reordered to agree in rank with the original series. A surrogate of this Gaussian series is then created by the frequency-domain phase randomization procedure described above. The resulting surrogate serves as the rank-order template for a controlled shuffle of the original time series.

 

A second algorithm for testing null hypothesis (3) has been proposed, based on the same idea of controlled shuffling and related to the so-called “iterated” and “corrected” amplitude adjusted Fourier transformed surrogates. Here the phase-randomized surrogate is generated directly from the original time series, and only its rank order is used to control a shuffle of the original time series, which produces the desired surrogate.

 

Finally, the Iterated Amplitude Adjusted Fourier Transform (IAAFT) algorithm has been proposed to correct the shortcomings of the Amplitude Adjusted Fourier Transform (AAFT) algorithm. It is briefly described on this page.

 

Random-Shuffled Surrogates

 

These surrogate series are random permutations of the original time series: the observed values are kept, but their time order is shuffled.
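
The shuffle can be sketched in a few lines of Python with NumPy. This is only an illustration: the function name `random_shuffle_surrogate` and the `rng` seed argument are assumptions made here, not names used elsewhere in this documentation.

```python
import numpy as np

def random_shuffle_surrogate(x, rng=None):
    """Return a random permutation of x (shuffle without replacement).

    Hypothetical helper for illustration; `rng` may be None, an integer
    seed, or a numpy Generator.
    """
    rng = np.random.default_rng(rng)
    return rng.permutation(np.asarray(x))

# Usage: the surrogate holds exactly the same values in a new time order.
x = np.sin(np.linspace(0.0, 20.0, 200))
surrogate = random_shuffle_surrogate(x, rng=0)
```

Because the values are only reordered, the mean, the variance, and the empirical distribution are unchanged, while any temporal correlation is destroyed.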

 

Fourier Transformed (FT) Surrogates (also known as Phase-randomized Surrogates)

 

These surrogate series are generated by randomizing the frequency-domain phases of the original time series. Specifically, the original series is Fourier-transformed into the frequency domain, and the complex amplitude at each frequency is multiplied by $e^{i\phi}$, where $\phi$ is a uniformly distributed random number in the range $[0, 2\pi]$, chosen in such a way as to preserve symmetry, $\phi(f) = -\phi(-f)$. The inverse transforms of these realizations are the desired surrogates.
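
A minimal sketch of phase randomization in Python with NumPy follows. It assumes a real-valued, evenly sampled series; the function name `ft_surrogate` is hypothetical. Using the one-sided transform (`numpy.fft.rfft` and `numpy.fft.irfft`) enforces the phase symmetry implicitly, because only the non-negative frequencies are stored and the inverse transform always returns a real series.

```python
import numpy as np

def ft_surrogate(x, rng=None):
    """Phase-randomized (FT) surrogate of a real series (illustrative sketch)."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.rfft(x)                      # non-negative frequencies only
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.size)
    phases[0] = 0.0                                # keep the mean (zero frequency) unchanged
    if n % 2 == 0:
        phases[-1] = 0.0                           # Nyquist term must stay real for even n
    # Rotating each coefficient changes its phase but not its amplitude.
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)

# Usage: same Fourier amplitudes as x, different phases.
x = np.cumsum(np.random.default_rng(0).standard_normal(256))
surrogate = ft_surrogate(x, rng=1)
```

The surrogate has the same Fourier amplitudes as the input, and therefore the same mean, variance, and circular autocorrelation, but its empirical distribution generally differs.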

 

Amplitude Adjusted Fourier Transformed (AAFT) Surrogates (also known as Gaussian-scaled Surrogates)

 

These surrogate series are generated by a controlled shuffle of the original series, guided by the phase-randomized surrogate of a rank-ordered Gaussian realization. First, a Gaussian realization is reordered so that its rank order agrees with that of the original time series. Next, an FT surrogate is generated from the rank-ordered Gaussian realization. Finally, the original time series is reordered to follow the rank order of that FT surrogate. The original time series thus shuffled is the desired surrogate.
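
The three steps can be sketched as follows in Python with NumPy. The helper names `rank_order`, `phase_randomize`, and `aaft_surrogate` are assumptions introduced for this illustration, and ties in the data are broken arbitrarily by the sort.

```python
import numpy as np

def rank_order(values, template):
    """Reorder `values` so that its rank order matches that of `template`."""
    ranks = np.argsort(np.argsort(template))       # rank of each template entry
    return np.sort(values)[ranks]

def phase_randomize(x, rng):
    """FT surrogate of a real series via the one-sided FFT."""
    n = x.size
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.size)
    phases[0] = 0.0                                # zero frequency stays real
    if n % 2 == 0:
        phases[-1] = 0.0                           # Nyquist term stays real
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)

def aaft_surrogate(x, rng=None):
    """Amplitude adjusted Fourier transformed surrogate (illustrative sketch)."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    # 1) Gaussian realization arranged in the rank order of x.
    gauss = rank_order(rng.standard_normal(x.size), x)
    # 2) FT surrogate of the rank-ordered Gaussian series.
    gauss_surrogate = phase_randomize(gauss, rng)
    # 3) Controlled shuffle: x reordered to follow the surrogate's ranks.
    return rank_order(x, gauss_surrogate)

# Usage: the result is a permutation of x.
x = np.tanh(np.cumsum(np.random.default_rng(0).standard_normal(256)))
surrogate = aaft_surrogate(x, rng=1)
```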

 

Fourier Shuffled Surrogates

 

Unlike the Gaussian-scaled surrogate method described above, these surrogates are generated by reordering the original time series to follow the rank order of an FT surrogate computed directly from the original series.
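
As a rough sketch in Python with NumPy, the Gaussian step is simply skipped and the original series is reordered by the ranks of its own FT surrogate. The function name `fourier_shuffle_surrogate` is hypothetical.

```python
import numpy as np

def fourier_shuffle_surrogate(x, rng=None):
    """Reorder x by the rank order of its own FT surrogate (illustrative sketch)."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    n = x.size
    # FT surrogate computed directly from the original series.
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.size)
    phases[0] = 0.0                                # zero frequency stays real
    if n % 2 == 0:
        phases[-1] = 0.0                           # Nyquist term stays real
    ft = np.fft.irfft(spectrum * np.exp(1j * phases), n=n)
    # Only the rank order of the FT surrogate is used.
    ranks = np.argsort(np.argsort(ft))
    return np.sort(x)[ranks]

# Usage: the result is a permutation of x.
x = np.cumsum(np.random.default_rng(0).standard_normal(256))
surrogate = fourier_shuffle_surrogate(x, rng=1)
```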

 

Windowed Fourier Transformed (WFT) Surrogates

 

This method is similar to the FT surrogate method, except that the original time series is first multiplied by a window function $w(t)$ that vanishes at the endpoints. The rationale for this windowing is that generating FT surrogates may introduce artificial high frequencies caused by the discontinuity between the endpoints of the original time series. The window suppresses them by removing the jump. A side effect, though, is the introduction of a spurious low-frequency component from the power spectrum of $w(t)$ itself. Finally, the WFT algorithm may be used in place of FT in the AAFT and Fourier Shuffled methods.
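
The sketch below illustrates the idea in Python with NumPy. The window is not specified on this page, so a half-sine window is assumed here purely for illustration; the names `sine_window` and `wft_surrogate` are likewise hypothetical.

```python
import numpy as np

def sine_window(n):
    """Half-sine window that vanishes at both endpoints (assumed choice)."""
    return np.sin(np.pi * np.arange(n) / (n - 1))

def wft_surrogate(x, rng=None):
    """Windowed FT surrogate: window the series, then randomize phases (sketch)."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    n = x.size
    windowed = x * sine_window(n)                  # removes the endpoint jump
    spectrum = np.fft.rfft(windowed)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.size)
    phases[0] = 0.0                                # zero frequency stays real
    if n % 2 == 0:
        phases[-1] = 0.0                           # Nyquist term stays real
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)

# Usage: a random walk has a large endpoint mismatch, which the window removes.
x = np.cumsum(np.random.default_rng(0).standard_normal(256))
surrogate = wft_surrogate(x, rng=1)
```

Note that the surrogate reproduces the amplitudes of the windowed series, not of the raw series, which is where the spurious low-frequency component comes from.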

 

Iterated Amplitude Adjusted Fourier Transformed (IAAFT) Surrogates

 

To generate IAAFT surrogates, we first store a sorted list of the values of the time series in question, $\{s_n\}$, and the squared amplitudes of its Fourier transform:

$$S_k^2 = \left| \sum_{n=0}^{N-1} s_n \, e^{i 2\pi k n / N} \right|^2$$

We then begin with a random shuffle of the data $\{s_n\}$, producing $\{s_n^{(0)}\}$, and iterate from there. Each iteration consists of two steps:

(1) Bring $\{s_n^{(i)}\}$ to the desired power spectrum by taking its Fourier transform, replacing the squared amplitudes $\{S_k^{2,(i)}\}$ by $\{S_k^2\}$ while keeping the phases, and then transforming back.

(2) Rank-order the series resulting from step (1) so that it takes exactly the values of $\{s_n\}$. This modifies the spectrum of the result, $\{s_n^{(i+1)}\}$, so the two steps have to be repeated several times. At each iteration, we can check the remaining discrepancy of the spectrum and iterate until a given accuracy is reached. Eventually, the transformation towards the correct spectrum will result in a change that is too small to cause any reordering of the values.
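
The iteration can be sketched as follows in Python with NumPy. The function name `iaaft_surrogate` and the `max_iter` safety limit are assumptions made for this illustration. As a stopping rule, the sketch uses the fixed point mentioned above: it stops once an iteration no longer changes the rank order.

```python
import numpy as np

def iaaft_surrogate(x, max_iter=100, rng=None):
    """Iterated amplitude adjusted Fourier transformed surrogate (sketch)."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    n = x.size
    sorted_x = np.sort(x)                          # stored sorted values
    target_amp = np.abs(np.fft.rfft(x))            # stored Fourier amplitudes
    s = rng.permutation(x)                         # random shuffle as starting point
    prev_ranks = None
    for _ in range(max_iter):
        # Step (1): impose the target amplitudes, keep the current phases.
        phases = np.angle(np.fft.rfft(s))
        adjusted = np.fft.irfft(target_amp * np.exp(1j * phases), n=n)
        # Step (2): rank-order so the series takes exactly the original values.
        ranks = np.argsort(np.argsort(adjusted))
        s = sorted_x[ranks]
        if prev_ranks is not None and np.array_equal(ranks, prev_ranks):
            break                                  # no reordering: fixed point reached
        prev_ranks = ranks
    return s

# Usage: a permutation of x whose spectrum approximates that of x.
x = np.cumsum(np.random.default_rng(0).standard_normal(256))
surrogate = iaaft_surrogate(x, rng=1)
```

Because the loop ends on the rank-ordering step, the empirical distribution is matched exactly and the spectrum only approximately. Ending on step (1) instead would reverse that trade-off.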

 

Summary

 

The null hypothesis dictates the type of surrogate data needed for a valid test. For each of the null hypotheses listed above, this page describes at least one corresponding surrogate generation technique, following the preferences of the related literature.

 

The following table shows original series statistics and how they are preserved in the surrogates according to the type of algorithm used for their generation:

 

Surrogate Type    Mean  Variance  Empirical Distribution  Autocorrelation Function  Disadvantages
Random Shuffle    v     v         v                       x                         Destroys linear correlation
FT                v     v         x                       v                         Spurious high/low frequencies
AAFT              v     v         v                       ~                         Bias to flatter spectrum
Fourier Shuffle   v     v         v                       ~                         Bias to flatter spectrum
IAAFT             v     v         v                       v                         Computational complexity

v = preserved, x = not preserved, ~ = approximate preservation.

 

References

 

Refer to Theiler, Eubank, Longtin, Galdrikian & Farmer (1994), Theiler & Prichard (1996), Schreiber (1999), Schreiber & Schmitz (1996), Schreiber & Schmitz (1996a) and Schreiber & Schmitz (2000) for additional information on this subject.