The idea that variations among mental measures are due to random error is as old as the idea of mental measurement itself. Both Fechner, in his psychophysical studies, and Ebbinghaus, in his study of memory, assumed that the observed score was composed of a true score and error. Both were aware of the work of Gauss and Laplace, and accepted the "normal law" as a "law of nature that applies whenever variability is uncontrolled" (Boring, 1950, p. 285). In this view, variations are produced by uncontrolled independent contributions to performance, distributed normally. For the early experimental psychologists, the ramification of this "normal law" was that the "true" score had to be estimated by calculating an average from a number of repeated measures. With increasing statistical sophistication, variability came to be understood as influencing statistical "power," with larger variability meaning less power. Variance thus became a nuisance, a potential source of Type II errors, and procedural and statistical techniques became necessary to minimize, or "account for," variance.
A nonlinear dynamics perspective challenges, among other things, this conception of variability. We now know that simple deterministic systems can behave in ways difficult to distinguish from random variation, so an observation of apparent randomness is no longer prima facie evidence for the "normal law". As Fred Abraham pointed out some time ago (Abraham, Abraham, & Shaw, 1990), variability then becomes not a "nuisance" but an interest in itself, a legitimate object of study.
But how is variability to be studied? Well, if we make a number of observations (and here we are talking about repeated measures on the same subject), we can treat the data as a time series. There are two kinds of techniques for studying time series: classical techniques, such as Fourier analysis, arising from the traditional linear perspective, and newer nonlinear techniques. A Fourier analysis, such as the FFT, transforms the time domain into the frequency domain and examines the data for periodicity. If the series is periodic, then the resulting power spectrum should reveal peak power at the driving frequency. What do we expect if the variability is due to independent random samples from some probability distribution? The power spectrum should have no slope: all frequencies should be equally evident; the noise should be "white" noise.
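This baseline prediction is easy to check by simulation. The following sketch (Python with NumPy; the function names are ours, for illustration only) generates independent random samples and estimates the slope of the log-log power spectrum, which should come out near zero:

```python
import numpy as np

def power_spectrum(x):
    """Return (frequencies, power) for a real-valued time series via the FFT."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # remove the mean (DC component)
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x))       # cycles per sample, 0 to 0.5
    return freqs[1:], power[1:]           # drop the zero-frequency bin

def loglog_slope(freqs, power):
    """Least-squares slope of log10(power) against log10(frequency)."""
    return np.polyfit(np.log10(freqs), np.log10(power), 1)[0]

rng = np.random.default_rng(0)
white = rng.normal(size=4096)             # independent samples: "white" noise
f, p = power_spectrum(white)
print(round(loglog_slope(f, p), 2))       # should be near zero: a flat spectrum
```

The same two helper routines, applied to an empirical series rather than simulated white noise, yield the kind of spectra and slope estimates discussed in what follows.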
In a study reported this year by Gilden, Thornton, and Mallon (1995), a new finding emerged when cognitive data were analyzed in this way. Gilden et al. were interested in the errors subjects make when estimating time intervals. The question was: How do these time-estimate errors vary? Not simply how they are distributed, collapsed over time, but how they vary in time. Subjects were first presented an example of the target temporal interval and then made a series of estimates, without feedback, by tapping on the space bar of a keyboard. The errors in these successive estimates were then treated as a time series. Each subject produced a number of series, estimating intervals from 10 seconds down to 0.3 seconds (10, 5, 1.5, 1, .5, & .3 seconds).
Those of you who attended last year's Chaos meeting may notice the similarity between the procedures of Gilden et al. (1995) and those used by Ward (1994), who reported a similar study in Baltimore. Ward also had subjects estimate temporal intervals and treated the data as a time series. The difference between the two studies is in how the data were treated. Ward used nonlinear methods designed to distinguish random data from chaos, whereas Gilden used the more traditional, linear, Fast Fourier Transform (FFT) technique. Ward concluded somewhat equivocally that the data might be the product of a nonlinear system, although not a low-dimensional one.
The first thing apparent about the Gilden et al. (1995) data is that they do not vary as white noise, at least not for the longer intervals. White noise would produce a power spectrum, log power plotted against log frequency, with zero slope. Instead the spectra appear to follow 1/f. A 1/f line (a slope of -1 on these log-log axes) is drawn separately, and it seems to fit the data well, at least for the longer intervals. The data for intervals less than about 1.5 seconds show the 1/f slope at the lower frequencies only. At about 0.2 Hz the functions flatten out or turn up.
Gilden et al. developed a model to account for their results. The model assumes that the production of a temporal estimate has two components: an internal clock that mediates the judgment of time passage and a motor component responsible for executing the response. The clock component is assumed to be subject to 1/f noise, while the motor component is assumed to produce white noise. The theoretical fits are accomplished by allowing a single parameter, which reflects the relative contribution of the two noise sources, to vary. Reasonably, the shorter the interval, the greater the contribution of the motor component.
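The two-component structure of this model can be sketched in simulation. The code below is our illustrative reconstruction, not Gilden et al.'s actual model code: the spectral-synthesis method for the 1/f source and the particular weights are our assumptions. It sums a 1/f "clock" series and a white "motor" series, with a single weight playing the role of the free parameter:

```python
import numpy as np

def one_over_f_noise(n, rng):
    """Synthesize 1/f ("pink") noise by shaping white noise in the frequency domain."""
    freqs = np.fft.rfftfreq(n)
    spec = np.fft.rfft(rng.normal(size=n))
    spec[1:] /= np.sqrt(freqs[1:])        # power ~ 1/f means amplitude ~ 1/sqrt(f)
    spec[0] = 0.0                         # zero out the DC component
    x = np.fft.irfft(spec, n)
    return x / x.std()                    # unit-variance clock noise

def simulated_errors(n, motor_weight, rng):
    """Two-source sketch: 1/f clock noise plus weighted white motor noise.
    motor_weight is the lone free parameter (larger for shorter intervals)."""
    clock = one_over_f_noise(n, rng)
    motor = rng.normal(size=n)
    return clock + motor_weight * motor

rng = np.random.default_rng(1)
long_interval = simulated_errors(2048, motor_weight=0.2, rng=rng)   # clock dominates
short_interval = simulated_errors(2048, motor_weight=3.0, rng=rng)  # motor dominates
```

With a small motor weight the spectrum stays close to 1/f throughout; with a large one the white component floods the high frequencies, reproducing the flattening seen for the short intervals.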
It is probably not too strong to say that Gilden et al. discovered 1/f noise in cognitive data. Why is this a notable finding? Because of the ubiquity of 1/f noise (also called "flicker" noise or "pink" noise). As Gilden et al. note, 1/f noise appears throughout physics: in conductors, lasers, magnets, and optics. In fact, 1/f noise appears across the whole range of physical systems, from the quantum level to the behavior of quasars. In addition, 1/f noise has begun to be reported in biological systems (Handel & Chung, 1993; Anderson, Holroyd, Bressler, Selz, Mandell, & Nakamura, 1993).
The fact of its universality "suggests that it does not arise as a consequence of particular physical interactions, but instead is a general manifestation of complex systems" (Gilden et al., 1995, p. 1837). And its discovery in a cognitive task, Gilden et al. argue, offers "the possibility that cognition has formal or physiological organizations that are common to complex dynamical systems" (p. 1839).
Bak (1990) has argued that 1/f noise is symptomatic of "self-organized criticality", by which he means a system operating near a threshold of instability. We can illustrate this using a system familiar to us all, the logistic difference equation. The next figure is the familiar bifurcation diagram for the logistic map.
In the midst of the chaos on the right are seen regions of stability, the most prominent being the 3-point attractor located at about r = 3.83. We select a value of r just short of (on the edge of, at the threshold of) the beginning of that attractor (just short of 3.8284) and initiate the map. What kind of time series do we get? A part of it is shown in the next figure.
We see the system alternates between periods during which it is captured by three points and periods when the behavior is erratic. An FFT of this time series should reveal peaks at high frequencies corresponding to the regular cycling through the three points. But what about the rest of the data? An FFT of this time series (about 3000 points) is shown in the next figure.
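This simulation is easy to reproduce. The sketch below is our code, not the original analysis; the exact value r = 3.8282, the initial condition, and the series length are our choices. It iterates the logistic map just below the tangent bifurcation at 1 + 2*sqrt(2), about 3.8284, where the period-3 window opens, and computes the power spectrum, in which the near-period-3 laminar phases concentrate power near f = 1/3:

```python
import numpy as np

def logistic_series(r, n, x0=0.4, burn=500):
    """Iterate the logistic map x -> r*x*(1 - x), discarding a transient."""
    x = x0
    out = np.empty(n)
    for i in range(n + burn):
        x = r * x * (1.0 - x)
        if i >= burn:
            out[i - burn] = x
    return out

# r just below the tangent bifurcation: intermittent alternation between
# near-period-3 laminar phases and chaotic bursts.
series = logistic_series(3.8282, 3000)

x = series - series.mean()
power = np.abs(np.fft.rfft(x)) ** 2
freqs = np.fft.rfftfreq(len(x))

# The laminar phases concentrate power at f ~ 1/3; the chaotic bursts
# contribute the broad band at the lower frequencies.
peak_bin = np.argmin(np.abs(freqs - 1.0 / 3.0))
```

Fitting a straight line through the low-frequency portion of this spectrum, as described next, is a two-line addition with `np.polyfit` on the logs.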
On the right side of the figure, in the high frequencies, we see the expected peak from the three-point cycles. On the left, presumably contributed by the erratic periods in the time series, we find a negatively sloped band of power. The best-fit straight line through this portion of the power spectrum has a slope near -1. Tickled near a three-point attractor, the logistic map coughs up some pink noise.
Subsequent to the Science report, Gilden informally notified us that he had examined time series in more standard cognitive tasks, including lexical decision and numeric comparison tasks. Preliminary results suggested to him that the slope of the power spectrum depends on the complexity of the task. More generally, the suggestion is that the slope may be described as 1/(f^a), where f is raised to some fractal value, a, near one. Upon receiving his report, we turned to a similar manipulation in a study we had recently completed and had analyzed from a nonlinear perspective (Clayton & Frey, 1995). The conclusion we had reached was similar to Ward's conclusion following nonlinear analyses of his time estimation data, namely that the data might be deterministic, but they were not low-dimensional. It is the re-analysis of those data that constitutes our major empirical report here, to which we now turn.
We had given twelve subjects three different classification tasks, designed to manipulate memory load. On each trial of each task the subject was presented either an X or an O. The tasks differed only in the response rule. In the X-O task, the subject pressed one key for an X, another for an O. This is a simple choice reaction-time task, which is usually accomplished in a half-second or so. Thus, as in Gilden's short-interval data, we suspected that the 'noise' in the motor output might contribute substantially to a spectral analysis.
The X-O task required very little memory; the subject simply needed to remember the response rule. In the next two tasks, however, the subject needed to remember stimuli presented earlier. In the same-different-1 task the subject pressed one key if the stimulus was the same as the stimulus on the previous trial, and the other key otherwise. In the same-different-2 task, the subject pressed one key if the stimulus was the same as the stimulus two trials back, and the other key otherwise. Each task had 500 uninterrupted trials, with task order counterbalanced across subjects. We used 500 trials because the planned nonlinear statistics require lengthy time series. Reaction time was measured on each trial.
The next figure displays the obtained data, using the standard plot of dependent variable vs. independent variable. As expected, average response time, plotted in milliseconds for the three tasks, increases significantly with memory load: fastest for the simple X-O task, on average about half a second, and slowest for the task requiring memory for the last two trials, on average about a second and a half.
That's the conventional analysis. Now we treat the data as time series and examine the power spectra resulting from FFT analysis. Separate FFTs were performed on the three time series for each subject, and the slope was estimated for each power spectrum. Overall, the results agreed with those of Gilden et al., in that the slopes were non-zero and negative. In fact, 33 of the 36 slope estimates (12 subjects x 3 tasks) were negative. The slopes also depended on the task. This is shown in the next figure, which plots the average log(power) at each log(frequency) across subjects, separately for the three tasks.
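The shape of this analysis pipeline can be summarized as follows. This is a sketch using simulated placeholder series, since the individual response-time data are not reproduced here; the real input would be each subject's 500-trial series for each task:

```python
import numpy as np

def spectrum_slope(series):
    """Slope of log power vs. log frequency for one response-time series."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x))
    return np.polyfit(np.log10(freqs[1:]), np.log10(power[1:]), 1)[0]

# Placeholder data: 12 simulated subjects x 3 tasks, 500 trials each.
# (The real series are response times; these stand-ins are white noise.)
rng = np.random.default_rng(2)
slopes = np.array([[spectrum_slope(rng.normal(size=500))
                    for _task in range(3)]
                   for _subject in range(12)])
print(slopes.shape)   # (12, 3): one slope estimate per subject per task
```

Counting negative entries in the resulting 12 x 3 array is then a one-liner, `(slopes < 0).sum()`, which for the actual data gave the 33 of 36 reported above.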
This figure is "busy", so let's look at the power spectra for the three tasks separately, starting with the X-O task on the bottom (black plusses). This is the simplest task, accomplished on average in about 500 milliseconds. Like Gilden's faster tasks, its power spectrum appears to have two components: a flat, white noise component at the high frequencies and a negative slope at the low frequencies. We have fit separate straight lines to the regions above and below log(frequency) = -1. The location of this elbow is about the same as in Gilden's fits. Unlike Gilden's time estimation data, however, the slope for the low frequencies is shallower than -1, equaling -.46 here. The noise is almost-pink.
For the moment skipping over the middle points in the figure, at the top you will see the power spectrum for the task which required memory for stimuli two trials back (green circles). This was the slowest task, requiring on average 1500 msec. These data resemble those for the longer temporal estimates in Gilden's task. They seem well-described by a single line; little evidence is apparent for a white noise component. Again, however, the slope of the function is shallower than -1 (equaling -.255), though still significantly non-zero.
The data for the middle task (red diamonds) fall between those of the other two. The overall slope is significantly non-zero, and there is a suggestion of two components, but note that the bend is not as marked as for the X-O task. We have not fit two lines, in view of the lack of clarity. Suffice it to say, the results are intermediate.
Our final data analysis uses a technique that we have adopted as a standard control analysis. We take each time series and shuffle it randomly. This procedure maintains the distributional properties of the data but destroys any trial-to-trial relations; in short, it should make white noise of our data. We then repeated the FFT analysis on the shuffled data. The results are shown in the next table, which reports the mean and standard deviation of the slopes of the power spectra for the shuffled data. The fact that these slopes are indistinguishable from white noise gives us confidence that our earlier analyses are not artifactual.

Shuffled        Average Slope    Std. Dev.
X-O                 -.010           .064
Same-Diff-1         -.042           .100
Same-Diff-2         -.068           .107
Table 1. Mean (and standard deviation) slopes of power spectra for the three tasks (shuffled data).
What do we conclude so far? Our analyses confirm the utility of applying the FFT to time series produced by repeated measures on cognitive tasks. The results are inconsistent with the conventional view that variability in repeated measures is due to a large number of independent random variables, that is, to white noise. Like Gilden et al., we found a white noise component in the power spectrum for the task with short response times. But all tasks also showed a noise component with slope between 0 and -1, between white and flicker noise. We also found, for certain values of the control parameter, that the logistic map produces time series containing a 1/f noise component. All of this encourages us to continue the consideration and development of deterministic models to account for trial-to-trial variability in cognitive tasks, at least for tasks producing relatively long response times.
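The shuffle control itself is simple to implement. In the sketch below (our code; the synthetic input series, built to have a slope between white and pink like our task data, stands in for an actual response-time series), shuffling leaves the distribution of values untouched while flattening the spectrum:

```python
import numpy as np

def spectrum_slope(series):
    """Slope of log power vs. log frequency for one series."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x))
    return np.polyfit(np.log10(freqs[1:]), np.log10(power[1:]), 1)[0]

rng = np.random.default_rng(3)

# Synthetic stand-in for one 500-trial series: power ~ f^(-0.5),
# i.e. a slope between white (0) and pink (-1), as in our tasks.
freqs = np.fft.rfftfreq(500)
spec = np.fft.rfft(rng.normal(size=500))
spec[1:] /= freqs[1:] ** 0.25
spec[0] = 0.0
series = np.fft.irfft(spec, 500)

shuffled = rng.permutation(series)   # same values, trial-to-trial order destroyed
print(spectrum_slope(series), spectrum_slope(shuffled))
```

Shuffling should drive the slope toward zero, which is exactly the pattern Table 1 reports for the real data.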
As an initial step in this direction we have begun to study power spectra for the logistic map at different values of the control parameter r. Our report ends with a very preliminary, but to us provocative, finding. Consider first the chaos produced when r=4. We already know that an FFT should be unable to distinguish this chaos from white noise. This point is shown in the FFT we performed on a time series of 3000 trials.
As expected, the power spectrum has essentially zero slope. But suppose we back off from this chaotic extreme and examine r = 3.97. The resulting time series (3000 trials), and the histogram of its values, resemble those produced by r = 4. But its power spectrum does not, as shown next.
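This comparison is easy to replicate. The sketch below is our code, not the original analysis; the initial condition, burn-in, and the 0.1 cutoff for "low frequency" are our choices. It estimates the low-frequency slope of the power spectrum for r = 4 and for r = 3.97:

```python
import numpy as np

def logistic_series(r, n, x0=0.4, burn=500):
    """Iterate the logistic map x -> r*x*(1 - x), discarding a transient."""
    x = x0
    out = np.empty(n)
    for i in range(n + burn):
        x = r * x * (1.0 - x)
        if i >= burn:
            out[i - burn] = x
    return out

def low_freq_slope(series, f_max=0.1):
    """Slope of the log-log power spectrum, restricted to frequencies below f_max."""
    x = series - series.mean()
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x))
    keep = (freqs > 0) & (freqs < f_max)
    return np.polyfit(np.log10(freqs[keep]), np.log10(power[keep]), 1)[0]

print(low_freq_slope(logistic_series(4.0, 3000)))    # fully chaotic: should be near zero
print(low_freq_slope(logistic_series(3.97, 3000)))   # the backed-off case discussed above
```

Sweeping r over a grid with the same two functions would answer the open question raised next: for how many values of r does the low-frequency band reappear?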
What to us is provocative about this spectrum is the reappearance of the broad-band, almost-pink noise in the lower frequencies. The ease with which we found this power spectrum, which superficially resembles both the two-component memory-load spectra and the "edge of periodicity" spectrum, suggests that it appears for many values of r. But we do not yet know. In any case, it seems clear that we need to know more about the presence and kinds of noise in cognitive and other deterministic systems.
References

Abraham, F., Abraham, R., & Shaw, C. D. (1990). A visual introduction to dynamical systems theory for psychology. Santa Cruz, CA: Aerial Press, Inc.

Bak, P. (1990). Self-organized criticality. Physica A, 163, 403-409.
Boring, E. G. (1950). A History of Experimental Psychology. New York: Appleton-Century-Crofts.
Clayton, K. & Frey, B. (May, 1995). A Nonlinear Dynamics Approach to Studying Short Term Memory. Paper presented to the Annual Convention of the Midwestern Psychological Association. Chicago, IL.
Gilden, D. L., Thornton, T., & Mallon, M. W. (1995). 1/f noise in human cognition. Science, 267, 1837-1839.
Handel, P. H. & Chung, A. L. (Eds.) (1993). Noise in Physical Systems and 1/f Fluctuations. AIP Conference Proceedings 285. New York: American Institute of Physics.
Ward, L. M. (June, 1994). Hypothesis testing, nonlinear forecasting, and the search for chaos in psychophysics. Presented at the Fourth Annual International Conference of the Society for Chaos Theory in Psychology and the Life Sciences. Baltimore, MD.