Independent Sampling vs. Inter-Item Dependencies in Whole Report Processing: Contributions of Processing Architecture and Variable Attention

Thomas A. Busey

and

James T. Townsend

Indiana University

 

 

 

Please send correspondence to: Thomas A. Busey

Department of Psychology

Indiana University

Bloomington, IN 47405

email: busey@indiana.edu

 

Abstract

All current models of visual whole report processing assume independence among the displayed items. This assumption, combined with exponential growth curves, forms what we term the Modal Model Class. A recent example is the Independent Sampling Model of Loftus, Busey & Senders (1993). The fundamental independence assumption has been tested only once before (Townsend, 1981), where tests revealed no dependencies except those produced by guessing. The present study provides new tests of the independence assumption and finds significant positive dependence, which grows with the number of other correctly reported items on a given trial that are conditioned on. Poisson models predict a positive dependence, and we develop a succinctly parameterized version, the Weighted Path Poisson Model, that allows the finishing order to be a weighting probabilistic mechanism. However, it does not predict the data quite as well as another model, the Variable Attention Model, which retains independence within trials (unlike the Poisson models). This model assumes that attention (or potentially other aspects, such as signal quality) varies widely across trials, thus predicting an overall positive dependence. Intuitions for and against the competing models are discussed, and it is shown, through mimicking formulae, that models that contain the proper qualitative type of dependence structure can be cast in either serial or parallel form.

The advent of the information processing approach in the late 1950’s and 1960’s led not only to the isolation and analysis of new processing subsystems and studies of their action and interaction, but also to the development of mathematical models of a variety of these systems and mechanisms. One of the earliest and most prominent to receive attention was the short-term memory paradigm known as "whole report" or "full report", in which the subject was instructed to report verbally as many unrelated items as she could from a visual display. Sperling’s (1960) classic investigations combining his partial report paradigm with various manipulations and analyses on that and whole report data, separated the early visual information storage (to be known as the icon) from subsequent short-term memory and rehearsal mechanisms. The whole report paradigm has continued to be important in its own right and models of this stage have been concatenated with other perceptual and cognitive processes.

Although early general opinion seemed to be that perception of the displayed items in whole report took place in a serial manner, Sperling himself (e.g., 1967) and most later investigators came to favor a parallel interpretation in the perceptual stages of processing. A notable early example of this was Rumelhart’s Multicomponent Model (1970) which accommodated many of the basic whole report phenomena but also was able to predict a number of other associated aspects of whole and partial report. Within the whole report paradigm, the Multicomponent Model predicted independence of item processing and consequent independence of performance on the separate items. However, that important assumption was not tested at the time.

Since that time, a number of other models of whole report behavior have appeared. All of the quantified models with which we are familiar, outside some of the models tested by Townsend (1981), assume independence in the early stages, on the separate displayed items. Further, all except the Multicomponent Model also assume that probability correct at any signal location is proportional to an exponential function. Note that an exponential growth law of this type is not equivalent to assuming the exponential distribution, as will be shown below. Examples of members of the class include models by Massaro (1970), Townsend (1981), Shibuya & Bundesen (1988), Gegenfurtner & Sperling (1993), and Loftus, Busey & Senders (1993). Because of their shared characteristics, we refer to all models based on independence and exponential growth as members of the Modal Model Class. They are not the same model, because each was based on somewhat different additional assumptions. Space precludes detailed treatment of all these models, but below we briefly describe how Townsend's Bounded Performance Model embodies the modal assumptions as well as how it deals with issues of capacity. This model shows in what sense exponential growth can appear without an explicit exponential distribution. Another version of this concept will be found below in the description of the Loftus et al. (1993) Independent Sampling Model. In addition, it is of general interest to observe a way in which bounds on performance, apparently needed by data in experiments with five or more letters, can be incorporated in models, and that independence can be satisfied even with a bound of this nature (for more discussion on capacity and its relation to the dependence issue, see Appendix A).

Basically, the Bounded Performance Model (Townsend, 1981) posited that growth of the available stimulus information $I(t)$ proceeded according to the elementary differential equation $dI(t)/dt = [I - I(t)]V$, with the solution $I(t) = I\{1 - \exp[-(t - t_0)V]\}$, where $I = \lim_{t \to \infty} I(t)$, i.e., the asymptotic level of information. $V$ is the rate of growth, depending on stimulus parameters and the subject's sensitivity, and $t_0$ is the 'lift-off' time at which information begins to be accumulated. It was assumed that the subject has available an amount of capacity $C$ that is allocated, perhaps unevenly, to the $n$ displayed items. From results of experiments that vary the display size (i.e., processing load), we would predict that $C$ would generally not expand in proportion to that size, demonstrating that processing is limited in capacity. However, Townsend (1981) and the present experiments hold the load constant. The parameter $a_i$ represents the proportion of capacity devoted to item location $i$, and it is postulated that $\sum_{i=1}^{n} a_i = 1$. Accuracy is governed by $P(\text{correct on location } i \text{ at time } t \ge t_0) = P(C_i, t) = \min[(I(t)/I)\, a_i C,\ 1] = \min\{a_i C [1 - \exp[-(t - t_0)V]],\ 1\}$. Thus, probability correct is basically the product of the allocated capacity and the available stimulus information. Note that the expected number correct (assuming, for simplicity, that none of the positions is at ceiling) equals $\sum_{i=1}^{n} a_i C \{1 - \exp[-(t - t_0)V]\} = C\{1 - \exp[-(t - t_0)V]\}$, with upper bound $C$. This bound does not imply any kind of positive or negative dependence.
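The accuracy rule of the Bounded Performance Model can be sketched numerically. The following is only an illustration of the expression Min{a_i C [1 - exp(-(t - t0)V)], 1} given above; the function names, the parameter values, and the choice to return zero before lift-off (guessing is ignored here) are assumptions made for the example, not part of Townsend's (1981) formulation.

```python
import math


def bounded_performance_p_correct(t, a_i, capacity, rate, t0):
    """P(correct at location i) = min(a_i * C * (1 - exp(-(t - t0) * V)), 1).

    Assumption for this sketch: no information is available before the
    lift-off time t0, and guessing is not modeled.
    """
    if t < t0:
        return 0.0
    growth = 1.0 - math.exp(-(t - t0) * rate)  # I(t) / I, the normalized information
    return min(a_i * capacity * growth, 1.0)


def expected_number_correct(t, allocations, capacity, rate, t0):
    """Sum of the per-location probabilities for one display."""
    return sum(
        bounded_performance_p_correct(t, a, capacity, rate, t0)
        for a in allocations
    )


# Hypothetical values: four locations, capacity C = 3 items, V = 0.02 per ms.
allocations = [0.3, 0.25, 0.25, 0.2]
for duration in (40, 100, 200, 1000):
    print(duration, round(expected_number_correct(duration, allocations, 3.0, 0.02, 30.0), 3))
```

With these illustrative values no location reaches ceiling, so the expected number correct approaches the capacity C rather than the display size as duration grows.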

In contrast to the Bounded Performance Model, the Multicomponent Model predicted that performance would ultimately increase to perfection as duration increased. Performance in a five-consonant whole-report design in the Townsend study was strongly bounded and could not be captured by the Multicomponent Model. However, Townsend’s analyses strongly supported independence of processing, in particular over both a fixed size buffer model (the Fixed Sample Size Model), which predicted a negative dependence and a Poisson visual-to-phonological translation model which predicted a positive dependence.

As another model included in the Modal Model Class, the Fixed Capacity Independent Race Model of Shibuya & Bundesen (1988) is based on independent exponential random variables. Thus, accuracy can, in principle, approach perfection as t gets large. However, in that model, completed items are delivered to a fixed storage capacity memory, apparently identical to that tested by Townsend (1981). The latter predicts negative dependence as noted above. A later version (Bundesen, 1990), allows a mask to terminate processing and potentially lead to an (imperfect) upper bound on performance, even without the subsequent fixed capacity memory. However, this model still predicts negative dependencies, if the input to the storage buffer exceeds its fixed capacity.

The member of the Modal Model Class on which we focus in this paper was developed by Loftus and his colleagues in several recent articles (Loftus, Duncan, & Gehrig, 1992; Loftus & Busey, 1992; Loftus & Ruthruff, 1993; Loftus, Busey & Senders 1993; Busey & Loftus, 1994). They proposed the Independent Sampling Model of visual information processing to account for performance in a four-item whole report task. Recently, that model has been expanded to include linear filtering mechanisms that address the nature of the initial time-varying sensory representation from which information is extracted. However, it is the model's original form that we are most interested in, and below we describe its major assumptions.

The Independent Sampling Model (also termed the Random Sampling Model) describes information processing as the acquisition of stimulus features through a process that randomly samples features from the stimulus with replacement, at a rate that remains constant through the stimulus display. This time-invariant rate represents the raw feature-sampling rate. Because features are sampled with replacement, the rate of acquisition of new features is proportional to one minus the proportion of already-acquired features. These acquired features subsequently form the basis for the sensory representation. Because sampling occurs with replacement from the features in the display, the result is a theory that predicts that the rate of acquisition of new features is exponentially distributed. If sampled in discrete time, the distribution would be geometric, an approximation to the exponential. However, if the sampling occurs in continuous time according to a Pure Death Process, the expected proportion of features sampled is exactly exponential (see, e.g., McGill, 1963, pp. 343-346). Now let X = {0,1} be a random variable designating incorrect (X=0) or correct (X=1). The random variable notation will be useful for later formulas. This conceptualization, with an additional assumption relating the proportion of acquired features to proportion of correctly-recalled items, relates stimulus duration to performance on each item via the expression:

$P(\text{Correct}) = P(C) = P(X=1) = \left(1 - e^{-(d-L)/c}\right) + g\,e^{-(d-L)/c}$    Eq. 1

where P(C) represents proportion-correct performance, g represents the guessing rate, d represents stimulus duration, 1/c represents the raw feature-sampling rate, L (for "Liftoff") represents the pre-processing delay, and the exponential captures the sampling-with-replacement property of the Independent Sampling Model.
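As a concrete reading of Eq. 1, the sketch below evaluates the predicted proportion correct for a given duration. The function name and the parameter values are hypothetical, and returning the guessing rate for durations at or below L is an assumption of this example (Eq. 1 itself is stated for durations beyond the liftoff delay).

```python
import math


def independent_sampling_p_correct(d, c, liftoff, g=0.1):
    """Eq. 1: P(C) = (1 - exp(-(d - L) / c)) + g * exp(-(d - L) / c).

    d       : stimulus duration (ms)
    c       : time constant, so that 1/c is the sampling rate
    liftoff : pre-processing delay L (ms)
    g       : guessing rate (1/10 for digits)
    """
    if d <= liftoff:
        return g  # assumption: pure guessing before any sampling has begun
    not_acquired = math.exp(-(d - liftoff) / c)
    return (1.0 - not_acquired) + g * not_acquired


# Hypothetical parameter values, for illustration only.
for duration in (40, 80, 120, 200):
    print(duration, round(independent_sampling_p_correct(duration, c=60.0, liftoff=35.0), 3))
```

The two terms correspond to acquiring the item through sampling and to guessing it correctly when it was not acquired.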

Note that the form of exponential growth here is analogous to Townsend's Bounded Performance Model, not an exponential distribution in the usual sense. The first part of Eq. 1 represents the probability of obtaining the digit via the random sampling mechanism, while the second represents the probability that the item is correctly guessed if it is not obtained from the random sampling mechanism. The guessing rate g equals 1/10, since in the study all stimulus items are sampled with replacement from the ten digits. Although the notion of a bound on performance was not an integral part of this model, some provision was permitted for that due to ancillary disturbances, such as eye movements, lapses of attention, etc. Observe that Eq. 1 does not exhibit position effects, but that characteristic is easily relaxed as in Townsend's (1981) model.

At least as important as the exponential growth assumption in the Modal Class is the axiom of independence. Investigation of the dependency structure in a task takes us a step deeper into the architecture of processing mechanisms and increases our understanding of the details of the time-dependent processing characteristics. For instance, continued findings of independence (as in Townsend, 1981) argue not only for separate independent channels of perceptual processing, but also indicate that subsequent stages (in parallel or in serial) of processing do not seriously compromise that independence. Furthermore, negative or positive dependencies could lead to refined notions not only of correlations in the actual perceptual channels but possibly in the architecture and dynamisms of later translation, memory and output stages.

For these reasons, we decided to analyze data previously collected by Loftus, Busey & Senders (1993) to see if the independence findings of Townsend (1981) would be replicated, and if not, what sort of relatively simple model might explain the dependencies. In the Loftus et al. (1993) study, subjects viewed briefly-presented stimuli consisting of 4 randomly-chosen digits (chosen with replacement) presented using a slide-projector based laboratory. The digits were low contrast (~7.5%) and viewed under mesopic luminance conditions. The digits were shown for one of 8 stimulus durations ranging from 40 to 200 ms, and were immediately postmasked by a pattern mask. The subject typed in the 4 digits on a keypad, starting from the left-most digit and working to the right. If an item was correct but in the wrong location, it was counted as wrong. If a digit was not perceived, the subject was instructed to guess. The primary dependent measure was the proportion of correctly-recalled digits for each stimulus duration, but, as we will demonstrate below, other statistics can be computed.

Performance in the whole-report task is computed as the number of correctly recalled digits, and averaged within each stimulus duration. These values are plotted against stimulus duration to get a marginal performance curve. Figure 1 shows data from three observers in the original Loftus, Busey & Senders (1993) Experiment 1. The long-dashed curves represent the Independent Sampling Model prediction for this task, while the short-dashed curve represents the prediction from a particular model based on the Poisson distribution, to be described below. Estimated parameter values for these and other models are found in Table 1.

Insert Figure 1 and Table 1 About Here

We should note that attention does not vary across trials in the current version of the Independent Sampling Model. However, in a subsequent section we are forced to add this assumption to the model.

The remainder of the article is organized as follows. The hallmark of the independent sampling assumption is that there are no inter-item dependencies; that is, the probability of acquiring one item is independent of the probability of acquiring some other digit. We demonstrate that the data contain positive dependencies, which are inconsistent both with the independent sampling assumption and with models that predict negative dependencies, such as the Shibuya & Bundesen model (1988; Bundesen, 1990) or the fixed size buffer model of Townsend (1981). To account for these dependencies, we develop and extend a class of Poisson models that can also account for the marginal data shown in Figure 1. We then demonstrate how one of the Poisson models provides a better account of these data than the Independent Sampling Model. Finally, we show how a variant of the Independent Sampling Model that assumes that attention varies across trials can, by assuming a very large variability in attentional (and/or stimulus quality) states, also account for the positive dependencies. This account, as we will see, relies on rather extreme and apparently independent shifts in attention that seem contrary to the traditional view of attention varying gradually across trials.

Evidence for Inter-Item Dependencies

The Independent Sampling Model is a parallel model in which information is sampled from the display at random, with replacement, and at some rate 1/c. As a result of the parallel independent nature of the model, it predicts no dependencies between items from different columns. For example, the probability of getting the first digit correct is independent of getting the second digit correct. Evidence about such dependencies allows much stronger tests of models than the marginal data seen in Figure 1. A first step, therefore, is to determine whether inter-item dependencies exist in the data, as described below. If we find evidence for dependencies, these can be further explored using several statistics described in subsequent sections. Finally, the Independence and Poisson models will be tested using traditional log-likelihood analyses.

Independence Analyses

A non-parametric test of the dependencies between items is described by Kullback (1968), which provides for tests of independence within a single stimulus duration but makes no assumptions about the growth curve. Kullback (1968) computes an Independence statistic that is similar to a Chi-Square test, which places the result of each trial in one of 16 cells according to whether the digit in each of the four positions was correct or incorrect. These counts are indexed as $x_{ijkl}$, where $i$ through $l$ index positions 1 through 4 of the 4 digits. For example, suppose the observer produces a result of {0, 1, 1, 1}. Thus $i = 0$ and $j = k = l = 1$. For this trial we would increment $x_{0111}$ by 1 to reflect this outcome. The independence statistic is:

Eq. 2

where $x_{i\cdot\cdot\cdot}$ is the number of trials with a given state $i$ (correct or incorrect) at position 1, summed over the correct and incorrect outcomes at positions 2-4. The other terms represent similar marginal statistics for the other positions. N represents the total number of trials in the experiment at a given exposure duration. This statistic is distributed as Chi-Square with 11 degrees of freedom for the 4 digit stimuli. The critical value for $\alpha$ = 0.05 is 19.67.

Such a statistic must be computed separately for different stimulus durations, since dependencies will be produced if trials are mixed from different exposure durations. A disadvantage of this statistic, common among chi-square-type analyses, is that a marginal in the denominator of Eq. 2 might be zero, making the term undefined. When this occurred we removed the term from the summation. Doing so reduces the Independence statistic below its true value, which makes the test more conservative, since the statistic must exceed the critical value in order to demonstrate evidence of dependencies.
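To make the procedure concrete, the sketch below computes an independence statistic from the 16 cell counts for one stimulus duration. The exact form of Eq. 2 is not reproduced here; the code assumes the information (log-likelihood-ratio) form, 2 Σ x_ijkl ln[N^3 x_ijkl / (x_i... x_.j.. x_..k. x_...l)], which has the marginal products in the denominator and 11 degrees of freedom for four binary positions. The function name and the example counts are hypothetical.

```python
import math
from itertools import product


def independence_statistic(counts):
    """Assumed information-statistic form of the four-way independence test.

    counts maps each pattern (i, j, k, l), with entries 0 (incorrect) or
    1 (correct), to the number of trials showing that pattern.
    """
    n = sum(counts.values())
    # marginal[pos][state]: trials with the given state at the given position
    marginal = [[0, 0] for _ in range(4)]
    for pattern, x in counts.items():
        for pos, state in enumerate(pattern):
            marginal[pos][state] += x

    total = 0.0
    for pattern in product((0, 1), repeat=4):
        x = counts.get(pattern, 0)
        denom = 1
        for pos, state in enumerate(pattern):
            denom *= marginal[pos][state]
        if x == 0 or denom == 0:
            continue  # empty or undefined terms are dropped from the sum
        total += x * math.log(n ** 3 * x / denom)
    return 2.0 * total


CRITICAL_VALUE = 19.67  # chi-square, 11 df, alpha = 0.05 (value given in the text)

# Hypothetical counts: every trial is either all wrong or all correct.
example = {(0, 0, 0, 0): 50, (1, 1, 1, 1): 50}
print(independence_statistic(example) > CRITICAL_VALUE)
```

Dropping terms can only remove non-negative contributions when a marginal is empty, which is why the text describes the resulting test as conservative.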

Table 2 lists the computed independence values for the three observers. Values that exceed significance are listed in bold type.

Insert Table 2 About Here

The independence statistics in Table 2 demonstrate significant inter-item dependencies for all three observers. Short stimulus durations produce chance performance and thus no dependencies are to be expected, while Observer EF's performance is near 1.0 for long stimulus durations, making inter-item dependencies impossible. However, at moderate stimulus durations (relative to each observer) we see clear evidence of inter-item dependencies.

A disadvantage of the independence statistic is that it does not directly reveal the direction of the dependencies or give much sense about what is happening to the dependencies as, say, display duration varies or the number of other items correct increases in the joint probability. Two statistics that partially fulfill this need are described below. These statistics are intended to provide an intuitive summary of the dependency structure, and in a subsequent section we will provide log-likelihood tests that exhaustively test the dependencies in the data. These latter tests will provide the best techniques for discriminating between models.

Alternative Candidate Models

We consider two alternative classes of models, that is, opposing the independence assumption, that might account for the dependencies found in the previous analyses. The first class, drawn from the Poisson family of distributions, is typically interpreted as a serial mechanism in which attention moves a single processing mechanism in some fixed serial order through the processing locations. Although originally suggested as a serial translator from iconic visual information storage to an acoustic buffer (Townsend, 1981; as suggested qualitatively by Sperling, 1967), we will see it can also accommodate parallel interpretations. A second type of model, our example being the Variable Attention Model, extends the parallel processing nature of the Independent Sampling Model, by assuming that either the processing capacity (as in attentional states) or the quality of the incoming information, varies across trials. At the beginning of a trial, a sample is taken from the distribution on capacity, which then stays fixed throughout that trial. That amount of capacity can then be distributed, perhaps in an uneven manner, across the various item locations. However, it is assumed that the ratios of capacity across the item locations are constant across trials, implying that the sample capacity on that trial simply scales up or down the average capacity at each location. For instance, if the observer is in a high attentional state for a given trial, processing will speed up at all four stimulus locations by an amount that is proportional to their average processing rate.
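A small numerical sketch may help show why across-trial variability produces positive dependence even when items are independent within a trial. It assumes, purely for illustration, a discrete set of attentional states and the same probability correct at every location within a state (the model described above also allows uneven allocation across locations). The function name and the state values are hypothetical.

```python
def dependence_from_variable_attention(states):
    """Return P(Ci | Cj) - P(Ci) for two different locations i and j.

    states is a list of (probability_of_state, p_correct_in_state) pairs.
    Within a state the two items are independent, so the joint probability
    in that state is p * p; mixing over states creates the dependence.
    """
    p_marginal = sum(w * p for w, p in states)
    p_joint = sum(w * p * p for w, p in states)
    return p_joint / p_marginal - p_marginal


# Hypothetical example: attention is low or high with equal probability.
print(dependence_from_variable_attention([(0.5, 0.2), (0.5, 0.8)]))  # positive
# A single fixed state reproduces the independence prediction of zero.
print(dependence_from_variable_attention([(1.0, 0.5)]))
```

The difference equals the across-trial variance of the per-state accuracy divided by its mean, so it is zero only when attention does not vary.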

While these models may be given different architectural interpretations, one way to think about the models is in terms of how attention varies during processing. In the original Independent Sampling Model, attention is fixed across trials, but can vary for the different stimulus locations. The Poisson models can be thought of as attention moving a single process from one location to the next in a serial fashion (see Appendix B for another interpretation). In addition the Variable Attention Model views capacity as a resource (attention, stimulus quality, etc.) that varies across trials, but remains fixed within a trial. A summary of the models and their underlying assumptions is found in Table 3.

Insert Table 2 About Here

Below we develop the Fixed Path Poisson Model, and then demonstrate how it can account for several conditional probability statistics that indicate the existence of positive inter-item dependencies. We anticipate that the distribution on processing paths, in this model, has no effect on dependencies. This is because what matters is the overall production rate of item completions, not the particular path taken (e.g. see Townsend & Ashby, 1983, pp. 68-76; also see Appendix A). In any case, these statistics are meant to provide some intuition concerning how various models provide positive or negative dependencies and in data, reveal not only the sign of dependence, but how the magnitude changes as a function of average accuracy (implicitly, as a function of exposure duration) and in one statistic, as a function of number of other items perceived correctly. A subsequent section relies on more traditional log-likelihood analyses to discriminate between models. The latter strategy is more sensitive to all dependencies but is not so adept at revealing the direction of dependencies or possessing a great deal of intuition.

The Fixed Path Poisson Model

Conceptually the Fixed Path Poisson Model is quite simple: upon stimulus onset, an observer begins viewing the first digit. The time to acquire enough information to correctly report a digit is exponentially distributed. Once the first digit is acquired, processing begins on the second digit, and continues through to the fourth digit. Processing ends upon stimulus offset or the completion of the fourth digit. If necessary, the observer then guesses the identity of any digits that have not yet been perceived.

The observer always moves from left to right, which identifies this Poisson Model as the Fixed Path Poisson Model. For instance, in some studies (e.g., Shibuya & Bundesen 1988) performance declines from left-to-right for horizontal linear arrays, although with larger arrays other factors such as lateral interference may introduce non-monotonicities (see Townsend, 1981). This type of result suggests the possibility of a serial mechanism with a preferred (perhaps fixed) order of processing, as instantiated in the Fixed Path Poisson Model. The strict left-to-right processing assumption is, of course, very strong, and by looking at the probability of a correct item beyond the first error, we will disconfirm the Fixed Path Poisson Model. In a later section, as noted, we develop a version of the Poisson model that allows the observer to take different paths through the four digits. Because extended models rely on the same basic structure as the Fixed Path Poisson Model, we develop this model below and extend it in a subsequent section. Appendix B discusses how the particular serial and parallel models under discussion can be mimicked by models of other architectures.

Townsend (1981) has derived predictions for the Fixed Path Poisson Model. The probability that a digit in the ith position is reported correctly is

$P(C_i) = \frac{1}{10}\sum_{j=0}^{i-1} P_j(t) + \sum_{j=i}^{\infty} P_j(t)$,    Eq. 3

where

$P_j(t) = \frac{[\lambda (t - t_0)]^{j}\, e^{-\lambda (t - t_0)}}{j!}$    Eq. 4

is the Poisson probability that j items have been completed by time t. The parameter $\lambda$ is the Poisson rate parameter. The parameter $t_0$ is a processing delay parameter, similar to the Liftoff parameter of Eq. 1. The first sum in Eq. 3 is multiplied by 1/10 to account for unperceived digits that are correctly guessed. This model has two free parameters, $\lambda$ and $t_0$, and when they are set to the best-fitting values shown in Table 1, they generate the short-dashed curves in Figure 1. The root-mean-squared-error (RMSE) was computed for the Independent Sampling Model and the Fixed Path Poisson Model to evaluate the goodness of each fit. Durations below the processing-delay parameter were not included in the evaluation of the models.
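The marginal prediction of the Fixed Path Poisson Model can be sketched as follows. The code assumes that the digit in position i is reported correctly either because at least i items were completed by the Poisson process, or because fewer than i were completed and the digit was guessed with probability 1/10. The function names and parameter values are hypothetical, and treating durations at or below the delay as pure guessing is an assumption of this example.

```python
import math


def poisson_prob(j, rate, t, t0):
    """Poisson probability that exactly j items are completed by time t."""
    if t <= t0:
        return 1.0 if j == 0 else 0.0  # assumption: nothing completes before t0
    mean = rate * (t - t0)
    return mean ** j * math.exp(-mean) / math.factorial(j)


def fpp_p_correct(position, rate, t, t0, g=0.1):
    """Marginal P(correct) for a 1-based serial position under left-to-right order."""
    p_fewer = sum(poisson_prob(j, rate, t, t0) for j in range(position))
    # Fewer than `position` completions: the digit must be guessed.
    # Otherwise it was reached and completed by the serial process.
    return g * p_fewer + (1.0 - p_fewer)


# Hypothetical parameters: rate of 0.02 items per ms, 30 ms delay.
for pos in range(1, 5):
    print(pos, round(fpp_p_correct(pos, rate=0.02, t=120.0, t0=30.0), 3))
```

Under these assumptions accuracy declines monotonically from the first to the fourth position, which is the serial position pattern that the strict left-to-right order implies.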

Each of the Independent Sampling Model fits in Figure 1 has a slightly larger RMSE than the associated Fixed Path Poisson Model fit. Based on the analysis of the marginal probabilities alone the Fixed Path Poisson Model gives a somewhat better account of the data. However, the fact that two quite different models should give such similar predictions gives credence to our suggestion that analysis of the marginal probability predictions may not always discriminate well between candidate models.

Conditional Probability Analysis

Our conditional probability statistics consider the probability that a digit is reported correctly, given that one or more other digits in the same trial are reported correctly. A positive dependency implies that knowing that a digit was correct increases the likelihood that another digit was also correct. A negative dependency implies that knowing that a digit was correct decreases the likelihood that another digit was also correct. Townsend (1981, 1983) has proposed two statistics that quantify the degree and direction of inter-item dependency present. These statistics do not require explicit parameter estimation. They compute two kinds of conditional probabilities that proved useful in the Townsend (1981) study for testing three principled models predicting qualitatively and quantitatively different dependencies. These summary statistics can provide an intuitive overview of the nature of the dependencies and we have found them helpful in model development. Although they do not exhaust all of the dependency information in the data, they complement the log-likelihood tests of the complete dependency set discussed in a later section. In addition, it is obvious that if a model cannot predict these summary dependency statistics, it will fail on the complete dependence analyses.

The first statistic, Ave[P(Ci|Cj)-P(Ci)], i ≠ j, is computed and averaged over i and j for each separate exposure duration. The size of P(Ci|Cj) depends in part on the exposure duration and the observer's overall information acquisition rate. The parameters that vary accuracy in our studied models combine with the time variable in an inextricable manner. For example, no dependencies would be produced if performance was at chance or at ceiling. However, by plotting the curve of the above difference statistic as a function of time, we produce non-parametric curves and therefore parameter-free tests (since, in particular, our range of time durations went from chance to perfection). As stated earlier, exposure durations that generate near-chance or near-ceiling performance cannot show inter-item dependencies, while some models predict positive or negative dependencies for moderate exposure durations.
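One way to compute this first statistic from the response counts at a single exposure duration is sketched below. The function name and the example counts are hypothetical; the averaging over all ordered pairs of different positions, and the skipping of pairs for which the conditioning event never occurred, are assumptions about details the text leaves open.

```python
from itertools import permutations


def average_conditional_difference(counts):
    """Average of P(Ci | Cj) - P(Ci) over ordered pairs of positions, i != j.

    counts maps each 4-tuple of 0/1 outcomes (1 = correct) to a trial count.
    """
    n = sum(counts.values())
    diffs = []
    for i, j in permutations(range(4), 2):
        n_i = sum(x for pat, x in counts.items() if pat[i] == 1)
        n_j = sum(x for pat, x in counts.items() if pat[j] == 1)
        n_ij = sum(x for pat, x in counts.items() if pat[i] == 1 and pat[j] == 1)
        if n_j == 0:
            continue  # P(Ci | Cj) is undefined when position j was never correct
        diffs.append(n_ij / n_j - n_i / n)
    return sum(diffs) / len(diffs) if diffs else float("nan")


# Hypothetical counts: all-or-none responding gives strong positive dependence.
print(average_conditional_difference({(0, 0, 0, 0): 50, (1, 1, 1, 1): 50}))
```

Positive values indicate positive dependence, negative values indicate negative dependence, and an independent model predicts zero apart from sampling error.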

Derivations for the Fixed Path Poisson Model are found in Townsend & Ashby (1983). The Independent Sampling Model is straightforward: the defining characteristic of this model is that items in the display are acquired independently. Thus the Independent Sampling Model predicts Ave[P(Ci|Cj)-P(Ci)] to be zero for all values of i and j, and non-zero values of this statistic tend to disconfirm the model. Model predictions and dependencies from the three participants in Experiment 1 of Loftus, Busey & Senders (1993) are plotted in Figure 2. Each observer contributes one point to this graph for every stimulus duration that produces above-chance performance. Contrary to the predictions of the Independent Sampling Model, the data show clear positive dependencies for durations that produced above-chance performance. A sign test on the 16 points from the three observers reveals that 12 are above zero, which is significant at $\alpha$ = 0.05. These dependencies clearly reject the Independent Sampling Model. This is in contrast to the mild negative dependencies reported by Townsend (1981) for a similar task. These dependencies were predicted by the Bounded Performance Model to be due to guessing influences in the presence of the experimental sampling of stimuli without replacement (see Footnote 2). Individual values and predictions from the Fixed Path Poisson Model are found in Table 3.
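The sign test reported above can be checked with a short binomial calculation. The text does not state whether the test was one- or two-tailed; the sketch assumes a one-tailed test of the hypothesis that positive and negative differences are equally likely. The function name is hypothetical.

```python
from math import comb


def sign_test_upper_tail(n_positive, n_total):
    """P(at least n_positive of n_total signs are positive) when p = 0.5."""
    favorable = sum(comb(n_total, k) for k in range(n_positive, n_total + 1))
    return favorable / 2 ** n_total


p_value = sign_test_upper_tail(12, 16)
print(round(p_value, 4))
```

Under this one-tailed assumption, 12 positive points out of 16 give a probability of about 0.038, which is below 0.05.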

Insert Figure 2 About Here

Insert Table 3 About Here

A second inter-item dependency statistic is P(Ci|k), the probability of getting item Ci correct given that exactly k other items were correct on that trial, where k ranges from 0 to 3 and i ranges from 1 to 4. These probabilities are averaged over the four serial positions (abbreviated as P(C|k)), and plotted against k. Separate curves are computed for each stimulus duration. These data are shown in Figure 3. The light curves represent the obtained performance, the dark curves are predictions from the Independent Sampling and Fixed-Path Poisson models. These theoretical predictions were generated using the best-fitting parameters from Figure 1 for each model and observer. Because the parameters from the models were fit to the marginal data, these inter-item dependency data predictions have no free parameters. Data and theoretical predictions are omitted where little data contribute to that point.
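The second statistic can be computed from the same 16 response counts. The sketch below is one possible reading of the description above: for each position, it conditions on exactly k of the other three positions being correct, and then averages over the positions for which that condition occurred at least once. The function name and the example counts are hypothetical.

```python
def p_correct_given_k_others(counts, k):
    """Average over positions of P(item i correct | exactly k other items correct).

    counts maps each 4-tuple of 0/1 outcomes (1 = correct) to a trial count.
    Positions for which the conditioning event never occurred are skipped.
    """
    values = []
    for i in range(4):
        hits = 0
        total = 0
        for pattern, x in counts.items():
            others_correct = sum(pattern) - pattern[i]
            if others_correct == k:
                total += x
                hits += x * pattern[i]
        if total > 0:
            values.append(hits / total)
    return sum(values) / len(values) if values else float("nan")


# Hypothetical counts with positive dependence: accuracy rises with k.
example = {(0, 0, 0, 0): 40, (1, 0, 0, 0): 10, (1, 1, 1, 0): 10, (1, 1, 1, 1): 40}
for k in range(4):
    print(k, p_correct_given_k_others(example, k))
```

A curve that rises with k indicates positive dependence, while an independent model predicts a flat curve at the marginal accuracy.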

Insert Figure 3 About Here

The Fixed Path Poisson Model provides a much better fit to the data than the Independent Sampling Model. The Figure 3 data (solid curves) indicate the presence of some form of positive inter-item dependencies, which is again contrary to the little or no inter-item dependencies reported by Townsend (1981) beyond those accounted for by guessing. These dependencies clearly reject the Independent Sampling Model. The Fixed Path Poisson Model predictions come quite close to the overall pattern of data, for all three participants.

Four Way Joint Probabilities

The two conditional probability statistics described above demonstrate that the data contain positive dependencies that are consistent with the Fixed-Path Poisson Model. These statistics reveal the direction of the dependencies, and by inspection the model's predictions fit the empirical data well. The strongest test of the Independence and Poisson models comes from each model's ability to predict the four-way joint random variables [Xi] = (X1, X2, X3, X4) = (0000, 1000, 0100, ..., 1111), where Xi is the random variable at location i and a '1' signifies correct and '0' signifies incorrect. Four-way joint probabilities can be computed from the response frequencies by dividing by the total number of trials at a particular stimulus duration. These four-way joint probabilities exhaust all dependency (and marginal) information in the data, and thus form an excellent statistic for model testing.

Table 4 contains the counts of the joint events for the three observers for the 8 stimulus durations and 16 response types.

Insert Table 4 About Here

Four-way joint probability predictions for the Fixed Path Poisson Model are straightforward to generate for all 16 response patterns. Under the fixed-path assumption, an item can only be obtained to the right of an error through the guessing process. For instance,

$P(X_1 = 1, X_2 = 0, X_3 = 1, X_4 = 0) = P(0, t, t_o)\, g^{2} (1-g)^{2} + P(1, t, t_o)\, g\, (1-g)^{2}$   Eq 5

where P(j, t, to) is the probability of getting j items by the Poisson process at time t from Eq. 3, g is the guessing rate, and to is the pre-processing delay parameter. The entire expression in Eq 5 computes the probability of getting none of the digits by the Poisson process but guessing two correctly (the first and the third), or getting the first digit by the Poisson process and correctly guessing the third digit. These two events are mutually exclusive and therefore sum to produce the overall joint probability.

A handy way to produce predictions in general for the Fixed Path Poisson Model is to designate the position just before an error, if any occurs, as r. Then letting s be the number of correct responses after r (s may equal 0) we can write any such sequence as

$P_{FPP}(X_1, X_2, X_3, X_4) = \sum_{j=0}^{r} P(j, t, t_o)\, g^{\,r-j+s} (1-g)^{4-r-s}$, r+s<4   Eq 6

where we subscript this probability by FPP to designate this as the joint probability derived from the Fixed Path Poisson Model. Note that if r=4, s=0 and we define

$P(4, t, t_o) = 1 - \sum_{j=0}^{3} P(j, t, t_o)$

in keeping with the truncated Poisson distribution that results from the fact that the observer cannot get more than four digits correct.
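The fixed-path computation can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it reads the verbal description as saying that the Poisson process supplies the first j items (j no larger than r) and that every remaining item is an independent guess with rate g. The function name `fpp_joint` and the numerical values are hypothetical, and the Poisson probabilities are passed in as a list rather than computed from Eq 3.

```python
from itertools import product

def fpp_joint(pattern, poisson_probs, g):
    """Joint probability of a correct/incorrect pattern under the fixed
    left-to-right processing order.

    pattern: tuple of 0/1 per location (1 = correct).
    poisson_probs: P(j items acquired) for j = 0..len(pattern); the last
        entry holds all remaining mass (truncated distribution).
    g: guessing rate.
    """
    n = len(pattern)
    # r = number of correct items before the first error (n if no error).
    r = next((i for i, x in enumerate(pattern) if x == 0), n)
    s = sum(pattern) - r          # correct items to the right of the first error
    errors = n - r - s
    # The process can have acquired at most r items; the rest are guesses.
    return sum(poisson_probs[j] * g ** (r - j + s) * (1 - g) ** errors
               for j in range(r + 1))

# Hypothetical values, for illustration only.
probs = [0.20, 0.30, 0.25, 0.15, 0.10]   # P(0), ..., P(3), and truncated P(4)
g = 0.1
total = sum(fpp_joint(p, probs, g) for p in product((0, 1), repeat=4))
```

Summing over all 16 response patterns returns 1 (up to rounding), which is a useful check that the truncated distribution and the guessing terms have been combined consistently.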

Although we could use the four-way joint probabilities to test the Fixed-Path Poisson Model, there are several aspects of the data that cast doubt on the strict left-to-right processing order. First, the serial position curves for the 3 observers show evidence for reversals. For example, Observer SS performs better on stimulus location 2 than location 1. In addition, as noted, the Fixed Path Poisson Model assumes that the probability of a correct digit to the right of an error is equal to the guessing rate:

$P(C_j \mid \bar{C}_i) = g$   Eq 7

where j>i and $\bar{C}_i$ indicates that the digit in location i was incorrect. Analysis of the data demonstrates that this assumption is violated for all three observers. These findings suggest that a revision of the Fixed Path Poisson Model is in order. We now present a new model that can account for these violations as well as the four-way joint probabilities, and then test the four-way joint probabilities using standard log-likelihood procedures.

Weighted Path Poisson Model

The Fixed Path Poisson Model makes the unrealistic assumption that observers invariably begin working on the left-most position and work to the right. This path represents only one of 4!=24 possible paths that the observer might take, and evaluating each of these paths, or combinations of paths, would require more parameters than data points. An alternative is a model that computes the probability of each of these 24 paths from just 4 parameters that express the likelihood that a given stimulus location appears in different processing positions. We take a cue from Bundesen (e.g., 1993), in which path selection is related to Luce's Choice Model (e.g., 1959), although in Bundesen's model processing is assumed to be parallel and independent. Even the parallel model that is equivalent to our serial interpretation differs from Bundesen's parallel model, in that reallocation of capacity produces a positive dependence in our model (see Appendices A and B). This new model, termed the Weighted Path Poisson Model, computes the marginal probability of correctly identifying the digit in stimulus location i as

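$P(C_i) = \sum_{n=1}^{4} u_{i,n} \left[ \sum_{j=n}^{4} P(j, t, t_o) + g \sum_{j=0}^{n-1} P(j, t, t_o) \right]$   Eq 8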

where P(j, t, to) comes from Eq 4 and ui,n represents the probability that a given stimulus location i appears in processing position n of the Poisson process. The rest of the term represents the probability that the Poisson process reaches position n or, if it does not, that the digit is obtained through the guessing process with probability g. The value of ui,n is determined by four weight values wi, i = 1, 2, 3, 4, which are positive and constrained to sum to one. The probability that stimulus location i is processed in position n of the Poisson process is,

Eq 9

An example illustrates this computation. Suppose that weights wi...wl are .45, .30, .20 and .05. Given these weights, stimulus location 2 is processed first (n=1) 30% of the time. This leaves three stimulus locations to be processed. The probability that stimulus location 1 is processed second (n=2) given location 2 was processed first is .45/(.45+.20+.05). The probability that location 3 is then processed third (n=3) is .20/(.20+.05). This leaves only location 4 to be processed last, which it is with probability .05/.05, or 1.0. Of course, for short stimulus durations the Poisson process may not get all the way to the last position, and if not, the observer guesses the remaining digits.

Often it is necessary to compute the probability of any one of the 24 possible paths. Define [Ji], i=1..4 as one of the possible paths. The Ji indexes the stimulus location corresponding to processing position i. The probability of path [Ji] can be directly computed from the weight values:

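$P([J_i]) = \frac{w_a}{w_a + w_b + w_c + w_d} \cdot \frac{w_b}{w_b + w_c + w_d} \cdot \frac{w_c}{w_c + w_d} \cdot \frac{w_d}{w_d}$   Eq 10

where wa..wd are the weight values wi..wl from Eq 9 above, taken in processing order (wa is the weight of the location processed first, wd that of the location processed last). In the example above, [Ji] = (2, 1, 3, 4) and P([Ji]) = .30 × (.45/.70) × (.20/.25) × 1.0 ≈ .154.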
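The path computation in the worked example can be sketched in Python. This is a minimal illustration assuming the sampling-without-replacement reading of the example (each step picks a location in proportion to its weight among the locations not yet processed); the function names `path_probability` and `position_probabilities` are hypothetical.

```python
from itertools import permutations

def path_probability(path, weights):
    """Probability of one processing order.

    path: tuple of 1-based stimulus locations in processing order.
    weights: one positive weight per location.
    Each step picks a location with probability proportional to its weight
    among the locations not yet processed.
    """
    remaining = list(path)
    prob = 1.0
    for loc in path:
        prob *= weights[loc - 1] / sum(weights[k - 1] for k in remaining)
        remaining.remove(loc)
    return prob

def position_probabilities(weights):
    """u[i][n]: probability that location i+1 is processed in position n+1,
    obtained by summing the probabilities of all paths that place it there."""
    n = len(weights)
    u = [[0.0] * n for _ in range(n)]
    for path in permutations(range(1, n + 1)):
        p = path_probability(path, weights)
        for position, loc in enumerate(path):
            u[loc - 1][position] += p
    return u

weights = [0.45, 0.30, 0.20, 0.05]          # the weights from the example above
p_2134 = path_probability((2, 1, 3, 4), weights)
u = position_probabilities(weights)
```

With these weights `p_2134` is about .154, and `u[1][0]` recovers the 30% chance that location 2 is processed first. Enumerating all 24 paths is cheap for four locations, which is why no closed form is used here.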

Log-Likelihood Tests of the Independence and Weighted Path Models

The four-way joint probabilities exhaust all inter-item dependency information and, assuming inter-trial independence, form a set of sufficient (complete) statistics for the data set and for testing dependencies in particular. Predictions for the four way joint probabilities can be obtained for the Weighted Path Poisson Model by computing the four way joint probabilities for the Fixed Path Poisson Model and then using these to find the correct probability for any processing path permutation. As with the Fixed Path Poisson Model, define Xi = {0,1}, i=1..4, which represents whether the digit was incorrect or correct on stimulus location i. Define Yi = {0,1}, i=1..4, which represents whether the digit was incorrect or correct on processing position i in the Poisson process. As with Eq 10, the Ji indexes the stimulus location corresponding to processing position i. The four way joint probabilities for the Weighted Path Poisson Model are then defined as

$P_{WPP}(X_1, X_2, X_3, X_4) = \sum_{[J_i] \in \Pi} P([J_i])\, P_{FPP}(Y_1 = X_{J_1}, Y_2 = X_{J_2}, Y_3 = X_{J_3}, Y_4 = X_{J_4})$   Eq 11

where $\Pi$ represents the set of all 24 permutations, $P([J_i])$ computes the probability of path [Ji] from Eq 10 and PFPP gives the appropriate four-way joint probability from the Fixed-Path Poisson Model from Eq 6. The Ji subscripts provide a redirection of the Poisson process so that instead of moving strictly from left to right, the process moves through any one of the 24 possible orders, while still preserving its serial nature of processing one stimulus location at a time. For example, to compute $P_{WPP}(0, 1, 1, 0)$, first consider the case where the third stimulus location is processed first, followed by the second, first and fourth locations. Define Xi = (0, 1, 1, 0) and J = (3, 2, 1, 4) as noted, and then find the probability $P_{FPP}(1, 1, 0, 0)$ from the Fixed Path Poisson Model. Finally, weight this by the probability of taking processing path 3, 2, 1, 4 via Eq 10 and the 4 weight values. This process is repeated for all 24 possible processing paths and the results are summed to compute the four-way joint probability $P_{WPP}(0, 1, 1, 0)$.
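The weighting over paths can be sketched as below. The sketch is self-contained and rests on two assumptions that are stated in the code: path probabilities follow the sampling-without-replacement rule of the worked example, and the fixed-path probability treats every item the Poisson process does not reach as an independent guess. All function names and numerical values are hypothetical.

```python
from itertools import permutations, product

def path_probability(path, weights):
    """Probability of a processing order (1-based locations)."""
    remaining = list(path)
    prob = 1.0
    for loc in path:
        prob *= weights[loc - 1] / sum(weights[k - 1] for k in remaining)
        remaining.remove(loc)
    return prob

def fpp_joint(pattern, poisson_probs, g):
    """Fixed-path joint probability of a pattern given in processing order."""
    n = len(pattern)
    r = next((i for i, x in enumerate(pattern) if x == 0), n)
    s = sum(pattern) - r
    return sum(poisson_probs[j] * g ** (r - j + s) * (1 - g) ** (n - r - s)
               for j in range(r + 1))

def wpp_joint(x, weights, poisson_probs, g):
    """Sum, over all processing orders, of path probability times the
    fixed-path probability of the re-ordered pattern."""
    total = 0.0
    for path in permutations(range(1, len(x) + 1)):
        # Re-express the pattern in processing order: Y_n = X_{J_n}.
        y = tuple(x[loc - 1] for loc in path)
        total += path_probability(path, weights) * fpp_joint(y, poisson_probs, g)
    return total

# Hypothetical parameter values, for illustration only.
weights = [0.45, 0.30, 0.20, 0.05]
probs = [0.20, 0.30, 0.25, 0.15, 0.10]   # P(0)..P(3) and truncated P(4)
g = 0.1
joint = {x: wpp_joint(x, weights, probs, g) for x in product((0, 1), repeat=4)}
```

For X = (0, 1, 1, 0) and path (3, 2, 1, 4) the re-ordered pattern is (1, 1, 0, 0), matching the example in the text. Because the more heavily weighted locations tend to be processed earlier, the marginal accuracy implied by `joint` is higher for location 1 than for location 4 with these weights.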

The Weighted Path Poisson Model has 5 free parameters (to, l and wi, wj, wk). The parameter wl is not included in the estimation since the weights sum to 1.0.

A fair comparison between the Weighted Path Poisson Model and the Independent Sampling Model allows different processing rates on the four stimulus positions for the Independent Sampling Model. As with the Fixed-Path Poisson Model, define Xi = {0,1}, i=1..4, according to whether the digit in stimulus location i is incorrect or correct. For the Independent Sampling Model, the joint probability for a given Xi is defined to be

$P_{ISM}(X_1, X_2, X_3, X_4) = \prod_{j=1}^{4} P(C_j)^{X_j} \left[1 - P(C_j)\right]^{1 - X_j}$   Eq 12

where P(Cj) is the probability of obtaining a digit in location j from Eq 1, using either the random sampling process with sampling rate 1/cj or by guessing. For example,

Eq 13

where P(Cj) is the probability of obtaining a digit in location j from the independent sampling process with rate parameter 1/cj from Eq 1 or by guessing. Under the independence assumption, the four-way joint probability is the product of these four probabilities. Using Eq 13 we computed the four-way joint probabilities for the independence model assuming 5 free parameters: to and 4 processing rates 1/c1, 1/c2, 1/c3, 1/c4.
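The independence prediction is a simple product, sketched below. The per-location probabilities are hypothetical values supplied directly rather than computed from Eq 1, and the function name `ism_joint` is an assumption of the example.

```python
from itertools import product

def ism_joint(x, p_correct):
    """Joint probability of pattern x as a product over locations:
    P(Cj) where the digit is correct, 1 - P(Cj) where it is not."""
    prob = 1.0
    for xj, pj in zip(x, p_correct):
        prob *= pj if xj else (1.0 - pj)
    return prob

p_correct = [0.8, 0.6, 0.5, 0.3]      # hypothetical values of P(C1)..P(C4)
joint = {x: ism_joint(x, p_correct) for x in product((0, 1), repeat=4)}

# Under independence the joint probability of two items being correct is
# the product of their marginals, so P(C1|C2) - P(C1) is zero.
p_c1_and_c2 = sum(p for x, p in joint.items() if x[0] == 1 and x[1] == 1)
```

Any positive value of P(Ci|Cj) - P(Ci) in the data is therefore evidence against this class of model, whatever the four rates are.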

Having derived predictions for the four-way joint probabilities for the two models, we can now test these by minimizing the log-likelihood statistic via parameter estimation techniques. The log-likelihood statistic is computed as

$-\sum_{dur} \sum_{X_1, \ldots, X_4} N(dur, X_1, X_2, X_3, X_4)\, \ln\left[\frac{\mathrm{TheoryJnt}(dur, X_1, X_2, X_3, X_4)}{\mathrm{DataJnt}(dur, X_1, X_2, X_3, X_4)}\right]$   Eq 14

where N(dur, X1, X2, X3, X4) is the number of occurrences in the data for a particular four-way joint event, stimulus duration and observer, as given in Table 4. DataJnt(dur, X1, X2, X3, X4) is the relative frequency of this joint event occurring in the data for a particular stimulus duration dur and observer, which is computed by dividing each joint count by the total number of trials at that stimulus duration. TheoryJnt(dur, X1, X2, X3, X4) is the predicted four-way joint probability for a given stimulus duration and particular configuration of correct and incorrect positions X1...X4 for the Weighted Path Poisson or Independence models. When multiplied by 2.0, this statistic is distributed as chi-square. The number of degrees of freedom equals (16-1) times tdurs, minus the number of free parameters (5 for both models). The value of tdurs is the number of stimulus durations that produce above-chance and below-ceiling performance, since the inter-item dependencies do not discriminate between models at these extremes.
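A sketch of the statistic and its degrees of freedom follows. It assumes the statistic is the sum over durations and joint events of N times the log of the ratio of observed to predicted joint probability, which is the form that yields a chi-square variate when doubled; cells with a zero count are taken to contribute nothing. The function names and the toy two-cell data are hypothetical.

```python
import math

def log_likelihood_statistic(counts, theory):
    """Sum over durations and joint events of N * ln(DataJnt / TheoryJnt).

    counts: list (one entry per stimulus duration) of dicts mapping a
        joint event to its observed count N.
    theory: list of dicts with the predicted joint probabilities.
    Cells with N = 0 contribute nothing to the sum.
    """
    total = 0.0
    for observed, predicted in zip(counts, theory):
        n_trials = sum(observed.values())
        for event, n in observed.items():
            if n > 0:
                data_jnt = n / n_trials
                total += n * math.log(data_jnt / predicted[event])
    return total

def degrees_of_freedom(n_durations, n_params, n_events=16):
    """(number of joint events - 1) per usable duration, minus free parameters."""
    return (n_events - 1) * n_durations - n_params

# Toy example with two events at a single duration.
stat = log_likelihood_statistic([{"A": 30, "B": 10}], [{"A": 0.5, "B": 0.5}])
chi_square_value = 2.0 * stat
```

The doubled statistic is then compared with the critical chi-square value at the given degrees of freedom; for instance, six usable durations and five free parameters give `degrees_of_freedom(6, 5)` = 85.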

Table 5 contains the maximum likelihood values obtained for the two models for the three observers. As anticipated from the original tests of independence, the Independent Sampling Model is rejected for all three observers. The Weighted Path Poisson Model does better than the Independent Sampling Model for all 3 observers, and substantially so for Observers SS and EF. In addition, the model is not rejected for Observer EF. The log-likelihood statistic is fairly close to the critical χ² value for the other two observers.

Insert Table 5 About Here

Predictions for the dependency statistics Ave[P(Ci|Cj)-P(Ci)] and P(C|k) were generated by simulating the Weighted Path Poisson Model using the estimated column weights wi..wl, the processing rate l and the pre-processing delay to. Each condition was simulated for 15,000 trials using the same digit set as the original experiment. This simulation involves picking a path for the Poisson model to take on a particular trial, and then applying the Fixed-Path Poisson Model to this particular path according to Eqs 2 and 3. The probability of any given path [Ji] can be directly computed from Eq 10. Once a path is chosen, correct and incorrect digits are assigned according to Eqs 2 and 3 and the weight, rate and delay parameters for each observer (see Table 1). As a test of the simulation procedures, the analysis produced a data set that contained four-way joint probabilities which were quite close to those derived analytically from the Weighted Path Poisson Model using the appropriate parameters.
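A trial-level simulation of this kind can be sketched as follows. It is a simplified illustration, not the procedure actually used: guessing is modelled as a Bernoulli event with rate g rather than by drawing digits from the experimental digit set, and the distribution of the number of items acquired is passed in as a list instead of being computed from Eqs 2 and 3. Function names and parameter values are hypothetical.

```python
import random

def simulate_wpp_trial(weights, poisson_probs, g, rng):
    """Simulate one trial; returns a tuple of 0/1 per stimulus location."""
    n = len(weights)
    # Pick a processing path: sample locations without replacement,
    # each time in proportion to the weights of the locations still left.
    remaining = list(range(n))
    path = []
    while remaining:
        loc = rng.choices(remaining, weights=[weights[k] for k in remaining])[0]
        path.append(loc)
        remaining.remove(loc)
    # Number of items the Poisson process acquires on this trial.
    acquired = rng.choices(range(n + 1), weights=poisson_probs)[0]
    result = [0] * n
    for position, loc in enumerate(path):
        if position < acquired:
            result[loc] = 1                       # obtained by the process
        else:
            result[loc] = int(rng.random() < g)   # obtained only by guessing
    return tuple(result)

def simulate_wpp(n_trials, weights, poisson_probs, g, seed=0):
    """Simulate n_trials independent trials with a reproducible seed."""
    rng = random.Random(seed)
    return [simulate_wpp_trial(weights, poisson_probs, g, rng)
            for _ in range(n_trials)]

# Hypothetical parameters; 15,000 trials per condition as in the text.
trials = simulate_wpp(15000, [0.45, 0.30, 0.20, 0.05],
                      [0.20, 0.30, 0.25, 0.15, 0.10], 0.1)
```

The simulated trials can be tabulated into four-way joint frequencies and compared with the analytic predictions, which is the check on the simulation described above.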

Figure 4 shows the predictions for the Weighted Path Poisson Model for the three participants for the P(C|k) statistic. The RMSEs are all similar to those of the Fixed-Path Poisson Model, as expected, and all are lower than those of the Independent Sampling Model. In addition, Table 3 lists the predictions for the statistic Ave[P(Ci|Cj)-P(Ci)] for the Weighted Path Poisson Model, which are all similar to the predictions made by the Fixed Path Poisson Model.

Insert Figure 4 About Here

Variable Attention and Independent Sampling

The Weighted Path Poisson Model clearly handles our two dependency statistics and the four-way joint probabilities better than the modal independence class of models, and comes much closer to predicting the serial position effects than the Fixed Path Poisson Model. It is obviously not feasible to anticipate all other models that may yield positive dependencies, based, for example, on distributions other than the Poisson. However, it seemed important to rule out, if possible, a model that could predict positive dependencies by artifact, in the sense that its across-trial behavior might produce dependencies.

In this model, attention (or some other facility positively related to performance) is assumed to vary across trials, but remains constant within a trial. Such variable attention models have attracted the notice of researchers for many years (e.g., Atkinson, 1973; Norman, 1964), although there is some evidence against this notion, at least in featural dependencies (e.g., Townsend, Hu and Kadlec, 1988). Van Zandt and Ratcliff (1995) have recently argued for a similar concept within the context of diffusion processing models. The result is, of course, a probability mixture over trials of the predicted function of the varying parameter. It does seem a natural idea that people’s attention might wax and wane over a prolonged testing session. Alternatively, such variations in processing may come from the stimulus itself, or low-level mechanisms such as the position of fixation during a trial. Whatever the mechanism, the result is that the processing of all four stimulus locations can be either higher than average or lower than average on a given trial. This effectively correlates the processing of the individual items, even if processing occurs independently for each item within a given trial.

One way to incorporate the idea of variable attention across trials is to append this mechanism to the Independent Sampling Model. Currently, the ISM has 4 processing rates on the four stimulus locations. If the observer is currently in a high attentional state, we would expect that processing would increase at all four locations, but proportionately more at locations that were proceeding at a faster rate. Thus attention can be thought of as a multiplicative weight that adjusts the individual processing rates on the four stimulus locations to reflect the current state of attention. One could add a variable attention component to any model (and below we add this component to the Weighted Path Poisson Model as well). However, for simplicity, we will refer to the variable attention version of the Independent Sampling Model as the Variable Attention Model. While this model still includes an independent sampling mechanism, it now predicts positive inter-item dependencies.

While we might fit a model that includes just a high and a low attentional state, along with the probability of being in the high state, it seems more likely that the attentional states constitute a continuous random variable that is normally distributed. For computational ease, we chose a 5-state distribution that is approximately normally distributed. The mean of this distribution would reflect the processing rates for the average attentional state. To develop this model, we approximated a normal distribution with a binomial distribution with n=4. This provides an approximately bell-shaped density function with 5 discrete states. To compute overall performance, we simply compute performance for the four-way joint probability when in each of 5 attentional states and perform a weighted average of the joint probabilities.

The mean of the binomial function, m, was a free parameter that typically ended up at a value near 0.5. Thus for each of the 5 attentional states a, a = {0..4}, the probability α(a) of being in that state is given by:

$\alpha(a) = \binom{4}{a} m^{a} (1 - m)^{4 - a}$   Eq 15
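The five state probabilities follow directly from a binomial distribution with n=4. The sketch below assumes that Eq 15 is the ordinary binomial probability with success parameter m; the function name `attention_state_probabilities` is hypothetical.

```python
from math import comb

def attention_state_probabilities(m, n=4):
    """Binomial probabilities of the n + 1 attentional states a = 0..n,
    where m is the binomial success parameter."""
    return [comb(n, a) * m ** a * (1 - m) ** (n - a) for a in range(n + 1)]

# With m = 0.5 the distribution is symmetric and roughly bell-shaped.
states = attention_state_probabilities(0.5)
```

With m = 0.5 the middle state (a = 2) is the most likely and the two extreme states each occur on one trial in sixteen; values of m above 0.5 shift probability toward the higher-numbered states.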

The spread of attention, or what might be thought of as the variance of attention across these 5 attentional states, is given by a free parameter s. The combination of the attentional weight scaling the processing rate determines what we term the pace of processing. For stimulus location i, the pace of information acquisition is,

Eq 16

When a=2, we multiply each rate by 1.0, and thus the processing pace occurs at the average rate assigned to each stimulus location. When a=1, we reduce the attention multiplier by one standard deviation unit s, such that each rate is multiplied by (1.0-s). Once the rate for each location is determined, processing is assumed to occur in parallel and independently, and the four-way joint probabilities are given by Eq 12. For all five attentional states, the overall joint probability for a given Xi is determined by,

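$P(X_1, X_2, X_3, X_4) = \sum_{a=0}^{4} \alpha(a) \prod_{j=1}^{4} P(C_j)^{X_j} \left[1 - P(C_j)\right]^{1 - X_j}$   Eq 17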

where to compute P(Cj), we scale the rate parameter by the appropriate attentional state weighting parameter,

$P(C_j) = \left(1.0 - e^{-(d-L)/(c_j [1-(a-2)s])}\right) + g\, e^{-(d-L)/(c_j [1-(a-2)s])}$   Eq 18

The end result is a distribution of attentional weights that scale the processing rates. For a given trial, the distribution of attention weights is sampled, and the chosen value is used to scale the four processing rates on the four stimulus locations. This produces four processing paces for the four locations. Note this is equivalent to assuming that once an attentional state is chosen for a trial, the associated attention weight is used to scale all four processing rates.
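The mixture over attentional states can be sketched as follows. The sketch makes several assumptions that should be read as such: the state probabilities are binomial with n=4; the time constant c of each location is scaled by (1 - (a - 2)s), following the first term of Eq 18 as printed; and performance is at the guessing rate whenever d does not exceed L. All function names and parameter values are hypothetical.

```python
from itertools import product
from math import comb, exp

def p_correct(d, L, c, g, a, s):
    """P(Cj) in attentional state a; the time constant c is scaled by
    (1 - (a - 2) * s), so a = 2 leaves the rate unchanged. Requires s < 0.5."""
    if d <= L:
        return g                       # assumption: guessing only when d <= L
    scaled_c = c * (1.0 - (a - 2) * s)
    miss = exp(-(d - L) / scaled_c)    # probability that sampling fails
    return (1.0 - miss) + g * miss

def vam_joint(x, d, L, cs, g, m, s):
    """Weighted average, over the five attentional states, of the
    independent-sampling joint probability of pattern x."""
    total = 0.0
    for a in range(5):
        weight = comb(4, a) * m ** a * (1 - m) ** (4 - a)
        prob = 1.0
        for xj, c in zip(x, cs):
            pj = p_correct(d, L, c, g, a, s)
            prob *= pj if xj else (1.0 - pj)
        total += weight * prob
    return total

# Hypothetical parameter values, for illustration only.
params = dict(d=80.0, L=20.0, cs=[100.0, 120.0, 150.0, 200.0], g=0.1, m=0.5, s=0.4)
joint = {x: vam_joint(x, **params) for x in product((0, 1), repeat=4)}
p1 = sum(p for x, p in joint.items() if x[0] == 1)
p2 = sum(p for x, p in joint.items() if x[1] == 1)
p12 = sum(p for x, p in joint.items() if x[0] == 1 and x[1] == 1)
```

Although items are independent within every state, `p12` exceeds `p1 * p2` in the mixture, because all four locations rise and fall together across states. This is the sense in which across-trial variability alone can produce a positive inter-item dependence.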

This model was fit to the four-way joint probability data, and produces surprisingly (at least to us) good results, as shown in Table 6. Even allowing for the addition of two free parameters, the Variable Attention Model accounts for the dependency data better than the Weighted-Path Poisson Model, and in all three cases the log-likelihood value is below the critical χ² value. Nevertheless, it should be observed that these good fits require extremely large shifts in attention. All three observers have s values greater than .45; .5 is the maximum value that s could take, since values greater than .5 imply negative processing pace.

When evaluating the fit of any model, it is important to ask whether the parameter values that enable good fits are reasonable. For example, consider Observer EF's performance at the 80 ms presentation. The model assumes that she is in one of 5 attentional states with the following probabilities (from the lowest to highest attention states): {0.035, 0.18, 0.36, 0.32, 0.10}. The probability that she obtains a correct digit (averaged across all 4 locations) when in these 5 states is {0.24, 0.28, 0.35, 0.52, 1.0}. This is equivalent to changing the stimulus duration from 70 ms to 200 ms! Such wild shifts in attention across trials seem improbable to us in the conventional sense of attention waxing and waning within a block of trials. This aspect will be discussed further below.

Fluctuating attention would seem likely to vary slowly over trials. In order to evaluate this possibility, we performed an autocorrelation on the trial-by-trial data by first subtracting off the mean performance level for each condition on each trial, and then autocorrelating the data at lags up to half of the experiment. None of the three observers showed any structure at all in the autocorrelation function. This was much the same result as in Townsend et al. (1988).
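The autocorrelation analysis can be sketched as follows. This is an illustration of the general procedure (subtract the condition mean from each trial's score, then correlate the residual series with lagged copies of itself), not the code used for the analysis; the function names and the use of the standard sample autocorrelation estimator are assumptions.

```python
def residuals_by_condition(scores, conditions):
    """Subtract from each trial's score the mean score of its condition."""
    sums, counts = {}, {}
    for score, cond in zip(scores, conditions):
        sums[cond] = sums.get(cond, 0.0) + score
        counts[cond] = counts.get(cond, 0) + 1
    return [score - sums[cond] / counts[cond]
            for score, cond in zip(scores, conditions)]

def autocorrelation(x, max_lag):
    """Sample autocorrelation of x at lags 1..max_lag."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(v * v for v in dev)
    return [sum(dev[t] * dev[t + lag] for t in range(n - lag)) / denom
            for lag in range(1, max_lag + 1)]

# Toy series: performance that alternates from trial to trial shows strong
# structure, unlike the flat functions reported in the text.
alternating = [1, -1] * 4
acf = autocorrelation(alternating, 2)
```

In use, the scores would be the number of digits correct on each trial in presentation order, `residuals_by_condition` would remove the effect of stimulus duration, and the lags would run up to half the length of the series. Slowly drifting attention would appear as positive autocorrelation at short lags.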

We attempted two variants of the Variable Attention Model that might perform with more reasonable parameters. First, we fixed s to be 0.12, which was chosen only because it represented a number that is in line with the smaller shifts of attention that phenomenologically occur during a block of trials. This results in fits that are markedly worse than the full model, and in all 3 cases worse than the Weighted Path Poisson Model. We also attempted a two-state attention model in which the observer was assumed to be either in a low or a high attentional state with some probability. This also produced worse fits than the Variable Attention Model described above, and also required wide shifts in attention across trials in order to account for the dependencies. However, this model did perform better than the Weighted Path Poisson Model. We also appended a variable attention mechanism to the Weighted Path Poisson Model, but found that it did not markedly improve the fits.

It appears that we cannot reject the Variable Attention Model, although the means by which it derives its good predictions are somewhat suspect under the usual interpretation of attention varying across trials. However, as intimated earlier, there might be other influences that could change across trials but remain fixed within a trial. For example, although all three subjects were well-practiced, the position of the eyes during a trial may have influenced whether performance was good or poor overall on this trial. Other sources of variability may come from the stimulus itself, such that certain number combinations might be easier to read than others. Also, as noted earlier, the perceptual integrity of the incoming information might vary, perhaps as the result of Gaussian noise. If so, it might well be independent from trial to trial (and thus produce zero or irregular autocorrelation functions) and perhaps even be normal in distribution. Perhaps some combination of these various factors could be responsible for high and independent variability across trials.

Although we have contrasted the serial Weighted Path Poisson Model with the parallel Independent Sampling Model and its variable attention variant, we wish to reemphasize that both models can be mimicked by models of other architectures. As noted by Hughes & Townsend (1998), accuracy alone may give little information about process architecture. The focus of the present discussion is on the ability of the various models to account for inter-item dependencies, which is somewhat independent of the architecture of the model. A limited discussion of the ability of the present serial model and parallel architectures to mimic one another can be found in Appendix B.

Discussion and Conclusions

We conclude, based on log-likelihood analyses of the four-way joint probabilities as well as analyses of the two conditional-probability statistics Ave[P(Ci|Cj)-P(Ci)] and P(C|k), that the Independent Sampling Model, or in fact any of its modal model cousins, cannot account for several aspects of performance in the digit-recall task. In particular, a non-parametric independence statistic verifies the existence of positive inter-item dependencies that are contrary to the Independent Sampling Model. The Fixed-Path Poisson Model could account for the two inter-item dependencies but not the serial position statistics and the probability of obtaining a digit to the right of an error.

To account for these last two statistics we developed the Weighted Path Poisson Model. This model computes the probability of taking one of 24 possible paths through the 4 digit locations based on three weights that determine the likelihood that a given digit location appears in different processing positions. We tested the Weighted Path Poisson Model against the Independent Sampling Model using log-likelihood statistics and found that the Weighted Path Poisson Model performed better for all three observers. The Independent Sampling Model was rejected for all three observers, while the Weighted Path Poisson Model produced a log-likelihood value that was below the critical χ² value for one observer and near the critical value for the other two observers.

Finally, we developed a version of the Independent Sampling Model that assumes that inter-item dependencies result from attention varying across trials, termed the Variable Attention Model. This model assumes that processing is independent across channels within a trial, but that variation in attention across trials produces an overall positive dependence. This model also allows serial position effects. This model did a better job than the Weighted Path Poisson Model, and actually produced log-likelihood values that were below the critical values. However, it derives its good fits by permitting extremely large shifts in attention that seem too massive to be produced by waxing and waning attention. In addition, no evidence for systematic attentional shifts across trials was found in the autocorrelation plots. As a result, this model seems difficult to accept without assuming that other mechanisms are responsible for across-trial variability, mechanisms that do not produce correlated shifts in attention. A distribution in signal quality, for instance Gaussian, perhaps in conjunction with attentional oscillation, might be able to accommodate such large swings in performance.

One possible explanation for the rejection of the Weighted Path Poisson Model for Observers TB and SS may be that the positive dependencies were not quite as strong as those predicted by the Weighted Path Poisson Model. Inspection of Table 2 and Figure 4 reveals that although the dependencies are clearly positive, they are not quite as large as the model's predictions. One way to account for these effects would be to include a limited-capacity channel, perhaps comparable to the fixed sample size buffer notion of Townsend (1981) or that proposed by Bundesen (1990; see also Shibuya & Bundesen, 1988), at the back end of the Poisson process. Limiting the processing capacity produces negative dependencies, which might offset the strong positive dependencies produced by the Weighted Path Poisson Model.

The observed dependencies are positive, contrary to those reported by Townsend (1981), the only other full report study to analyze dependencies so far. Townsend's data showed slight negative dependencies, attributable to guessing, that are consistent with an independent sampling model. At this point, it is not known what led to the differences in findings between our present analysis and the Townsend (1981) study. Although the stimuli in both experiments were post-masked, the Loftus et al. (1993) stimuli were very low contrast (around 5%), while the Townsend (1981) stimuli were black typewritten letters on white cards presented at high luminances. Also, pilot data from that study indicated that presentation of one or two stimuli in any of the display locations led to virtually perfect performance. This may have placed the Loftus et al. (1993) stimuli in a data-limited domain, while the bright stimuli of the Townsend (1981) study may have been in a process-limited domain due to time constraints. In addition, the Townsend stimuli were much more widely separated and the letters subtended a smaller visual angle than the Loftus et al. (1993) stimuli, possibly encouraging perceptual independence.

With regard to the data-limited hypothesis, the present dependency results are more compatible with those of Townsend and colleagues (e.g., Townsend, Hu & Evans, 1984), who found uniformly strong positive dependencies, albeit in a full report of features in a pattern recognition experiment rather than a full report of letters.

Although the Poisson class of models, whether given a serial or parallel interpretation, clearly accommodated the Loftus et al. data better than the modal independence model, the Poisson models may have grave difficulty in experiments with larger display sizes. The reason is basically the same one that disconfirmed Rumelhart's (1970) Multicomponent Model: larger display sizes will impose asymptotic accuracy bounds less than 1.0. The Poisson models predict that accuracy in even the less favored locations will eventually go to P(Ci) = 1. If other experimental conditions with high workload continue to produce dependencies, it may be necessary to combine features of Townsend's (1981) Bounded Performance Model with some form of positive dependency structure such as the Poisson process.

The positive dependencies observed in the data contradict fixed-size buffer models, such as that proposed by Shibuya & Bundesen (1988). These models predict a negative correlation between items, which produces negative dependencies.

Finally, and perhaps most significantly, the present analyses demonstrate that inter-item dependencies in the present data disconfirmed the assumption of independence between items processed from a multi-item display and thereby brought into question a fundamental assumption of what continues to be a modal information processing conception of whole report behavior. More experimental research will be required to tease out the conditions under which dependencies do or do not manifest themselves and what implications lie ahead for finer grained models of whole report processing. The present data and tests suggest that inter-item dependencies can provide strong model discrimination and should lead to a better articulated model of whole report processing.

 

References

Atkinson, R.C. (1977). A variable sensitivity theory of signal detection. Psychological Review, 93, 154-179.

Bundesen, C. (1990). A theory of visual attention. Psychological Review, 97, 523-547.

Bundesen, C. (1993). The relationship between independent race models and Luce's choice axiom. Journal of Mathematical Psychology, 37, 446-471.

Bundesen, C. (1987). Visual attention: Race models for selection from multielement displays. Psychological Research, 49, 113-121.

Bundesen, C. & Pedersen, L. F. (1984). Measuring efficiency of selection from briefly exposed visual displays: A model for partial report. Journal of Experimental Psychology: Human Perception and Performance, 10, 329-339.

Busey, T.A. and Loftus, G.R. (1994). Sensory and cognitive components of visual information processing. Psychological Review, 101, 446-469.

Gegenfurtner, K. & Sperling, G. (1993). Information transfer in iconic memory experiments. Journal of Experimental Psychology: Human Perception and Performance, 19, 845-866.

Hughes, H.C. & Townsend, J.T. (1998). Varieties of binocular interaction in human vision. Psychological Science, 9, 53-60.

Kullback, S. (1968). Information theory and statistics. New York: Dover.

Loftus, G. R., and Busey, T.A. (1992). Multidimensional models and iconic decay: Reply to Di Lollo and Dixon. Journal of Experimental Psychology: Human Perception and Performance, 18, 556-561.

Loftus, G. R., Busey, T.A. & Senders, J. (1993). Providing a sensory basis for models of visual information acquisition. Perception & Psychophysics, 54, 535-554.

Loftus, G. R., Duncan, J. & Gehrig, P. (1992). On the time course of perceptual information that results from a brief visual presentation. Journal of Experimental Psychology: Human Perception and Performance, 18, 530-549.

Loftus, G. R., & Ruthruff, E. (in press). A linear-filter theory of visual information acquisition with special application to intensity-duration tradeoffs. Journal of Experimental Psychology: Human Perception & Performance.

Luce, R.D. (1959). On the possible psychophysical laws. Psychological Review, 66, 81-95.

Massaro, D.W. (1970). Perceptual processes and forgetting in memory tasks. Psychological Review, 77, 557-567.

McGill, W.J. (1963). Stochastic latency mechanisms. In Luce, R. D., Bush, R.R. & Galanter, E. (Eds.) Handbook of Mathematical Psychology. New York: John Wiley & Sons, Inc.

Norman, D.A. (1964). Sensory thresholds, response biases, and the neural quantum theory. Journal of Mathematical Psychology, 1, 88-120.

Rumelhart, D.E. (1970). A multicomponent theory of the perception of briefly exposed visual displays. Journal of Mathematical Psychology, 7, 191-218.

Shibuya, H., & Bundesen, C. (1988). Visual selection from multielement displays: Measuring and modeling effects of exposure duration. Journal of Experimental Psychology: Human Perception and Performance, 14, 591-600.

Sperling, G. (1960). The information available in brief visual presentation. Psychological Monographs, 74, 1-29.

Sperling, G. (1967). Successive approximations to a model for short-term memory. Acta Psychologica, 27, 285-292.

Townsend, J. T. (1974). Issues and models concerning the processing of a finite number of inputs. In B.H. Kantowitz (Ed.) Human Information Processing: Tutorials in Performance and Cognition. Hillsdale, NJ: Erlbaum Associates, Inc. pp. 133-168.

Townsend, J. T. (1981). Some characteristics of visual whole report behavior. Acta Psychologica, 47, 149-173.

Townsend, J. T. & Ashby, F. G. (1983). Stochastic Modeling of Elementary Psychological Processes. Cambridge: Cambridge University Press.

Townsend, J. T., Hu, G. G. & Evans, R. J. (1984). Modeling feature perception in brief displays with evidence for positive interdependencies. Perception & Psychophysics, 36, 35-49.

Townsend, J. T., Hu, G. G. & Kadlec, H. (1988). Feature sensitivity, bias, and interdependencies as a function of energy and payoffs. Perception & Psychophysics, 43, 575-591.

Van Zandt, T. & Ratcliff, R. (1995). Statistical mimicking of reaction time data: Single-process models, parameter variability and mixtures. Psychonomic Bulletin & Review, 2, 20-54.

 

Appendix A- Capacity and Inter-Item Dependencies

The question of capacity is often raised in the context of inter-item dependencies. Capacity as a function of load is clearly an across-condition concept, since load must be varied to measure it. It can nevertheless enter indirectly, e.g., in the fixed sample size model, which predicts negative dependency because of the fixed number of slots in a buffer. The Fixed and Weighted Path Poisson Models are mute with regard to capacity in any direct sense, because we do not have to say anything about what happens when load is varied, in either the serial or parallel interpretations. If load were varied, that would force us to make some kind of capacity commitment, which might be different for the serial and parallel incarnations. For example, if one stayed faithful to a serial interpretation, then capacity would be unlimited at the individual element level but obviously limited at, say, the exhaustive processing level. On the other hand, if the parallel distribution were still held to be Poisson as load changed, capacity would be limited at the individual element level, since the overall rate, say V, would be split up among all n elements and then reallocated among the remaining ones as processing went along. As may be apparent now, capacity is only loosely related to the dependency issue. It is not entirely orthogonal, because something like reallocation affects not only dependence but also overall capacity measures on exhaustive processing.

A good way to see that the present serial models predict positive dependency is to view their parallel interpretations, which allow reallocation. Consider the probability that item "b" is finished by time t given that "a" is known to be done by then. Since "a" is done by then, "b" must have spent some time at its higher reallocated rate and thus has a higher (than its marginal) probability of being done too; that is a positive dependency (see Townsend & Ashby, 1983, pp. 68-76 for a good discussion of this). To see how an independent model can be limited (in fact fixed) capacity, see Parallel #3 (p. 85) in Townsend and Ashby (1983). That and other models on capacity (pp. 76-91) show the relations between capacity and dependency. As another example, suppose a model's capacity C is constant over n once allocated across elements, as is in fact the case in the Bounded Performance Model in Townsend (1981). However, if there exists a distribution on allocation at the start of each trial, that would produce a negative dependence among the elements. Our Variable Attention Model, by contrast, sets attention to a high or a low state before processing begins, so that every item on a trial is processed in the same state; this engenders a positive dependence. See Townsend & Ashby (1983), Chapter 4 for these and other results on inter-item dependencies.
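The reallocation argument can be checked numerically. The sketch below is a minimal two-item illustration, not the model fitted in this paper: it assumes that items "a" and "b" race in parallel with exponential rates summing to a constant V, and that the surviving item receives the full rate V once the other finishes. The function name and the rate values are hypothetical.

```python
import math

def reallocation_probs(v_a, v_b, t):
    """Two-item parallel exponential race with full reallocation (illustrative).

    Assumption: the item that finishes second inherits the total rate
    V = v_a + v_b, so both stage durations are exponential with rate V.
    Returns (P(a done by t), P(b done by t), P(both done by t)).
    """
    total = v_a + v_b
    x = total * t
    stage1 = 1.0 - math.exp(-x)              # first completion occurs by t
    stage2 = 1.0 - math.exp(-x) * (1.0 + x)  # both completions occur by t (Erlang-2)
    p_a_first = v_a / total                  # winner identity is independent of stage times
    p_a = p_a_first * stage1 + (1.0 - p_a_first) * stage2
    p_b = (1.0 - p_a_first) * stage1 + p_a_first * stage2
    return p_a, p_b, stage2

# Hypothetical rates (per ms) and exposure time (ms).
p_a, p_b, p_both = reallocation_probs(v_a=0.02, v_b=0.01, t=40.0)
p_b_given_a = p_both / p_a
print(f"P(b) = {p_b:.3f}, P(b | a) = {p_b_given_a:.3f}")
```

Under these assumptions the conditional probability exceeds the marginal for any choice of rates and t, which is the positive dependency described above.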

 

Appendix B- Serial and Parallel Model Mimicking

The Weighted Path Serial Poisson Model is mathematically equivalent to a parallel model with the following characteristics. Assume that processing begins simultaneously at each position, with exponentially distributed completion times at rates V11, V21, V31, V41, where the first subscript denotes physical location and the second denotes stage 1. These are constrained such that

Eq A1

where l represents the serial processing parameter. Set

Eq A2

in the serial Weighted Path Model. Suppose the item in stimulus location j was processed first. Then we assume that all processing (capacity) allotted to j is reallocated, after j's completion, to the remaining items in such a way that their relative magnitudes are the same as in stage 1. This implies that, say,

Eq A3

and that . This model is a special case of the Reallocation Parallel Model which is equivalent to an exponential random path serial model (Townsend, 1974; Townsend & Ashby, 1983, pp. 88-89). It is clearly equivalent to the Weighted Path Serial Model.
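The equivalence can be illustrated by comparing finishing-order probabilities under the two architectures. Because Eq A1 to Eq A3 are not reproduced in this text, the sketch below rests on two assumptions: that the stage-1 parallel rates are V_i1 = w_i * l, and that the serial model picks each next location with probability proportional to its weight among the unprocessed locations. The function names are hypothetical; the weights and rate are the Table 1 values for Observer TB.

```python
from itertools import permutations

def serial_path_prob(path, w):
    """Probability of a serial processing order under sequential weighting.

    Assumption: each next location is chosen with probability proportional
    to its weight among the locations not yet processed.
    """
    remaining = sum(w)
    prob = 1.0
    for loc in path:
        prob *= w[loc] / remaining
        remaining -= w[loc]
    return prob

def parallel_path_prob(path, w, lam):
    """Probability of the same finishing order in a parallel reallocation race.

    Assumption: stage-1 rates are V_i1 = w_i * lam; after each completion the
    freed rate is redistributed so that relative magnitudes are preserved and
    the rates of the remaining items sum to lam.
    """
    rates = {i: w_i * lam for i, w_i in enumerate(w)}
    prob = 1.0
    for loc in path:
        total = sum(rates.values())
        prob *= rates[loc] / total          # exponential race: winner chosen by rate share
        del rates[loc]
        left = sum(rates.values())
        if left > 0:
            rates = {i: r * lam / left for i, r in rates.items()}  # reallocate
    return prob

w_tb = [0.587, 0.308, 0.0758, 0.0290]   # Table 1 weights, Observer TB
lam_tb = 0.0275                          # Table 1 rate l, Observer TB
paths = list(permutations(range(4)))     # the 24 possible processing paths
max_gap = max(abs(serial_path_prob(p, w_tb) - parallel_path_prob(p, w_tb, lam_tb))
              for p in paths)
total_prob = sum(serial_path_prob(p, w_tb) for p in paths)
print(len(paths), max_gap, total_prob)
```

Under these assumptions the two architectures assign the same probability to every one of the 24 paths, and because each stage runs at the same total rate in both, the stage durations match as well.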

The data, especially the inter-item dependency data, are clearly supportive of positive dependence at about the level of our Weighted Path Serial or Parallel Poisson Reallocation Models. Further, the Variable Attention Model was not able to simultaneously encompass both the dependence structure and the variance statistic.

We will skirt detailed descriptions of other mimicking models and simply note that the Variable Attention Model, depicted here as parallel, could equally be presented as serial. All that is necessary is for the serial model that mimics a parallel independent model to also possess two distinct sets of processing rates (e.g., see Townsend & Ashby, 1983, pp. 82-83) corresponding to the switches in attention within the Variable Attention Model.

On the other hand, the Fixed Path Poisson Model cannot be perfectly mimicked by any parallel model because any parallel model attempting to mimic this model has all its rate parameters but one going to zero at any particular stage. Obviously, a parallel model could nevertheless closely approximate that model.

Although certain of these models have more compelling interpretations in one of the architectures, it is their dependency structure that is being most strikingly tested in this study.

 

Author Note

While conducting this research Thomas Busey was supported by a two-year National Institute of Mental Health fellowship. Dr. Townsend was supported in part by a National Science Foundation Grant #9112813. The authors gratefully acknowledge Dasha Kinelovsky for her assistance with the joint probability predictions.

 

 

Tables

Observer   1/c     L      l        to     w1      w2      w3       w4
TB         117.6   84.6   0.0275   91.3   0.587   0.308   0.0758   0.0290
SS         120.1   72.5   0.0261   77.4   0.273   0.479   0.228    0.0203
EF         32.4    62.6   0.0717   61.2   0.281   0.535   0.156    0.0285

Table 1. Model parameters for the Independent Sampling Model (1/c and L), the Fixed Path Poisson Model (l and to) and the Weighted Path Poisson Model (w1...w4). The Weighted Path Poisson Model, described in a subsequent section, uses the l and to parameters estimated from the Fixed Path Poisson Model applied to the marginal data of Figure 1. See Figure 1 for RMSE's associated with each model fit.
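To make the parameters in Table 1 concrete, the sketch below evaluates two of the models at a given stimulus duration. It is illustrative only. The exponential form used for the Independent Sampling Model is assumed from the usual statement of that model and is not spelled out in this section, and a single rate is used rather than one per location. The Fixed Path Poisson curve follows the serial, exponential-stage description and treats location k as finished once at least k Poisson events have occurred. Function names are hypothetical.

```python
import math

def ism_prob(duration, inv_c, delay):
    """Independent Sampling Model sketch: exponential growth after a delay L.

    Assumed form: p(d) = 1 - exp(-(d - L) / (1/c)) for d > L, else 0.
    """
    if duration <= delay:
        return 0.0
    return 1.0 - math.exp(-(duration - delay) / inv_c)

def fixed_path_probs(duration, rate, t0, n_items=4):
    """Fixed Path Poisson sketch: serial stages with exponential durations.

    The k-th location in the path is done by `duration` when at least k
    Poisson events with mean rate * (duration - t0) have occurred.
    """
    mean = max(0.0, rate * (duration - t0))
    probs = []
    cdf = 0.0                                # running P(fewer than k events)
    for k in range(1, n_items + 1):
        cdf += math.exp(-mean) * mean ** (k - 1) / math.factorial(k - 1)
        probs.append(1.0 - cdf)
    return probs

# Table 1 parameters: Observer TB for the ISM, Observer EF for the Poisson model.
p_ism = ism_prob(126, inv_c=117.6, delay=84.6)
p_fixed = fixed_path_probs(80, rate=0.0717, t0=61.2)
print(round(p_ism, 3), [round(p, 3) for p in p_fixed])
```

Under the Fixed Path sketch the completion probabilities fall off along the path, since later locations must wait for earlier ones.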

 

 

Stimulus Duration (ms)   Observer TB   Observer SS   Observer EF
40                       3.18          7.94          8.70
50                       2.79          9.90          9.11
63                       3.70          5.13          11.8
80                       12.4          6.24          22.4
100                      6.00          22.9          23.2
126                      30.6          33.6          13.2
159                      11.4          31.6          0.62
200                      21.8          21.0          0.00

Table 2. Independence statistics computed from Eq 4 for three observers at 8 stimulus durations. Values that exceed 19.67 demonstrate evidence of inter-item dependencies. Stimulus durations of less than 80 ms for TB and SS and 63 ms for EF produce only chance performance, which will not contain dependencies. Longer stimulus durations produce evidence of inter-item dependencies, although Observer EF's performance reaches 1.0 at long stimulus durations and thus she cannot show inter-item dependencies at these durations.
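The comparison against the criterion can be made explicit with a short script. This is only a bookkeeping sketch: the statistics are copied from Table 2, the criterion of 19.67 is the value quoted in the caption, and the function name is hypothetical.

```python
CRITICAL = 19.67  # criterion quoted in the Table 2 caption

# Independence statistics from Table 2, keyed by stimulus duration (ms).
table2 = {
    40:  {"TB": 3.18, "SS": 7.94, "EF": 8.70},
    50:  {"TB": 2.79, "SS": 9.90, "EF": 9.11},
    63:  {"TB": 3.70, "SS": 5.13, "EF": 11.8},
    80:  {"TB": 12.4, "SS": 6.24, "EF": 22.4},
    100: {"TB": 6.00, "SS": 22.9, "EF": 23.2},
    126: {"TB": 30.6, "SS": 33.6, "EF": 13.2},
    159: {"TB": 11.4, "SS": 31.6, "EF": 0.62},
    200: {"TB": 21.8, "SS": 21.0, "EF": 0.00},
}

def dependent_durations(table, observer, critical=CRITICAL):
    """Return the durations at which an observer's statistic exceeds the criterion."""
    return [d for d, row in sorted(table.items()) if row[observer] > critical]

flagged = {obs: dependent_durations(table2, obs) for obs in ("TB", "SS", "EF")}
print(flagged)
```

Every flagged duration lies at or above the shortest duration that produced above-chance performance for that observer.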

 

Model: Independent Sampling Model
Core Assumptions: Processing proceeds in parallel and independently on all four stimulus locations. The processing time is exponentially distributed with a pre-processing delay. Each stimulus location has a separate processing rate.
Predicts Positive Inter-item Dependencies: no
Rank Order Accounts of Log-Likelihood: 4

Model: Fixed-Path Poisson Model
Core Assumptions: Processing occurs serially in a left-to-right order. The processing times for each location are exponentially distributed; the overall process has a pre-processing delay.
Predicts Positive Inter-item Dependencies: yes
Rank Order Accounts of Log-Likelihood: -

Model: Weighted Path Poisson Model
Core Assumptions: Identical to the Fixed-Path Poisson model, with the exception that processing can occur through any one of the 24 possible processing paths through the 4 stimulus locations.
Predicts Positive Inter-item Dependencies: yes
Rank Order Accounts of Log-Likelihood: 2

Model: Variable Attention Independent Sampling Model
Core Assumptions: As with the Independent Sampling Model, processing occurs in parallel with separate rates for the four stimulus locations. Attention is assumed to vary across trials (but remain fixed within a trial). This attention is assumed to be pseudo-normally distributed. The value of the attention parameter modulates the processing rate for each location by a multiplicative amount. Thus in a relatively high attentional state, all four processing rates will be high.
Predicts Positive Inter-item Dependencies: yes
Rank Order Accounts of Log-Likelihood: 1

Model: Variable Attention Independent Sampling Model- Restricted Variance
Core Assumptions: Same as the Variable Attention Independent Sampling Model, with the exception that the range of possible attentional states is restricted to a relatively narrow range that is consistent with relatively flat auto-correlation functions observed in the data.
Predicts Positive Inter-item Dependencies: yes
Rank Order Accounts of Log-Likelihood: 3

Table 3. Descriptions of the assumptions underlying the 4 models tested in the current work. The core assumptions listed describe the typical interpretations of the model; see Appendix A for alternative interpretations which provide isomorphic models.
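The way trial-to-trial variation in attention induces a positive dependence, even though items are independent within a trial, can be sketched as follows. This is an illustration under stated assumptions, not the fitted model: a coarse three-level discrete distribution stands in for the pseudo-normal attention distribution, the pre-processing delay is omitted, and the rates, attention levels, and function name are all hypothetical.

```python
import math

def joint_and_marginals(rates, attention_levels, weights, t):
    """Mix independent exponential items over trial-level attention states.

    Within a trial the items are independent given the attention multiplier;
    across trials the multiplier varies, which induces a positive dependence.
    Returns (marginal probabilities, P(items 0 and 1 both done by t)).
    """
    marginals = [0.0] * len(rates)
    joint01 = 0.0
    for a, wt in zip(attention_levels, weights):
        p = [1.0 - math.exp(-a * r * t) for r in rates]  # attention scales each rate
        for i, p_i in enumerate(p):
            marginals[i] += wt * p_i
        joint01 += wt * p[0] * p[1]                      # independence within a trial
    return marginals, joint01

rates = [0.020, 0.015, 0.010, 0.005]   # hypothetical per-location rates (1/ms)
levels = [0.5, 1.0, 1.5]               # hypothetical attention multipliers
weights = [0.25, 0.5, 0.25]            # hypothetical probabilities of each level
marg, joint = joint_and_marginals(rates, levels, weights, t=50.0)
dependence = joint / marg[1] - marg[0]  # P(C0 | C1) - P(C0)
print(f"P(C0 | C1) - P(C0) = {dependence:.4f}")
```

With a single attention level the same calculation returns zero dependence, which recovers the ordinary Independent Sampling Model.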

 

Observer TB

Stimulus Duration (ms)   Ave[P(Ci)]   Ave[P(Ci|Cj)-P(Ci)]   Fixed Path Poisson   Weighted Path Poisson   Variable Attention Independence
80                       0.130        0.080                 0.026                0.001                   0.002
100                      0.139        0.019                 0.043                0.008                   0.024
126                      0.301        0.171                 0.119                0.074                   0.082
159                      0.588        -0.003                0.140                0.107                   0.051
200                      0.676        0.065                 0.096                0.080                   0.031
RMSE                                                        0.027                0.040                   0.035

Observer SS

Stimulus Duration (ms)   Ave[P(Ci)]   Ave[P(Ci|Cj)-P(Ci)]   Fixed Path Poisson   Weighted Path Poisson   Variable Attention Independence
80                       0.102        -0.038                0.006                -0.002                  0.002
100                      0.245        0.068                 0.071                0.031                   0.021
126                      0.431        0.072                 0.134                0.078                   0.068
159                      0.574        0.063                 0.132                0.096                   0.075
200                      0.713        0.058                 0.082                0.068                   0.061
RMSE                                                        0.022                0.018                   0.018

Observer EF

Stimulus Duration (ms)   Ave[P(Ci)]   Ave[P(Ci|Cj)-P(Ci)]   Fixed Path Poisson   Weighted Path Poisson   Variable Attention Independence
63                       0.132        0.125                 0.004                0.007                   -0.005
80                       0.417        0.119                 0.140                0.083                   0.098
100                      0.757        0.051                 0.088                0.085                   0.026
126                      0.889        0.014                 0.023                0.031                   0.007
159                      0.951        -0.001                0.003                0.006                   0.001
200                      0.972        0.000                 0.000                0.000                   0.000
RMSE                                                        0.049                0.048                   0.053

Table 4. Inter-Item Dependency Statistic Ave[P(Ci|Cj)-P(Ci)], compared with the predictions of the Fixed-Path Poisson Model, the Weighted Path Poisson Model (described in a subsequent section), and the Variable Attention Independence Model. The Independent Sampling Model predicts that the values for Ave[P(Ci|Cj)-P(Ci)] will be 0.0. A sign test aggregating all stimulus durations for the three participants demonstrates that out of 16 computed dependencies, 12 are positive, which is significant at α = 0.05. This disconfirms the Independent Sampling Model. The Fixed Path Poisson, Weighted Path Poisson, and Variable Attention Independence models can all account for these dependencies.
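The sign test reported in the caption can be reproduced from the observed Ave[P(Ci|Cj)-P(Ci)] column of Table 4. The sketch below assumes a one-sided binomial test against a chance probability of 0.5, with the single zero value counted as not positive; the caption does not state these details, so they are assumptions.

```python
from math import comb

# Observed Ave[P(Ci|Cj)-P(Ci)] values from Table 4, by observer.
observed = {
    "TB": [0.080, 0.019, 0.171, -0.003, 0.065],
    "SS": [-0.038, 0.068, 0.072, 0.063, 0.058],
    "EF": [0.125, 0.119, 0.051, 0.014, -0.001, 0.000],
}

values = [v for obs_values in observed.values() for v in obs_values]
n = len(values)                      # number of computed dependencies
k = sum(v > 0 for v in values)       # number that are strictly positive

# One-sided binomial tail: probability of k or more positives out of n
# when positive and non-positive outcomes are equally likely.
p_one_sided = sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n
print(n, k, round(p_one_sided, 4))
```

Under these assumptions the tail probability falls below 0.05, in line with the caption.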

 

Dur.

Observer SS

Observer EF

Observer TB

40

28

5

6

0

17

2

5

0

36

2

1

0

5

1

1

1

5

0

4

0

5

1

0

0

4

1

1

0

1

1

1

0

7

0

0

0

0

0

1

0

0

0

0

0

2

0

0

0

50

42

1

2

0

28

2

2

0

41

4

4

0

5

1

0

0

2

0

0

0

1

0

0

0

2

0

0

0

1

0

0

0

3

0

1

0

0

1

0

0

0

1

0

0

0

0

0

0

63

37

2

4

0

23

1

2

0

38

1

2

0

5

1

0

0

3

1

2

0

3

0

0

0

5

0

0

0

2

0

1

1

9

1

0

0

0

0

0

0