
Temporal Inhibition in Two-Pulse Character

Detection and Identification Tasks

by

Thomas A. Busey

A dissertation submitted in partial fulfillment

of the requirements for the degree of

Doctor of Philosophy

University of Washington

1994

Approved by ______________________________________________________

(Chairperson of Supervisory Committee)

______________________________________________________

______________________________________________________

______________________________________________________

Program Authorized

to offer Degree_____________________________________________________

Date_____________________________________________________________

In presenting this dissertation in partial fulfillment of the requirements for the Doctoral degree at the University of Washington, I agree that the Library shall make its copies freely available for inspection. I further agree that extensive copying of this dissertation is allowable only for scholarly purposes, consistent with "fair use" as prescribed in the U.S. Copyright Law. Requests for copying or reproduction of this dissertation may be referred to University Microfilms, 1490 Eisenhower Place, P.O. Box 975, Ann Arbor, MI 48106, to whom the author has granted "the right to reproduce and sell (a) copies of the manuscript in microform and/or (b) printed copies of the manuscript made from microform."

Signature ______________________

Date __________________________

University of Washington

Abstract

Temporal Inhibition in Two-Pulse Character

Detection and Identification Tasks

by Thomas A. Busey

Chairperson of the Supervisory Committee: Professor Geoffrey R. Loftus

Department of Psychology

The visual system's response to a brief visual stimulus may be modeled by a linear filter that temporally blurs rapidly changing visual stimuli. Such a filter has previously been used in a theory of character recognition by Loftus and Busey (Loftus, Busey & Senders, 1993; Busey & Loftus, 1994). This theory differs from related theories of detection in several ways, most notably in that it assumes no temporal inhibition. Such inhibition is indicated when the visual system's response to a brief stimulus is initially excitatory and then becomes inhibitory. Ample evidence for temporal inhibition exists in the two-pulse detection literature, although these experiments all used simple stimuli, e.g. disks or gratings, in detection tasks. The no-inhibition assumption implicit in the Loftus and Busey theory was tested in both detection and identification tasks, using digit stimuli. Digits differ from gratings in that they require classification into one of many learned categories and have broad spatial frequency distributions. Data from both experiments were consistent with the previous detection literature, suggesting that the temporal characteristics of digits are no different from those of simpler stimuli. The data showed clear evidence for temporal inhibition in both detection and identification tasks. This finding disconfirms the present instantiation of the Loftus & Busey theory of character identification. When this theory was amended to include temporal inhibition, it gave a good account of the data.

TABLE OF CONTENTS

List of Figures

List of Tables

Introduction

Early Data: Measuring the Persistence of a Stimulus

Direct Evidence for Temporal Inhibition: Ikeda and Rashbass

The Summation Index s

Evidence for Temporal Inhibition

Variables Affecting Temporal Inhibition

Effects of Background Level

Effects of Stimulus Size

Effects of Positive- vs. Negative-Contrast Stimuli

Linear Filter Models of Two-Pulse Detection Data

Overview of Methodology

Characterization of the Physical Stimulus

An Overview of Linear Filter Model Components

A Linear-Filter Model of Character Identification Data

A Digit-Recall Task

Theoretical Overview

Initial Stages of Processing: The Sensory Response Function

Information Loss: The Sensory Threshold

Acquired Information: The Information Extraction Rate

Performance Predictions

Summary

Modeling Inhibition: Linear-Systems Theories of Detection Thresholds

The Linear-Filter Model of Sperling and Sondhi

Watson and Nachmias: Ellipse Models of Sinusoidal Gratings

Uchikawa & Ikeda: Chromatic Double Pulses

Roufs and Blommaert: Signal Perturbation Technique

den Brinker: Fourth-Order Impulse Response

Bowen: Peak Detection: Ratios of Peaks in # of Flashes Detected

Ohtani and Ejima: Sinusoidal Gratings

Summary of Two-Pulse Sensitivity Models

Empirical Evidence for Inhibition in Character Detection and Identification Tasks

Experiment 1: Threshold Detection of Pulsed Digits

Experiment 2: Identification of Different-Contrast Pulsed Digits

General Discussion

Conclusions

REFERENCES

LIST OF FIGURES

1. The longest dark interval between two pulses that provides a 75% detection probability

2. Summation curves (s) as a function of the interval between two flashes of the +- case for two adapting levels

3. Temporal response properties of positive and negative flashes, determined by Ikeda, 1965

4. Contrast thresholds for same-sign pulses and opposite-sign pulses

5. Summation between pulses presented at various interpulse delays

6. Detectability of double flashes plotted for various delay intervals t

7. Derivation of the sensory response function a(t) from the convolution of the physical contrast function f(t) with the impulse-response function g(t)

8. Summary of the model components shared by most linear systems theories

9. Typical data observed in a digit-recall task

10. Impulse response functions

11. Theoretical components of the linear filter model of character identification

12. Normalized impulse-response functions for various background adaptation levels for the Sperling and Sondhi model

13. Response properties of various linear filters

14. Hypothetical impulse response function h(t) from Eq. 22

15. Perturbation of the probe response by the test pulse response

16. Impulse response functions for two background adaptation levels, estimated using Roufs and Blommaert's signal perturbation technique

17. Two-pulse sensitivity as a function of SOA for two phase conditions, for different spatial frequencies

18. Response functions engendered by the positive-contrast pulse (top panel), the negative-contrast pulse at a 50 ms delay (middle panel) and the combined sensory response to both pulses presented together

19. Experiment 1 results for three observers

20. Response functions for same-sign and opposite-sign stimuli

21. Experiment 2 results for three observers

LIST OF TABLES

1. Summary of Stimulus Conditions and Results for Various Authors

2. Characteristics of Models of Two-Pulse Performance

3. Summary of Best-Fitting Model Parameters for Experiments 1-2. RMSEs are in units of log(1/contrast) for Experiment 1 and units of -ln(1-p) for Experiment 2


ACKNOWLEDGMENTS

I would primarily like to thank Dr. Geoffrey Loftus for his guidance and assistance during my graduate career. It is hard to imagine a more productive working relationship than the one we have shared. I would also like to thank Dr. Davida Teller for her assistance with the visual processing side of my research, and Dr. Earl Hunt for his assistance with the cognitive processing side of my research. Dr. John Palmer gave me invaluable help on a variety of topics, and Dr. George Wolford got me thinking about the possibility of band-pass filters in the first place.

DEDICATION

I dedicate this work to my wife Elizabeth, whose encouragement and support were invaluable for the completion of this dissertation.

Introduction

Suppose you were in charge of designing a visual system that could recognize alphanumeric characters. The input to the visual system would be light (photons), and the output would be some abstraction that symbolically represents the identity of the letter or number. Your first task involves capturing the photons that originated from the character. However, photons by their very nature are quantal (discrete) events. Thus, no two photons arrive at exactly the same time. A visual system that reported the locations of arriving photons with infinite temporal accuracy would report simply a flurry of pinpoints, each representing one photon capture. In order to provide a coherent representation of a scene, the visual system must combine information over time. This integration temporally groups photons coming from the same object but at different times, and provides a fundamental basis of the initial stages of object recognition: providing temporal coherence to disparate photon arrival events.

The response to a stimulus may be characterized by a temporal wave form that represents the result of temporally grouping the arriving photons into a response that varies over time. Upon stimulus onset this response may rise to some level, and then decay away sometime after stimulus offset. This wave form specifies the magnitude of the visual system's response at each time t from stimulus onset, and is termed the sensory response function (Loftus, Busey & Senders, 1993; Busey & Loftus, 1994). All subsequent visual processing is based on this initial representation, and thus the first step in characterizing the mechanisms that underlie object recognition must address the nature of this initial temporal representation.

The nature of this sensory response function has previously been shown to be important for models of character identification tasks (Loftus, Busey & Senders, 1993; Busey & Loftus, 1994), intensity-duration tradeoffs in information processing tasks (Loftus & Ruthruff, 1994), completeness ratings and temporal integration performance (Loftus and Irwin, 1994), synchrony judgment tasks (Hogben & Di Lollo, 1974; Matin & Bowen, 1976) and attention tasks (McLean & Loftus, in preparation). In general the shape of the sensory response function determines the visibility of the stimulus and the duration of its persistence, which in turn affects the amount of information that can be extracted from the stimulus and the amount of interaction with subsequent stimuli such as backward masks.

Many researchers attempting to characterize the nature of the sensory response function have used a common paradigm. They present a short-duration stimulus such as a grating or a patch for a single presentation (or pulse), and then re-present the same stimulus a few tens of milliseconds later as a second pulse. The degree to which the two pulses interact provides an estimate of the duration of the persistence of the first stimulus and thus the integration interval. The time between pulses is varied to find the longest interval that still provides some evidence of interaction between the percepts of the two pulses. From this estimate of the integration interval, mathematical models are proposed that derive the shape of the sensory response function. This paradigm has been used with disks (e.g. Bouman and van den Brink, 1952; Blackwell, 1963; Purcell and Stewart, 1971), gratings (Watson and Nachmias, 1977), Gaussian patches (Bergen and Wilson, 1985), two half displays (Hogben & Di Lollo, 1974) and digits (Busey and Loftus, 1994) in both detection and identification tasks.

Historically as researchers derived estimates of the integration interval, several noted that the form of the interaction between the sensations elicited by the two pulses was more complex than simple summation. Ikeda (1965) provided surprising evidence for an inhibitory interaction at some delay intervals: When two 12.5 ms pulses were separated by a 40-70 ms interstimulus interval, performance was worse than what would be expected based on probability summation of the individual responses. It appears from these data that the response to the first pulse interfered with or inhibited the processing of the second pulse.

Why might the visual system inhibit its own signal? What advantage does inhibition provide, and under what circumstances does it exist? Some clues come from the spatial domain: lateral inhibition sharpens the percept of an edge such that receptors responding to the dark side of an edge are further inhibited from responding by inhibition coming from receptors responding to the light side of an edge. We might expect to find similar mechanisms in the temporal domain, as the visual system attempts to sharpen a temporal edge caused by the appearance or disappearance of an object.

Investigating the temporal facilitory and inhibitory mechanisms that support the identification of complex patterns such as characters is the focus of this dissertation. The dependence of the information-processing mechanisms on the initial sensory representation has been previously demonstrated (Loftus, Busey & Senders, 1993; Busey & Loftus, 1994). However, an important difference exists between this work and the models that have examined the temporal mechanisms that support detection tasks. The model of character identification does not include temporal inhibitory mechanisms, while nearly all of the models of detection task performance do include temporal inhibition.

Temporal inhibition is important for two reasons. First, the degree of temporal inhibition affects how two stimuli separated in time interact, such that the second stimulus may be suppressed by the inhibitory actions of the first stimulus. This affects the visibility of the second stimulus, and thus subsequent information extraction mechanisms as well. Second, temporal inhibition increases the rate at which the visual system's response decays away, which in turn improves the visual system's temporal sensitivity and increases the ability to detect change. The information extraction processes require temporal segregation of the to-be-extracted information from irrelevant information such as a mask, and thus the ability to detect change also affects further information acquisition.

To examine the facilitory and inhibitory mechanisms that support subsequent information extraction processes in a character identification task, I first provide a historical review of the two-pulse paradigm research leading up to two experiments that first suggested that temporal inhibitory mechanisms were at work. I then summarize a theory that does not assume temporal inhibition and that has been successfully used to model performance in a character identification task (Loftus, Busey & Senders, 1993; Busey & Loftus, 1994). To test the adequacy of this theory, I then present empirical evidence from a detection task and an identification task, both using characters, that directly addresses the central question of this dissertation: do the temporal summation mechanisms that subserve character identification exhibit inhibition?

In this dissertation I focus primarily on two-pulse presentation paradigms that use two presentations in the same location. Previously this has been shown to provide the best evidence for temporal inhibition (Ikeda, 1965; 1986). Outside of the scope of this work are completeness tasks (Hogben & Di Lollo, 1974), synchrony judgment tasks (Matin & Bowen, 1976), metacontrast and backward masking (Turvey, 1973), flicker fusion tasks (e.g. de Lange, 1958) and tasks involving moving images (e.g. Watson & Ahumada, 1985).

Early Data: Measuring the Persistence of a Stimulus

The initial attempts to understand the temporal processing of a stimulus focused on the persistence of a stimulus beyond its physical offset, rather than the existence of temporal inhibition. Plateau (1829; in Boynton, 1972) provided some of the first empirical evidence that a percept of a stimulus outlasts its duration, although Plateau noted that Aristotle remarked upon persistence in the form of afterimages in the fourth century BC. d'Arcy (1773; from Boynton, 1972) mounted a luminous stimulus on a rotating wheel, and found that a minimum of 0.133 sec was required to produce an apparent circle. van den Brink (1957) and Pollack (1953) suggested that movement can on occasion produce an increase in the temporal resolution of the visual system. However, such interactions between time and space are not well understood, and thus the present work excludes studies that employ moving images.

Bouman and van den Brink (1952) conducted the first real attempts to measure the persistence of a stimulus, by quantifying the degree to which two successive stimuli interact. For the spatially contiguous stimuli of interest here they used a dual presentation of a 1' disk 10 ms in duration presented 11° or 5° temporally from the fovea. The two flashes were separated by a variable interstimulus interval (ISI) that ranged from 10 ms to 210 ms. The two disks were either red, green, or alternated between the two colors. They adjusted the luminance of a single presentation such that it was detected about 30 percent of the time, and then measured the detection probability of the dual flash stimulus at various ISI's. The probability-of-seeing function is simply the probability of detecting the dual flash stimulus at each ISI.

From these functions and the baseline probability of detecting a single presentation they were able to compute a function that represents the degree of interaction for each stimulus type. This transformation has the effect of factoring out the differences in single-pulse detectability and provides for comparisons across stimulus type. Surprisingly they found no differences in this degree-of-interaction function for different colors or at different retinal locations. In addition, they found no evidence for inhibitory effects between the two stimuli. This second finding could have resulted from a number of factors: first, the stimuli were small, 1' in diameter. Second, their choice of colored stimuli may have influenced the inhibitory mechanisms: Uchikawa and Ikeda (1986) performed a more rigorous study on the effects of wavelength on the inhibitory mechanisms, and found no inhibition for chromatic stimuli, although they did report the expected inhibition for gray stimuli.

Persistence of a percept as measured by two successive presentations of the same stimulus was later systematically studied by Mahneke (1958). He determined the shortest-perceptible dark interval between two supra-threshold 1-degree pulses. He reasoned that if two pulses with a very short interval were seen as a single percept, then that interval was shorter than the temporal resolving power of the visual system. By varying stimulus duration and finding an interstimulus interval (ISI) that produces a percept of two distinct flashes on a given percentage of the trials, Mahneke produced a curve that he claimed characterized the temporal resolution of the visual system. This curve, shown in Figure 1, demonstrates that as the duration of the stimulus increases, the length of the dark interval required to perceive the two pulses as distinct grows shorter. This has become known as the inverse-duration effect: as stimulus duration increases, the persistence of that stimulus beyond stimulus offset decreases.

There are methodological difficulties with this study, however. On each trial, Mahneke's subjects would report whether they saw a single pulse or two pulses. However, it has been known for quite some time that even a single pulse can sometimes be seen as two pulses. Dunlap (1915, from Boynton, 1972) presented one or two pulses to subjects and found that for 20 ms presentations a single pulse was reported as double almost half of the time. This creates interpretational problems for two-pulse experiments, since it appears that the criterion used by observers plays an important role in determining whether they report one or two pulses. As a result of this difficulty, Kietzman and Sutton (1968) conclude that "... the presentation of two pulses of light separated by a brief and variable interpulse interval is not a sufficient set of operations to measure temporal resolution" (p. 300). Despite this pessimistic conclusion, the finding that a single pulse sometimes appears as two pulses suggests that the visual system's response to a brief stimulus is not a straightforward facilitory response.

Figure 1. The longest dark interval (interstimulus interval, ISI) between two pulses that provides a 75% detection probability. As the duration of the two pulses increases, the threshold ISI decreases. From this Mahneke (1958) infers that the persistence of a stimulus decreases as stimulus duration increases, which has become known as the inverse-duration effect. Figure adapted from Boynton (1972).

Blackwell (1963) provides more direct evidence for temporal inhibition. He used a two-pulse paradigm with an 18' disk on two different background levels. When the disk was presented for two 2.5 ms pulses separated by a variable delay, the probability-of-seeing function for the two-pulse presentation fell below that predicted by probability summation based on the response to a single pulse. This suggests that the response to the first pulse was interfering with the processing of the second pulse. This inhibition was maximal at a 56 ms delay between the two pulses, and was found only when the pulses were presented on the brighter background of 10 ft-lamberts. When the same stimuli were presented on the lower background level (0 ft-lamberts), no indication of temporal inhibition was found.
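The probability-summation benchmark used in this comparison can be sketched as follows. This is a minimal illustration that assumes the standard independence rule, under which a two-pulse stimulus is detected whenever at least one pulse is detected. The function names and the detection probabilities are hypothetical and are not taken from Blackwell's data.

```python
def probability_summation(p_first: float, p_second: float) -> float:
    """Probability that at least one of two independently detected pulses is seen."""
    return 1.0 - (1.0 - p_first) * (1.0 - p_second)


def shows_inhibition(p_observed: float, p_first: float, p_second: float) -> bool:
    """True when observed two-pulse detection falls below the independence prediction."""
    return p_observed < probability_summation(p_first, p_second)


# Hypothetical example: each pulse alone is detected on 30% of trials.
p_independent = probability_summation(0.30, 0.30)
print(round(p_independent, 2))             # independence benchmark
print(shows_inhibition(0.40, 0.30, 0.30))  # an observed 40% lies below the benchmark
```

Under these assumed numbers, any observed two-pulse detection rate below the benchmark would count as evidence that the first pulse interfered with the second.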

Direct Evidence for Temporal Inhibition: Ikeda (1965) and Rashbass (1970)

Thus far we have seen suggestive but somewhat inconclusive evidence for inhibitory effects in the temporal processing of stimuli. In part this is due to the questions asked by the individual researchers, who were interested in the duration of the persistence of a single stimulus, not the nature of that persistence. However, two studies directly address the nature of the inhibitory mechanisms.

The Summation Index s

Ikeda (1965) provided the first direct evidence for inhibition between the two pulses. To provide a quantification of the degree of facilitation or inhibition between two pulses, he developed a summation index s as:

$$s = \log(2) - \log(S_1 + S_2) \qquad \text{(Eq. 1)}$$

S1 is the contrast of the first pulse in a two-pulse presentation divided by the threshold contrast for the first pulse when presented alone. Likewise, S2 is the contrast of the second pulse divided by the threshold contrast for the second pulse. Thus,

$$S_1 = \frac{L_{1m}}{L_{1a}}, \qquad S_2 = \frac{L_{2m}}{L_{2a}} \qquad \text{(Eq. 2)}$$

where L1m and L2m are the threshold radiances of stimuli 1 and 2 when they are presented together as a mixture, and L1a and L2a are the threshold radiances of stimuli 1 and 2 when presented alone. Because there are two pulses in the stimulus, the experimenter is free to vary either or both pulse luminances in the mixture until a 50% detection threshold is reached. The obtained luminances then become L1m and L2m for Eq. 2.

S1 and S2 may each be thought of as a pulse contrast expressed relative to the single-pulse threshold contrast (L1a or L2a); quantities scaled this way are often called "sensation magnitudes" (Watson, 1982). The sum of S1 and S2 provides a measure of the efficiency of the two pulses when separated in time, compared to their individual contrast thresholds. Under conditions of perfect summation we would expect S1 + S2 = 1.0. To see why, first consider the limiting case of only a single pulse. In this case S2 = 0 and L1m would equal L1a, which gives S1 = 1.0. When the stimulus contains two pulses, S1 + S2 = 1.0 implies that the fractional contributions of the two pulses add linearly to exactly one threshold's worth of stimulation. Because there are two pulses in the mixture, we would expect that the contrast of each pulse would be less than its individual threshold to give the same 50% detection performance. For example, if the two pulses summate perfectly and the first pulse is set to half of its individual contrast threshold, then the second pulse would have to be set to half of its individual contrast threshold to make the stimulus detectable 50% of the time.

Consider a more concrete example. In a given condition, L1m may be set by the experimenter to 40% the contrast of the threshold value of the first pulse (L1a) giving S1 = 0.40. The contrast of the second pulse (L2m) is then varied until the observer detects the stimulus 50% of the time. If L2m turned out to be 60% of the threshold contrast of the second pulse (L2a), then S2 equals 0.60 and we would have S1 + S2 = 1.0, which implies perfect summation of the two pulses.

If there is less than perfect summation of two pulses, S1 + S2 will be greater than 1.0, implying that the two pulses are less efficient when spread in time than they would be if superimposed. This causes s to become less than log(2). If the two flashes become sufficiently separated in time to be treated as independent events, the summation index will have a value of approximately 0.1, attributable to probability summation. A finding of the measured summation index below 0.1 implicates inhibition from the processing of the first stimulus interfering with the percept of the second stimulus.

The summation index s may take on negative values. Because s = log(2) - log(S1 + S2), s is negative whenever S1 + S2 exceeds 2.0. In this case the mixture contrasts of the two pulses had to be, on average, greater than their individual contrast thresholds in order to achieve the detection threshold.
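The arithmetic behind Eq. 1 and Eq. 2 can be checked with a short sketch. It assumes base-10 logarithms, which is consistent with the value of 0.30 quoted for log(2) in the text. The function names are illustrative only, and the numerical inputs are the hypothetical 40%/60% example from above rather than measured data.

```python
import math


def relative_contrast(mixture_threshold: float, alone_threshold: float) -> float:
    """Eq. 2: pulse contrast in the mixture relative to its threshold when presented alone."""
    return mixture_threshold / alone_threshold


def summation_index(s1: float, s2: float) -> float:
    """Eq. 1, assuming base-10 logarithms."""
    return math.log10(2.0) - math.log10(s1 + s2)


def total_relative_contrast(s: float) -> float:
    """Invert Eq. 1: the value of S1 + S2 that corresponds to a given summation index."""
    return 2.0 / (10.0 ** s)


# Perfect summation: first pulse at 40% and second pulse at 60% of their own thresholds.
s1 = relative_contrast(0.40, 1.0)
s2 = relative_contrast(0.60, 1.0)
print(round(summation_index(s1, s2), 3))     # equals log10(2), about 0.301

# Each pulse must be raised to its own single-pulse threshold: s = 0.
print(round(summation_index(1.0, 1.0), 3))

# The probability-summation level s = 0.1 corresponds to S1 + S2 of about 1.59.
print(round(total_relative_contrast(0.1), 2))
```

The last line shows why a summation index near 0.1 is treated as the independence baseline: the two pulses together need roughly 1.6 thresholds' worth of contrast, less than the 2.0 required when s = 0.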

Evidence for Temporal Inhibition

Ikeda used four types of stimuli to investigate temporal inhibition: ++, +-, -+ and --, where + indicates a positive-contrast pulse and - indicates a negative-contrast pulse. These pulses were separated by various interpulse delays. In general he found that the absolute sign of the contrast was irrelevant; what mattered was whether the two pulses had the same or opposite signs. Thus results from only two types of conditions, same-signed and opposite-signed, are described below.

Two successive same-signed stimuli (++ or --) with a short ISI do not summate perfectly, and when the summation index s is plotted against ISI the curve drops from a value of 0.30 (approximately log(2)) at 0 ISI to a value of 0 at an ISI of 50-70 ms. The curve rises again for longer delays. Figure 2 shows this curve. The drop of the summation index below the probability summation prediction of 0.1 implies that the response to the second pulse is being inhibited by the response to the first pulse.

Further evidence for inhibition comes from two opposite-signed pulses (the +- case and the -+ case). For very brief ISI's, the summation index falls below -0.4, indicating the difficulty that the visual system has in detecting a positive followed by a negative pulse. The responses to the two pulses simply counteract each other. However, as the ISI is increased to 40 ms, the summation index actually rises above 0.1, indicating that the response to the first pulse has a facilitory effect on the response to the second pulse, with the net result that a positive pulse followed by a negative pulse 40 ms later is easier to detect than two positive pulses separated by 40 ms. Figure 2 shows the summation index for the +- case and for the ++ and -- cases.

The summation-index curves shown in Figure 2 for the two types of stimuli (++ and +-) are complementary: the ++ and -- cases show inhibitory effects at ISIs between 40 and 70 ms, and the +- and -+ cases show facilitory effects in the same range. Thus Ikeda proposed that the initial stages of visual processing act as a temporal filter. A filter in general passes some types of information and blocks others, and in this case the temporal filter passes slowly changing information while failing to represent rapidly changing information. The result is a filter that temporally blurs the visual input, such that a fast-changing stimulus may be seen as a stable image.

Based on the results shown in Figure 2 and his intuition of a temporal filter, Ikeda proposed that the response to each pulse was biphasic. This response has a facilitory lobe until 40 ms following stimulus onset, and a negative lobe until approximately 120 ms following stimulus onset. A negative pulse produced a temporal response that was a mirror image of the positive case: it has an inhibitory response for the first 40 ms, followed by a facilitory response until 120 ms following onset. Figure 3 shows these functions, which are meant to represent some hypothetical response in the visual system, such as rate of neuronal firing per second. Although Ikeda intuited that the initial stages of visual processing could be modeled by a linear temporal filter and provided these rough sketches of the temporal responses to his 12.5 ms pulses, he was unable to derive a more fundamental representation of the temporal response properties, the impulse response function. This function will be discussed further below in the description of linear-filter modeling.

Figure 2. Summation curves (s) as a function of the interval between two flashes of the +- case for two adapting levels. Solid lines and crosses are for background levels of 328 trolands and dashed lines and open circles are for background levels of 61.2 trolands. The results from the ++ case at the same adapting levels are shown as well. For very brief ISI's, the summation index falls below -0.4, indicating the difficulty that the visual system has in detecting a positive followed by a negative pulse. The responses to the two pulses simply counteract each other. However, as the ISI is increased to 40 ms, the summation index actually rises above 0.1, indicating that the response to the first pulse has a facilitory effect on the response to the second pulse, with the net result that a positive pulse followed by a negative pulse 40 ms later is easier to detect than two positive pulses separated by 40 ms. Figure adapted from Ikeda (1965).

Figure 3. Temporal response properties of positive and negative flashes, determined by Ikeda, 1965. These represent derived temporal responses to single pulses. These functions are meant to represent some hypothetical response in the visual system, such as rate of neuronal firing per second. Figure adapted from Ikeda, 1986.
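The way a biphasic single-pulse response yields complementary results for same-signed and opposite-signed pairs can be illustrated with a toy calculation. Everything in this sketch is an assumption made for illustration. The piecewise-sinusoidal shape of biphasic_response, the 0.5 amplitude of its negative lobe, the linear addition of the two responses, and the use of the peak absolute response as a stand-in for detectability are not Ikeda's derivation; only the lobe boundaries (40 ms and 120 ms) come from the text.

```python
import math


def biphasic_response(t_ms: float) -> float:
    """Hypothetical response to one positive pulse: positive lobe 0-40 ms, negative lobe 40-120 ms."""
    if 0.0 <= t_ms < 40.0:
        return math.sin(math.pi * t_ms / 40.0)
    if 40.0 <= t_ms < 120.0:
        return -0.5 * math.sin(math.pi * (t_ms - 40.0) / 80.0)
    return 0.0


def peak_response(delay_ms: float, second_sign: int) -> float:
    """Largest absolute summed response to a + pulse followed, after delay_ms, by a pulse of the given sign."""
    return max(
        abs(biphasic_response(t) + second_sign * biphasic_response(t - delay_ms))
        for t in range(0, 301)  # 1 ms grid covering both responses
    )


for delay in (0, 20, 60, 150):
    same = peak_response(delay, +1)      # ++ pair
    opposite = peak_response(delay, -1)  # +- pair
    print(delay, round(same, 2), round(opposite, 2))
```

In this toy model the ++ pair has the larger peak at zero delay, while at a 60 ms delay the positive lobe of the second response overlaps the negative lobe of the first, so the +- pair has the larger peak. That is the same qualitative pattern as the summation curves in Figure 2, although the numerical values depend entirely on the assumed response shape.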

Following Ikeda's work, Rashbass (1970) developed a mathematical theory to model the degree of inhibition present between two successive stimuli. He adopted Ikeda's notion of a temporal filter, and suggested that the visual system takes a running average of the variance of the filter output. Threshold is reached if this average achieves a certain level. The variance is a useful quantity, because the variance represents change in the visual processing; monitoring the size of the variance makes the visual system a change detector. If the variance goes up, something is happening in the visual field.

The characteristics of the filter can be inferred from two-pulse data as well as flicker fusion data. This model is a more rudimentary application of temporal filters than the model developed simultaneously by Sperling and Sondhi (1968). In a separate section I review the Sperling and Sondhi model, but because of the clarity of the predictions and the relatively intuitive nature of the Rashbass model, I give an overview of his model below.

Figure 4. Contrast thresholds for same-sign pulses and opposite-sign pulses. Note how sensitivity to opposite-signed pulses is greater than sensitivity to same-signed pulses at 70 ms ISI. Curves are from Loftus and Busey's linear-filter model described below. Figure adapted from Rashbass, 1970.

Consider a two-pulse experiment in which the contrast of the two flashes is equal, and contrast is varied to find a performance threshold (say 75% detection rate). Threshold contrast is then plotted against the interpulse delay interval. Figure 4, with data from Rashbass (1970), shows these functions for 2 ms duration pulses.

This paradigm is not limited to pulse-pairs of equal contrast. The contrast of the first pulse can be fixed, and the contrast of the second pulse varied to find a contrast threshold. Once a threshold performance level has been reached, the final two contrasts of the two pulses become x-y coordinates in a two-dimensional plot. This paradigm is then repeated for many different first-pulse intensity increments and for positive and negative intensity increments. When the points are connected, they form an iso-threshold curve that may be approximated empirically by an ellipse, centered at zero, with minor and major axes on 45° angles. Figure 5 shows these functions for various delay intervals, with one plot for each delay interval. The curves are empirically fit ellipses.

If A and B are the intensity values used to give 75% detection performance for a given pulse pairing, then the equation of the ellipse is given by

$A^2 + B^2 + 2AB\,L_T = 1 \quad (-1 \le L_T \le 1)$ Eq. 3

where LT is a function of T, the interval between the two pulses, and the unit of measure for A and B is the intensity at threshold of a single pulse. LT represents the nature and magnitude of the interaction between the two pulses. This function provides a good empirical approximation to the threshold intensity data, but it also emerges from a mathematical development of the temporal response properties of the system. The curves from Figure 5 above are derived from Eq. 3, with the eccentricity of the ellipse at each delay interval given by LT.

The Rashbass model is fairly straightforward, and provides a basis for understanding all modern linear-filter modeling. A single pulse presented to the visual system will be transformed by a filter that is assumed to be linear. The wave form of this output need not be specified except to represent it as F(t). This wave form is then squared and integrated over the interval of interest:
Figure 5. Summation between pulses presented at various interpulse delays. Numbers represent interpulse delay for that set of data. Pulses were 2-ms in duration, presented on a background of 700-td. The abscissa and ordinate in each graph represent the threshold intensities for each pulse. Different points were determined as follows: at a given delay, say 10 ms, the luminance of one pulse was fixed and the luminance of the second pulse was varied to achieve a 75% detection rate. These two threshold luminances become a point on the 10 ms plot. This procedure is repeated for a variety of pulse 1 luminances, both positive and negative, to fill out the rest of the 10 ms plot. The curves are ellipses represented by Eq. 3. The eccentricity of the ellipse is given by LT, as described in the text. Adapted from Rashbass, 1970.

$S = \int_0^{t} F^2(t')\,dt'$ Eq. 4

where t is chosen to be large enough such that F(t) falls essentially to zero. Threshold is reached (the stimulus is correctly detected) when S reaches some constant criterion value Sc. After filtration by the visual system, a single pulse of magnitude A gives a response Af(t) which is the response of the filter (f) scaled by the intensity of the pulse (A). If we define Sc to be 1 then we have:

$\int_0^{t} [A f(t')]^2\,dt' = A^2 \int_0^{t} f^2(t')\,dt' = 1$ Eq. 5

for a single pulse at threshold. If a second pulse of intensity B follows a first pulse of intensity A by T ms, the combined output of the filter is given by

Af(t) + Bf(t-T) Eq. 6

Substituting Eq. 6 for Af(t) in Eq. 5, we have

$\int_0^{t} [A f(t') + B f(t' - T)]^2\,dt' = 1$ Eq. 7

After some algebraic manipulation (see Rashbass, 1970), we find:

$A^2 + B^2 + 2AB \int_0^{t} f(t')\,f(t' - T)\,dt' = 1$ Eq. 8

Only the integral depends upon the characteristics of the filter (and it depends on nothing else), and thus we can equate the integral from Eq. 8 with LT from Eq. 3. This completes Eq. 3 as a mathematical model, not merely a description of the data. The theory began with the assertion that the output of the filter is squared and integrated, which produces a model that predicts an ellipse. The elliptical prediction falls out of the initial squaring operation in Eq. 5. The major axis of the ellipse depends solely on LT, which in turn depends upon T, the interpulse interval. LT may be thought of as the degree to which the response to the first pulse interacts with the response to the second pulse.
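The link between the filter and the ellipse can be checked numerically. The sketch below is illustrative only: the biphasic filter `f` is a hypothetical difference of two gamma-shaped lobes chosen for the example (it is not Rashbass' filter, whose form cannot be recovered from his data), and the function names, step size and parameter values are all assumptions.

```python
import math

DT = 0.5       # integration step in ms (assumed)
T_MAX = 400.0  # long enough for the example filter to fall to zero

def lobe(t, n, tau):
    """Gamma-shaped lobe with unit area; zero before stimulus onset."""
    if t < 0:
        return 0.0
    x = t / tau
    return x ** (n - 1) * math.exp(-x) / (tau * math.factorial(n - 1))

def f(t):
    """Hypothetical biphasic filter: a fast excitatory lobe minus a slower inhibitory one."""
    return lobe(t, 5, 6.0) - 0.8 * lobe(t, 7, 8.0)

def interaction(T):
    """L_T: the integral of f(t) f(t - T), scaled so that L_0 = 1."""
    steps = int(T_MAX / DT)
    cross = sum(f(i * DT) * f(i * DT - T) for i in range(steps)) * DT
    power = sum(f(i * DT) * f(i * DT) for i in range(steps)) * DT
    return cross / power

def equal_pulse_threshold(T, same_sign=True):
    """Threshold |A| = |B| on the ellipse A^2 + B^2 + 2AB L_T = 1."""
    sign = 1.0 if same_sign else -1.0
    denom = 2.0 + 2.0 * sign * interaction(T)
    # Opposite-sign pulses at zero delay cancel completely: no finite threshold.
    return math.inf if denom <= 0.0 else 1.0 / math.sqrt(denom)

for T in (0, 20, 40, 60, 80):
    print(T, round(interaction(T), 3),
          round(equal_pulse_threshold(T, True), 3),
          round(equal_pulse_threshold(T, False), 3))
```

Thresholds are in units of the single-pulse threshold, as in Eq. 3. Whenever the interaction term is negative at some delay, the opposite-sign pair has the lower threshold at that delay, which is the signature of inhibition described in the text.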

The ellipses provided by Eqs. 3 and 8 have some additional characteristics that make this model appealing. First, ellipses are symmetric by rotation of 180° around the center, which means changing the sign of both of the pulses has no effect on the threshold (i.e. a ++ threshold is identical to a -- threshold, and the +- and -+ thresholds are the same). This fits with a number of observations (Ikeda, 1965; Rashbass, 1970; Watson and Nachmias, 1977; Meijer, van den Wildt and van den Brink, 1978). Second, symmetry by reflection in the line A = B means that the two flashes may be interchanged without changing the threshold. Thus the (A, B) threshold will be identical to the (B, A) threshold, indicating that the model is independent of sequence. This agrees with Rashbass' data and with data from Watson and Nachmias (1977). However, it is contradicted by Ikeda (1986), who found no such symmetry.

The squaring operation in the model has some intuitive characteristics as well. Basically the model states that the visual system keeps a running total of the squared deviations of the signal from the mean luminance, which is simply the variance, and responds when this variance rises. The variance of a signal increases when the signal undergoes change, and thus this model responds to changes in the stimulus, or, more specifically, its response is proportional to the amount of change in the stimulus.

Rashbass tests this model in a number of different paradigms, using different types of stimulus wave forms. In all cases he confirms the model, and uses LT as the index of summation between the two pulses. From this he derives summation curves as a function of ISI that are quite similar to the s summation curves of Ikeda (1965). LT represents the inverse Fourier transform of the squared amplitude response of the filter, which is the autocorrelation function. Unfortunately, phase information is lost in the transformation of a function to its autocorrelation, and thus, like Ikeda, Rashbass is unable to derive the form of the impulse-response function. Thus Rashbass is limited to saying that the impulse-response function exists and that it has a negative-going lobe at some 50-60 ms beyond stimulus onset. In support of the model, the transform of LT into the temporal frequency domain does resemble the temporal-contrast sensitivity function (TCSF) (Watson, 1986).

Variables Affecting Temporal Inhibition

Since the work of Ikeda (1965) and Rashbass (1970), a number of different researchers have investigated the influence of a variety of different variables on temporal inhibition. These can be summarized as the effects of background luminance, stimulus size and the sign of the pulses. Table 1 summarizes these findings, which are discussed in detail below.

Effects of Background Level

Blackwell (1963) found evidence for temporal inhibition only for stimuli shown at a background luminance of 10 ft-lamberts. Roufs (1973) used 1° stimuli on three different background levels. He found an inhibitory effect for stimuli shown at background levels of 42 trolands and 1200 trolands, but not at 1 troland. Thus it appears that inhibition depends upon the background level, although the critical background level is not fully defined. As the visual system becomes exposed to higher and higher light levels, the integration period grows shorter and shorter. This fits with the intuitive model that a visual system should integrate over longer periods in low light levels to gain sensitivity, but integrate over shorter intervals in higher light levels to gain responsiveness to changes in the visual scene.

Effects of Stimulus Size

Meijer, van den Wildt and van den Brink (1978) provided the clearest evidence that temporal inhibition depends upon the size of the stimulus. They performed a parametric variation of disk size and delay interval for two same-sign pulses. These findings are reproduced in Figure 6, which shows no inhibition for small stimuli, but clear evidence for lateral inhibition as determined by the notch in the curves for larger stimuli.
Table 1. Summary of Stimulus Conditions and Results for Various Authors
| Authors | Test size | Adapting level | Inhibition | t = temporal delay of inhibition | t = critical duration | Conditions |
|---|---|---|---|---|---|---|
| van den Brink & Bouman (1952) | 1' | 0-20 ml | no | | t = 80 ms; decreases as adaptation increases | ++ |
| Blackwell (1963) | 18' | 0 ft-l | no | -- | 20 ms | ++ |
| | | 10 ft-l | yes | t = 56 ms | | |
| Ikeda (1965) | 30' | 61.2 td | yes | t = 70 ms | t = 10-20 ms | ++, +-, -+, -- |
| | | 328 td | yes | t = 53 ms | | |
| Rashbass (1970) | 17° | 700 td | yes | t = 66 ms | | ++, +-, -+, -- |
| Purcell & Stewart (1971) | 38' | 25 ft-l | yes | -- | | |
| Theodor (1972) | 10' | 25 ft-l | yes | -- | | ++ |
| Roufs (1973) | | 1 td | no | -- | | ++ |
| | | 42 td | yes | t = 56 ms | | |
| | | 1200 td | yes | t = 32 ms | | |
| Breitmeyer & Ganz (1977) | 1.0 c/deg | 5.0 ft-l | yes | t = 80 ms | | Gratings: ++ |
| | 10 c/deg | | no | | | |
| Watson & Nachmias (1977) | 2° x 1.5° | 15 cd/m² | | | | Gratings: ++, +-, -+, -- |
| | 1.8 c/deg | | yes | t = 60 ms | | |
| | 3.5 c/deg | | yes | t = 60 ms | | |
| | 7.0 c/deg | | no | | | |
| | 10 c/deg | | no | | | |
| Meijer, van den Wildt & van den Brink (1978) | 5.5' | 120 td | no | -- | | ++ and some -- |
| | 11' | | no | -- | | |
| | 11' -- | | no | -- | | |
| | 13.7' | | yes | t = 70 ms | | |
| | 66' -- | | yes | t = 70 ms | | |
| | 96' | | yes | t = 70 ms | | |
| Roufs (1981) | 0.8' | 1200 td | no | -- | | |
| | | | yes | t = 60 ms | | |
| Musselwhite & Jeffreys (1983) | grid of dots 16' | 500 cd/m² | no | | | recorded VEP's; ++, +-, -+, -- |
| | | 2.5 cd/m² | no | | | |
| Bergen & Wilson (1985) | | 17.2 cd/m² | yes | t = 50 ms | | Difference of gaussian: +++, -+- |
| | 0.125° | | yes | t = 50 ms | | |
| Uchikawa & Ikeda (1986) | 45' | 0 cd/m² | no | -- | | ++, +-, -+, -- for chromatic & achromatic stimuli |
| | chrom. | | no | -- | | |
| | achrom. | | yes | t = 40 ms | | |
Notes: Adapted from Ikeda (1986), Boynton (1972) and individual articles. Some paradigms allow an inference of inhibition or lack of inhibition, but do not provide estimates of t. Later works do not report the critical duration t, as they asked different empirical questions.

Figure 6. Detectability of double flashes plotted for various delay intervals t. Parameters are stimulus size in arc min. Adapted from Meijer et al., 1978.

Breitmeyer and Ganz (1977) used sine-wave gratings to examine the effect of spatial frequency on temporal inhibition. They found evidence for temporal inhibition at an 80-100 ms delay only for low spatial frequency stimuli (1.0 c/deg). They attribute this inhibition to the transient channels of the visual system. They found no evidence for temporal inhibition with higher spatial frequency stimuli (10.0 c/deg).

Watson and Nachmias (1977) found a similar result using both in-phase and out-of-phase sine-wave gratings at various spatial frequencies. Using gratings on a 120 td background field, they found evidence for temporal inhibition at delay intervals of 70 ms for gratings that had a spatial frequency of 1.75 c/deg and 3.5 c/deg. No evidence for inhibition was found for gratings of higher spatial frequencies (7.0 and 10.5 c/deg). Roufs (1981) replicated these results, finding temporal inhibition with a 1° stimulus but not for a 0.8' stimulus.

Based on a meta-analysis of these findings, Ikeda (1986) concluded that inhibitory phenomena show up only when the stimulus size becomes larger than about 13 min of arc. However, the temporal delay between the pulses at which inhibition is largest is not affected by the size of the stimulus beyond 13 min of arc, implicating lateral inhibitory mechanisms as suggested by Purcell and Stewart (1971).

Effects of Positive- vs. Negative-Contrast Stimuli

Blackwell (1963) was the first to note that thresholds for positive-contrast stimuli were almost identical to thresholds for negative-contrast stimuli. This finding was replicated by the elliptical data found by Rashbass (1970, see Figure 5) and Watson and Nachmias (1977). Thus a positive-positive pulse pair has the same contrast threshold as the absolute value of the contrast threshold of a negative-negative pulse pair. This implicates contrast or changes from the background luminance, rather than absolute luminance, as the important variable for detection performance, and suggests that the visual system rectifies the signal at some point prior to making a detection determination. However, absolute luminance does bear upon the characteristics of temporal inhibition.

Linear Filter Models of Two-Pulse Detection Data

The models of Ikeda and Rashbass provide a springboard for the modern linear filter models that have been proposed to account for many of the relationships between stimulus variables and temporal sensitivity summarized in Table 1. The next three sections contain an overview of the methodology used to refine and test these linear filter models, a description of the representation of a physical stimulus, and an overview of the different components of a linear filter model. This last section is designed to provide a discussion of the components that many models share; specific differences between the individual models will be discussed in later sections. Virtually all linear-filter models in the literature are designed to predict performance in detection tasks, and thus the general overview provided below is particularly germane to detection tasks. The linear filter theory that I have been using to model performance in a more cognitive character identification task (Loftus, Busey & Senders, 1993; Busey & Loftus, 1994) differs from the other models, and these differences will be discussed in a summary of this character identification theory.

Overview of Methodology

The visual system can be systematically studied as a physical system, much like physical systems are studied in engineering contexts. Researchers apply inputs in the form of visual stimuli, and the resulting outputs can be measured. The inputs are real-valued functions of time that typically describe the contrast or luminance of the stimulus over time. The outputs are also considered to be real-valued functions, representing some amount of activity within the visual system. In many cases the level of modeling consists of functions that are meant to represent the end product of a number of individual internal processes; as such it would not be reasonable to anticipate a direct neurophysiological analog to model components. Nevertheless, the model is typically able to provide an overall description of the behavior of the system.

Typically the responses acquired during a psychophysical experiment are keypresses, not real-time activation functions. Thus the internal workings of the model that are assumed to represent hypothetical internal behavior of the system must be inferred from these binary responses. For example, one portion of the theory might model the amount of activity that exists in the visual system in response to a visual stimulus. Presumably the observer uses some aspect of this activation to make a keypress. Theories of performance in these two-pulse paradigms typically parallel this dichotomy, with a model component that represents the amount of internal activation, as well as a component that represents the decision made by the observer based on this activation level. This allows the theory to make specific behavioral predictions that can be directly compared to the obtained data.

A complete model must contain a characterization of the physical stimulus (e.g. its contrast or spatial frequency components) and describe how the visual system represents that stimulus. The model must then describe how the observer uses some aspect of this representation to make a behavioral response. The following sections describe these model components.

Characterization of the Physical Stimulus

A visual stimulus may be characterized as changes in the intensity of light over space and time. Because we are only rarely concerned with the wavelength of the light, it is usually sufficient to characterize the stimulus as f(x, y, t), where x and y are horizontal and vertical coordinates, and t is time. The function f represents the intensity, or more often the contrast of the stimulus. Many models of temporal summation concentrate only on the temporal aspects of the stimulus, ignoring the spatial components, and thus the function f depends only on time: f(t). This representation is adopted here, since only a few models of spatially-separate stimuli will be discussed.

When f(t) represents the contrast of a pulse (or rectangular) stimulus, this function is zero prior to stimulus onset, equals the contrast of the stimulus during stimulus presentation, and becomes zero again at stimulus offset. For the stimuli of interest here, a single pulse may be defined as:

$f(t) = \begin{cases} 0 & t < 0,\ t > d \\ F & 0 \le t \le d \end{cases}$ Eq. 9

where F is stimulus contrast, d is the stimulus duration, and t is time since stimulus onset. Other contrast functions are permissible, but rectangular functions have the advantage of being easily generated and easily modeled.
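A minimal sketch of the rectangular contrast function, extended to the two-pulse case used throughout this literature. The function names, the `onset` argument, and the convention that ISI runs from the offset of the first pulse to the onset of the second are assumptions made for illustration.

```python
def pulse(t, contrast, duration, onset=0.0):
    """Rectangular contrast function of Eq. 9, shifted to begin at `onset` (ms)."""
    return contrast if onset <= t <= onset + duration else 0.0

def two_pulse(t, contrast1, contrast2, duration, isi):
    """Two equal-duration pulses; `isi` (> 0) runs from first offset to second onset."""
    second_onset = duration + isi
    return (pulse(t, contrast1, duration)
            + pulse(t, contrast2, duration, onset=second_onset))

# A +- pair: 40-ms pulses at 5% contrast separated by a 10-ms ISI.
for t in (-1.0, 0.0, 40.0, 45.0, 55.0, 95.0):
    print(t, two_pulse(t, 0.05, -0.05, 40.0, 10.0))
```

A negative `contrast` value stands for a decrement relative to the background, which is how the +- and -- conditions are represented.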

An Overview of Linear Filter Model Components

To completely characterize a system a theory must include a specification of the stimulus in terms of theoretically relevant units, and provide a prediction of the resulting output. The range of possible inputs is infinite; however, certain assumptions about the behavior of the system, in particular that it is linear, can make this problem tractable.

Linear filter models typically incorporate three different components. The first component assumes that the visual system acts as a low-pass linear filter that can be characterized by a candidate impulse response function. This function, which determines the temporal sensitivity of the filter, is then used to derive a hypothetical response in the visual system that corresponds to the representation of a given stimulus presentation. Some representation analogous to the system response is assumed to exist in the visual system, but this response alone is not sufficient for the observer to make a response.

The second component of most models is a process of rectification, transduction, and integration. Together these provide further processing of the visual signal, to account for various relationships summarized in Table 1.

The output of this integration process is then used for further perceptual processing, and to make a response. Linear filter models of detection performance assume that if the output of the integration component exceeds some internal threshold value, then a "yes" response is made by the detector component. Other models assume a more complex information extraction process based on other model components and mechanisms.

More specific details of each of these three components are provided in the next section.

Generating the Visual System's Response: the Sensory Response Function

The visual system's initial response to a stimulus depends on two factors. The first is the stimulus contrast function, as defined by Eq. 9. The second is the temporal characteristics of the low-pass filter that is assumed to represent the initial stages of stimulus processing. These can be described by an impulse response function, which is the visual system's response to an impulse. Along with an assumption of linearity in the filter, the stimulus contrast function f(t) and the impulse response function uniquely define the sensory response function for any arbitrary stimulus contrast function f(t).

To derive the visual system's response to any arbitrary stimulus, define g(t) as the impulse response function that represents the temporal response characteristics of the low-pass filter. Authors of various models have chosen different forms of this impulse-response function, and an example impulse response function is shown in the top panel of Figure 7. The sensory response a(t) results from the convolution of g(t) with f(t):

$a(t) = f(t) * g(t) = \int_{-\infty}^{\infty} f(t')\,g(t - t')\,dt'$ Eq. 10

A graphical representation of this convolution process is presented in Figure 7. The physical stimulus f(t) may be broken down into a series of impulses, or very brief flashes, each of which is infinitely brief in duration, infinitely intense, and has unit area. When an impulse is presented to the low-pass filter, an impulse-response function results. Thus the impulse-response function provides a characterization of the temporal response properties of the low-pass filter. For the purposes of exposition the impulse may be approximated by a 10 ms presentation, although none of the linear filter models make this approximation. The lower-left panel of Figure 7 shows a 40 ms presentation broken down into four 10-ms impulses. Each of these impulses will engender an impulse-response function; these are shown as the small curves in the lower-right panel of Figure 7. The sensory response function a(t) is simply the sum of these four impulse response functions.
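The summation of shifted impulse responses described above can be sketched as a Riemann sum. This is an illustration under stated assumptions: the gamma-shaped `impulse_response` and its parameter values are one hypothetical choice of g(t), and the 1-ms step is far finer than the 10-ms slices used for exposition in Figure 7.

```python
import math

DT = 1.0  # time step in ms (assumed)

def impulse_response(t, n=9, tau=5.0):
    """Example g(t): a gamma-shaped, unit-area low-pass impulse response."""
    if t < 0:
        return 0.0
    x = t / tau
    return x ** (n - 1) * math.exp(-x) / (tau * math.factorial(n - 1))

def stimulus(t, contrast=0.05, duration=40.0):
    """Rectangular contrast function f(t)."""
    return contrast if 0.0 <= t <= duration else 0.0

def sensory_response(t, f=stimulus, g=impulse_response, t_max=400.0):
    """Riemann-sum approximation of a(t) = integral of f(t') g(t - t') dt'."""
    steps = int(t_max / DT)
    # Each slice of the stimulus contributes one scaled, shifted copy of g.
    return sum(f(i * DT) * g(t - i * DT) for i in range(steps)) * DT

for t in (0, 20, 40, 60, 80, 120, 200):
    print(t, round(sensory_response(t), 5))
```

The printed values trace a temporally blurred version of the 40-ms pulse: the response rises after stimulus onset, outlasts stimulus offset, and then decays.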

Several points should be noted here. First, the assumption that the system is linear implies that the system obeys two principles (Watson, 1986). If the action of the system is denoted by an operator L, then a linear system conforms to

L[a f(t)] = a L[f(t)] Eq. 11

and

L[ f1(t) + f2(t)] = L[ f1(t)] + L[f2(t)] Eq. 12

where f1 and f2 are any two inputs. Eq. 11 implies that if, for example, we double the contrast of the stimulus, the response will be doubled as well. Eq. 12 implies that the response to two inputs is the sum of the individual responses to the two inputs presented separately.
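Both principles can be verified for a filter implemented as a discrete convolution. In the sketch below the exponential impulse response `g` and the two sampled pulses are hypothetical, with one sample per millisecond assumed.

```python
import math

def convolve(f, g):
    """Discrete convolution of two sampled signals (lists), unit time step."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

g = [math.exp(-k / 10.0) / 10.0 for k in range(100)]   # hypothetical impulse response
f1 = [0.05] * 20 + [0.0] * 60                          # positive pulse at onset
f2 = [0.0] * 50 + [-0.05] * 20 + [0.0] * 10            # later negative pulse

# Eq. 11: doubling the input doubles the output.
doubled = convolve([2 * v for v in f1], g)
homogeneous = all(math.isclose(a, 2 * b, abs_tol=1e-12)
                  for a, b in zip(doubled, convolve(f1, g)))

# Eq. 12: the response to the pair is the sum of the separate responses.
summed = convolve([u + v for u, v in zip(f1, f2)], g)
additive = all(math.isclose(s, a + b, abs_tol=1e-12)
               for s, a, b in zip(summed, convolve(f1, g), convolve(f2, g)))

print(homogeneous, additive)
```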

A more general conclusion can be made from Eq. 12. Since the system is linear, Eq. 12 implies that the low-pass filter output (the sensory response function) to any arbitrary stimulus contrast function can be determined simply by convolving the impulse response function of the filter with the stimulus contrast function f(t), as shown by Eq. 10. Thus despite the infinite range of possible inputs, only a few judiciously-chosen stimuli are required to test any given model.

Figure 7. Derivation of the sensory response function a(t) from the convolution of the physical contrast function f(t) with the impulse-response function g(t). Top panel: an impulse and an example impulse-response function. Lower-left panel: the physical stimulus contrast function broken down into four impulses. Lower-right panel: the four impulse-response functions resulting from the four impulses. The sensory response function a(t) is simply the sum of the four impulse response functions.

The linearity assumption provides an important implication for models that assume a monophasic impulse response function, that is, a non-negative impulse response function. For this type of filter, a fundamental consequence of linearity is that the area under the f(t) function will always be equal to the area under the a(t) function. Thus the area under the f(t) function in Figure 7 will always equal the area under the a(t) function, as long as a monophasic impulse response function is assumed.
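The equal-area property can be checked numerically. The sketch assumes that the impulse response is scaled to unit area, which is made explicit in the code; the gamma-shaped samples, the 1-ms step and the pulse parameters are hypothetical.

```python
import math

def convolve(f, g):
    """Discrete convolution of two sampled signals (lists), unit time step."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

def area(signal, dt=1.0):
    """Area under a sampled signal."""
    return sum(signal) * dt

# Hypothetical monophasic impulse response, sampled every 1 ms.
raw = [(k / 5.0) ** 8 * math.exp(-k / 5.0) for k in range(300)]
total = sum(raw)
g_mono = [v / total for v in raw]    # non-negative and scaled to unit area

f = [0.05] * 40                      # 40-ms pulse at 5% contrast
a = convolve(f, g_mono)

print(area(f), area(a))
```

The filter redistributes the stimulus over time without creating or removing area, so the two printed values agree up to rounding error.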

Rectification, Transduction and Integration of the Sensory Response

The convolution of the stimulus contrast function f(t) with the impulse response function g(t) provides the sensory response function a(t), which is assumed to represent a hypothetical response within the visual system. Before this response can be used to predict a behavioral response, it must be transformed. First, this signal must be rectified, to account for the finding that thresholds for positive- and negative-contrast stimuli are usually quite similar.

Second, this signal is transduced, meaning that prior to further processing the sensory response function is often assumed to be raised to an exponent. This exponent is often assumed to be 2 for those researchers assuming a square-law transduction process (Rashbass, 1970; Nachmias & Sansbury, 1974; Carlson & Klopfenstein, 1985), or in the range of 4-8 for a higher power transduction (Watson, 1978; Legge, 1980). However, linear transduction (an exponent of 1) has also been applied (Sachs, Nachmias & Robson, 1971; Graham, 1977; Busey & Loftus, 1994). These models all make different assumptions about how the sensory response function is used to make a response when applied to different tasks. Thus it is not surprising that different researchers adopt different exponents.

Finally, the rectified and transduced sensory response function is integrated, to compute some measure of the total response to the stimulus. The result of this integration process is used in the detector component, which is discussed below.
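The three steps can be written as a single operation on a sampled sensory response. This is a sketch rather than any particular author's model: the function name, the sample values and the 1-ms step are assumptions, and the exponent is left as a parameter because the models cited above differ on its value.

```python
def integrated_response(a, exponent=2.0, dt=1.0):
    """Rectify a(t), raise it to `exponent`, and integrate over time."""
    return sum(abs(v) ** exponent for v in a) * dt

a_positive = [0.0, 0.01, 0.03, 0.02, 0.0]   # hypothetical sampled a(t)
a_negative = [-v for v in a_positive]       # same stimulus with contrast sign reversed

for exponent in (1.0, 2.0, 4.0):
    print(exponent,
          integrated_response(a_positive, exponent),
          integrated_response(a_negative, exponent))
```

Because of the rectification step, the positive- and negative-contrast versions give identical totals at every exponent, in line with the similar thresholds reported for the two polarities.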

Further Processing: Information Extraction and Decision

The various models proposed by researchers working with linear filter models differ most in how they convert some aspect of the sensory response function into a response. The Loftus and Busey model of character identification assumes that information is extracted at a rate that is defined by both the magnitude of the sensory response function as well as a random-sampling model that assumes that information is acquired at random and with replacement. This extracted information is then used to make a response.

Alternatively, researchers studying detection tasks typically assume that an internal threshold exists such that if the sensory response function or the integrated sensory response function exceeds this threshold, the observer responds "yes" in a detection task, otherwise the observer responds "no" on that trial.

Figure 8 summarizes the model components in a flow-chart that demonstrates how the physical stimulus is represented by the visual system and how this representation is subsequently used to make a response. Note that not all models follow this theoretical construction; specific deviations from this structure will be noted below.
Figure 8. Summary of the model components shared by most linear systems theories. The physical stimulus is represented by changes in contrast over time, and when this stimulus is passed through a low-pass filter at the initial stages of the visual system, the stimulus representation becomes temporally blurred. This representation is then rectified, transduced and integrated. Different theories assume different forms of processing at this point; however, all use some aspect of the sensory response function to predict behavior.

A Linear-Filter Model of Character Identification Data

Most linear-filter models are applied to detection tasks, and simply predict whether an observer will detect or fail to detect a simple stimulus such as a disk or a grating. A more complex task such as character identification requires some additional assumptions about how information is acquired from the stimulus. I have been using such a model, originally proposed by Loftus (Loftus, Duncan & Gehrig, 1992; Loftus, Busey & Senders, 1993; Busey & Loftus, 1994), that has accounted for performance on a digit recall task under conditions of variable contrast (Loftus & Ruthruff, 1994), variable duration (Loftus, Busey and Senders, 1993), variable ISI (Busey & Loftus, 1994), and monoptic and dichoptic presentation (Busey & Loftus, in preparation). In addition this theory has been used to account for various aspects of temporal integration using completeness ratings and temporal integration performance (Loftus and Irwin, 1994). Below I describe the task that we have been using to refine this theory, and summarize the theory's components.

A Digit-Recall Task

The task that we have been using is complex enough to provide generalization to everyday tasks, but simple enough to provide an avenue for investigating the relationships between a number of different variables. Observers view a short-duration low-contrast display of four digits presented simultaneously on a computer screen, with the intent of remembering them and typing them into a keypad. The stimulus duration is typically varied, ranging from about 15 ms to around 200 ms, and the contrast of the digits is around 5%. Performance is computed as the proportion of the correctly-recalled digits for each stimulus duration, and corrected for the 10% guessing rate. When performance is graphed as a function of stimulus duration, a performance curve like that shown in the left panel of Figure 9 is typically observed.

Figure 9. Typical data observed in a digit-recall task. Left panel: performance as a function of stimulus duration. Right panel: transformed performance curve to evaluate deviations from linearity.

The empirically-determined relationship between stimulus duration and performance can be summarized with the following equation (Loftus, Duncan & Gehrig, 1992),

$p = \begin{cases} 0 & \text{for } d < L \\ Y\left[1 - e^{-(d - L)/c_r}\right] & \text{for } d \ge L \end{cases}$ Eq. 13

which is illustrated in the left panel of Figure 9. Here d is exposure duration, and Y, L and cr are free parameters: cr is the exponential growth constant for the regression model, L (for "liftoff") is the maximum duration that gives chance-level performance (that is, the duration at which performance "lifts off" from chance); and Y is asymptotic performance. Here we assume that Y is 1.0, because the four-digit stimuli in our task are easily within the span of short-term memory.

Given that performance curves can be adequately described by Equation 13, it is convenient to define a new dependent variable, P, as: P = -ln(1.0 - p/Y). With P as the performance measure, Equation 13 can be rewritten as,

$P = \begin{cases} 0 & \text{for } d < L \\ d/c_r - L/c_r & \text{for } d \ge L \end{cases}$ Eq. 14

Thus, in terms of P, post-liftoff performance is linear with duration with a slope of 1/cr and a d-intercept of L ms, as illustrated in the right panel of Figure 9.
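The two forms of the performance curve can be sketched as follows. The parameter values (a 30-ms liftoff and a 60-ms growth constant) are hypothetical and chosen only to make the arithmetic easy to follow; the function names are likewise assumptions.

```python
import math

def proportion_correct(d, liftoff, c_r, asymptote=1.0):
    """Eq. 13: zero up to the liftoff duration L, then exponential approach to Y."""
    if d < liftoff:
        return 0.0
    return asymptote * (1.0 - math.exp(-(d - liftoff) / c_r))

def transformed(p, asymptote=1.0):
    """P = -ln(1 - p / Y)."""
    return -math.log(1.0 - p / asymptote)

LIFTOFF, C_R = 30.0, 60.0   # hypothetical parameter values in ms

for d in (15.0, 30.0, 90.0, 150.0, 210.0):
    p = proportion_correct(d, LIFTOFF, C_R)
    print(d, round(p, 4), round(transformed(p), 4))
```

In the transformed column each additional 60 ms of exposure beyond liftoff adds 1.0 to P, which is the slope of 1/cr described above, while the untransformed proportions show the familiar negatively accelerated curve.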

In addition to evaluating deviations from linearity, the transformed dependent variable P provides a number of other advantages over the more standard proportion correct dependent variable. First, the logarithmic transformation makes probability summation predictions easier to visualize, because the probability summation equation becomes additive in P. Second, because the model is specific about the relation between stimulus variables such as contrast and duration and subsequent performance, we can make quantitative predictions. Thus it is meaningful from a measurement standpoint to talk about sizes of performance differences across different levels of an independent variable, and this transformation makes such evaluations simple. Finally, such a transformation will have to be made, either on the data as we have done, or on the theoretical predictions to make them correspond to proportion correct data. We have chosen to transform the data to make this transformation explicit, rather than a component of the model.
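The additivity under probability summation can be made explicit. The short derivation below assumes Y = 1 and two independent sources of information that yield proportions correct p_1 and p_2 when presented alone; the subscripted symbols are introduced here for illustration only.

```latex
% Assumes Y = 1 and independent sources 1 and 2.
\begin{align*}
p &= 1 - (1 - p_1)(1 - p_2) \\
P = -\ln(1 - p) &= -\ln(1 - p_1) - \ln(1 - p_2) = P_1 + P_2
\end{align*}
```

The product of failure probabilities in the first line becomes a sum after the logarithmic transformation, so a probability summation prediction appears as simple addition of the component P values.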

Eqs. 13 and 14 summarize the relationship between stimulus duration and performance, but this summary is empirical rather than theoretical. A more complete theory based on the linear systems approach is described below.

Theoretical Overview

The character identification theory has three components. The first is the initial representation of the stimulus, termed the sensory response function. The second component is a sensory threshold, such that further processing of the stimulus does not occur until the sensory response function exceeds this threshold. The third component is a function that describes the instantaneous rate of information acquisition, termed the acquisition rate function. Together these three components compose a mathematical model that provides quantitative predictions for the digit recall task. Each of these components is discussed in detail below.

Initial Stages of Processing: The Sensory Response Function

The character identification theory uses a linear-filter front end to generate the initial representation of a stimulus, called the sensory-response function. The characteristics of this linear-filter front end can be described by an impulse response function. The choice of candidate impulse response functions differs from model to model, and we have previously chosen an impulse response function to be a gamma function of the form,

$g(t) = \dfrac{(t/\tau)^{n-1}\, e^{-t/\tau}}{\tau\,(n-1)!}$ Eq. 15

where n and τ are free parameters: n is a positive integer, and τ is a positive real number. t represents time since stimulus onset. This function can be interpreted as representing the output of an n-stage system where the input to Stage 1 is the stimulus, the input to each of Stages 2 through n is the output of the previous stage, and the output of each stage decays exponentially with decay constant τ. The parameter n is usually set to 9 to correspond with physiological data, and τ ranges from 4-10 ms.

It is important to note that g(t) is always non-negative, because t, τ and n are always positive. This implies a monophasic impulse response function that has no temporal inhibition. Despite this restriction on the form of the impulse response function, this version of the theory has accurately predicted performance in over a dozen experiments.

The impulse-response function g(t) is used in conjunction with the stimulus contrast function f(t) to provide the sensory response function a(t), which for rectangular contrast functions becomes:

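$$ a(t) = \begin{cases} \phi\, G(t) & t \le d \\ \phi\,[G(t) - G(t-d)] & t > d \end{cases} \tag{16} $$

where φ is the contrast and d the duration of the rectangular stimulus, and G(x) is the integral of g(x) from 0 to x.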
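
As a rough illustration of Eqs. 15 and 16, the following Python sketch computes the gamma impulse response, its running integral G, and the sensory response to a rectangular pulse. It assumes the standard n-stage gamma form for Eq. 15, uses the closed-form integral that is available when n is an integer, and treats time in seconds. The function names and the default values (n = 9, tau = 6 ms) are illustrative choices, not values fitted in this work.

```python
import math

def gamma_irf(t, n=9, tau=0.006):
    """Eq. 15 (assumed gamma form): n-stage impulse response, t and tau in seconds."""
    if t < 0:
        return 0.0
    x = t / tau
    return x ** (n - 1) * math.exp(-x) / (tau * math.factorial(n - 1))

def gamma_cdf(t, n=9, tau=0.006):
    """G(t): integral of g from 0 to t, in closed form for integer n."""
    if t <= 0:
        return 0.0
    x = t / tau
    partial = sum(x ** k / math.factorial(k) for k in range(n))
    return 1.0 - math.exp(-x) * partial

def sensory_response(t, contrast, duration, n=9, tau=0.006):
    """Eq. 16: response a(t) to a rectangular pulse of given contrast and duration."""
    if t <= duration:
        return contrast * gamma_cdf(t, n, tau)
    return contrast * (gamma_cdf(t, n, tau) - gamma_cdf(t - duration, n, tau))

if __name__ == "__main__":
    # Sample the response to a hypothetical 40 ms pulse of contrast 0.2.
    for ms in (0, 20, 40, 60, 80, 120):
        print(ms, round(sensory_response(ms / 1000.0, 0.2, 0.04), 4))
```

Because g(t) is non-negative, G(t) never decreases, so the response computed this way never drops below zero. This is the monophasic, no-inhibition property discussed in the text.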

Monophasic and Biphasic Impulse Response Functions and Temporal Inhibition

The focus of this dissertation is the temporal inhibition that occurs in character identification tasks. This temporal inhibition is modeled in linear filter theories by an impulse response function that dips below zero. Such an impulse response function is biphasic or multiphasic, as opposed to the monophasic impulse response function used previously to model character identification tasks (Loftus, Busey & Senders, 1993; Busey & Loftus, 1994). Figure 10 shows examples of monophasic and biphasic impulse response functions.

Figure 10. Impulse response functions. Left panel: a monophasic impulse response function generated by Eq. 15. Right panel: a biphasic impulse response function given by Watson's working model, as described in a later section.

Information Loss: The Sensory Threshold

The second component of the theory is a sensory threshold Θ that is assumed to exist in the visual system, such that further perceptual processing of the stimulus representation a(t) does not occur unless a(t) > Θ. An effective sensory-response function may thus be defined as,

$$ a_\Theta(t) = \begin{cases} a(t) - \Theta & a(t) > \Theta \\ 0 & a(t) \le \Theta \end{cases} \tag{17} $$

where Θ represents the sensory threshold in units of stimulus contrast.

As defined, the sensory threshold Θ represents an information loss and a non-linearity in the theory. This threshold is conceptually quite different from a detection threshold, which is a statistical concept representing a certain performance level. Θ is also distinct from the decision thresholds used in linear-filter models of detection, which assume that the stimulus is detected if some representation within the model exceeds the detection threshold.

Acquired Information: The Information Extraction Rate

Once the sensory response function a(t) exceeds the sensory threshold Θ, information is extracted at some acquisition rate that is termed r(t). This function represents the instantaneous rate of information acquisition, and is proportional to a) the above-threshold sensory response and b) the proportion of remaining to-be-acquired stimulus information. The first part of this expression, the above-threshold sensory response, is given by Eq. 17. For the second half of the expression, define I(t) as the proportion of information acquired by time t. The proportion of remaining stimulus information is simply [1.0 - I(t)]. The acquisition rate function r(t), which is by definition the derivative of I(t) with respect to time, is defined as,

$$ r(t) = \frac{dI(t)}{dt} = \frac{a_\Theta(t)\,[1.0 - I(t)]}{c_s} \tag{18} $$

where c_s represents the constant of proportionality and completes the equality. The model parameter 1/c_s also represents the rate at which new (i.e., previously unsampled) features are sampled, although the overall rate of feature acquisition decreases over time as I(t) rises (since 1.0 - I(t) decreases).

Busey and Loftus (1994) demonstrated that, with this rate function, the equation relating total acquired information, which we designate I(∞), to the above-threshold area under a_Θ(t), A_Θ(∞), becomes,

$$ I(\infty) = 1.0 - e^{-A_\Theta(\infty)/c_s}. \tag{19} $$

Figure 11 summarizes the three major components of the model: the stimulus input wave form, f(t), the resulting sensory-response function, a(t) and the information-acquisition rate function, r(t).

Performance Predictions

In order to make quantitative predictions the theory requires one additional assumption. Total acquired information I(∞) is assumed to equal p, the proportion of correctly recalled digits. Thus,

$$ p = 1.0 - e^{-A_\Theta(\infty)/c_s} \tag{20} $$

Note that, in terms of P = -ln(1.0 - p),

$$ P = A_\Theta(\infty)/c_s \tag{21} $$

Equation 21 summarizes an important prediction: that performance is directly proportional to the above-threshold area under the a(t) function. This relationship is a consequence of Eqs. 15, 16, 18 and 20, and is not an assumption of the theory.
Figure 11. Theoretical components of the linear filter model of character identification.
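
The chain from stimulus to predicted performance can be sketched numerically. The Python below assumes the gamma form of the impulse response with integer n, approximates the above-threshold area with a simple rectangle-rule sum, and then applies the exponential relation between area and proportion correct. All names and parameter values (theta, c_s, the step size) are hypothetical and chosen only for illustration.

```python
import math

def gamma_cdf(t, n=9, tau=0.006):
    """Integral of the gamma impulse response from 0 to t (integer n)."""
    if t <= 0:
        return 0.0
    x = t / tau
    return 1.0 - math.exp(-x) * sum(x ** k / math.factorial(k) for k in range(n))

def sensory_response(t, contrast, duration, n=9, tau=0.006):
    """a(t) for a rectangular pulse; gamma_cdf is 0 for t <= duration."""
    return contrast * (gamma_cdf(t, n, tau) - gamma_cdf(t - duration, n, tau))

def above_threshold_area(contrast, duration, theta, dt=1e-4, t_max=0.5):
    """Area under max(a(t) - theta, 0), by a rectangle-rule sum."""
    area = 0.0
    for i in range(int(round(t_max / dt))):
        a = sensory_response(i * dt, contrast, duration)
        area += max(a - theta, 0.0) * dt
    return area

def predicted_performance(contrast, duration, theta, c_s):
    """Return (p, P): proportion correct and its transform P = area / c_s."""
    big_p = above_threshold_area(contrast, duration, theta) / c_s
    p = 1.0 - math.exp(-big_p)
    return p, big_p

if __name__ == "__main__":
    # Hypothetical values: contrast 0.2, threshold 0.05, c_s = 0.005.
    for d in (0.02, 0.04, 0.08):
        print(d, predicted_performance(0.2, d, theta=0.05, c_s=0.005))
```

With the threshold set to zero the area reduces to contrast times duration, because the impulse response integrates to one; raising the threshold removes area and therefore lowers predicted performance.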

Summary

The linear-filter model of character identification has been used to account for performance in a number of different paradigms and under a variety of conditions. For the present work, an important feature of this theory is the monophasic impulse-response function given by Eq. 15, which implies no temporal inhibition. Other researchers have chosen different candidate impulse response functions, and the next section reviews these models in order to compare the different choices of impulse-response functions.

Modeling Inhibition: Linear-Systems Theories of Detection Thresholds

Researchers have proposed a variety of models to account for the temporal inhibition observed in two-pulse pattern detection experiments. This section summarizes these models, with particular attention paid to the nature of the definition of the impulse response function. In each case this definition determines if and how much temporal inhibition is predicted by each model.

The Linear-Filter Model of Sperling and Sondhi

Sperling and Sondhi (1968) describe a general model of visual temporal processing that predicts many of the characteristics summarized in Table 1. As described below, this model will account for the inhibitory response seen at a 40-70 ms delay interval, as well as the changes in inhibitory effects with the level of adapting field illumination. The spatial effects described in Table 1 are outside the scope of Sperling and Sondhi's model. However, this model does provide a candidate impulse-response function.

The Sperling and Sondhi model consists of three types of stages, which have resistor-capacitor (RC) units as their basic units. These RC units have time-constants that determine the nature of their response to a given input, and these time-constants are parametrically controlled in some instances. This parametric control of time-constants allows the behavior of the system to change with the adaptation state, as seen in Table 1. The mathematical development of the model is somewhat complex, and thus an intuitive explanation of the different types of filters and the overall behavior of the model is given below. A more rigorous mathematical development will be given for more current models of the temporal response properties of the visual system.

The first type of stage in the model is a low-pass linear filter, LP, that controls the temporal resolution of the system. The result is a temporal blurring of the visual input, such that a fast-changing stimulus may be seen as a stable image. These filters are attributed to the synapses or long dendritic paths in the neurophysiology of the visual system.

The second type of filter is a parametric-feedback filter, FB, that acts to compress the dynamic range of the input. These reduce the overall time-constant of the system when the adaptation state is light-adapted, by controlling the time-constant of the low-pass filter LP. Thus when mean luminance increases, the time-constant of LP is reduced, causing the system to integrate over shorter time intervals. These filters account for the finding in Table 1 that the temporal delay at which inhibition becomes evident depends upon the adaptation level. These filters are thought to exist at or near the level of the receptors in the bipolar level, and the controlling signal is passed through the horizontal cells.

The third type of filter is a delayed feed forward filter, FF. This filter converts the system to a Weber-law system. This allows the model to become sensitive to the ratio of a change relative to a steady background, rather than to the absolute magnitude of the change in luminance units. The feed forward delay represents an output that has been compared to the time-average of recent inputs, and thus makes the system a change detector. Sperling and Sondhi implicate the bipolar-ganglion cell interface in this filter, with the control signal going through the amacrine cells.

The final component of this model is a binary yes-no detector that responds 'yes' whenever the signal exceeds some response threshold ±ε. This response threshold is assumed to have no frequency dependence or statistical uncertainty.
Figure 12. Normalized impulse-response functions for various background adaptation levels for the Sperling and Sondhi model. Background luminances range from 2×10^-5 to 2×10^5 in 1-log increments. The response is monophasic for 2×10^-5 cd/m2, becomes slightly biphasic at around 20 cd/m2, and has an increasingly larger inhibitory lobe for higher background luminances. Adapted from Di Lollo and Bischof, submitted.

Together these three types of filters account for many of the observations of Table 1, with the exception of the spatial dependence of the inhibitory effects. Sperling and Sondhi computed the impulse-response functions given by the model for various background adaptation states, which are reproduced in Figure 12.

In the impulse response functions of Figure 12 we see that as adaptation increases, the level of inhibition increases and the time since stimulus onset at which inhibition begins decreases. Therefore this model correctly predicts that as adaptation luminance increases, the delay interval at which inhibition is detected in two-pulse paradigms will decrease. The delayed feed-forward filters of the model also ensure that the thresholds for ++ and -- pulses will be the same, as will the +- and -+ pulse thresholds.

Sperling and Sondhi did not apply their model specifically to two-pulse stimuli, although they did model flicker-detection tasks. The model predicts no inhibition for background luminance levels below 30 cd/m2, since temporal sensitivity was identical for low and moderate temporal frequencies. However, as background luminance increased, the amplitude response of the observers to lower temporal frequencies dropped off, implying a band-pass rather than a low-pass filter. Such a band-pass filter necessarily has an impulse response function that dips below zero, and thus contains an inhibitory lobe. Figure 13 shows the frequency response functions and impulse response function for low-pass and band-pass filters.
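
The link between band-pass behavior and a negative lobe follows from a short argument. As a sketch, write H(ω) for the Fourier transform of an impulse response h(t); this notation is introduced here for illustration and is not taken from Sperling and Sondhi. For any h(t),

$$
|H(\omega)| = \left| \int_0^\infty h(t)\, e^{-i\omega t}\, dt \right| \le \int_0^\infty |h(t)|\, dt .
$$

If h(t) is non-negative everywhere, the right-hand side equals the integral of h(t) itself, which is H(0), the gain at zero temporal frequency. A non-negative impulse response therefore cannot pass any frequency with more gain than it passes a steady input. A band-pass filter has, by definition, some frequency ω_0 > 0 with |H(ω_0)| > H(0), so its impulse response must take negative values somewhere.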

To the degree to which flicker-detection tasks and two-pulse detection tasks depend on similar mechanisms, the Sperling and Sondhi model predicts the inhibitory responses to two-pulse stimuli which are summarized in Table 1. Qualitatively this model accounts for inhibitory processes via the negative-going impulse response functions and from this, inhibition at interpulse delay intervals of 40-70 ms can be inferred.

Watson and Nachmias: Ellipse Models of Sinusoidal Gratings

Watson and Nachmias (1977) adopted Rashbass' (1970) ellipse model of two-pulse detection (see Eq. 8), and added a candidate impulse-response function. They also assumed that the transduction parameter varies from observer to observer, and was not simply set to 2 as in Rashbass' model.

Linear-Filter Front End

Figure 13. Response properties of various linear filters. Upper-left panel: Temporal frequency response function for a low-pass linear filter, which describes the temporal sensitivity of the observer at background levels of less than 30 cd/m2. Upper-right panel: Impulse response function for a low-pass linear filter. Lower-left panel: Temporal frequency response function for a band-pass linear filter, which describes the temporal sensitivity of the observer at background levels of greater than 30 cd/m2. Lower-right panel: Impulse response function for a band-pass linear filter. As inhibition is introduced to the impulse response function, the temporal sensitivity falls off at lower temporal frequencies. Adapted from Di Lollo and Bischof, submitted.

The initial stages of the Watson and Nachmias model consist of a low-pass linear filter with an impulse response function that contains a large inhibitory lobe. To construct this impulse response function, they construct the impulse response of an n-stage low-pass filter and add it to a delayed and inverted replica of itself. This gives an impulse response function of the form,

$$ h(t) = \begin{cases} 0 & t \le 0 \\ t^{n-1} e^{-t/\tau} & 0 < t \le s \\ t^{n-1} e^{-t/\tau} - (t-s)^{n-1} e^{-(t-s)/\tau} & s < t \end{cases} \tag{22} $$

where s and τ are free parameters chosen to give the model predictions some resemblance to empirical temporal inhibition findings. Figure 14 shows an example impulse response function for this filter.
Figure 14. Hypothetical impulse response function h(t) from Eq. 22. Adapted from Watson and Nachmias (1977).
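
The "delayed and inverted replica" construction is easy to express in code. The Python sketch below builds the impulse response as an unnormalized n-stage low-pass response minus a copy of itself delayed by s. The parameter values (n = 5, tau = 5 ms, s = 30 ms) are hypothetical and are not those used by Watson and Nachmias.

```python
import math

def lowpass(t, n, tau):
    """Unnormalized n-stage low-pass impulse response; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return t ** (n - 1) * math.exp(-t / tau)

def wn_irf(t, n=5, tau=0.005, s=0.03):
    """Low-pass response plus a delayed (by s) and inverted copy of itself."""
    # For t <= s the delayed copy is still zero, so only the first term remains.
    return lowpass(t, n, tau) - lowpass(t - s, n, tau)

if __name__ == "__main__":
    for ms in range(0, 101, 10):
        print(ms, wn_irf(ms / 1000.0))
```

One consequence of this construction is that the excitatory and inhibitory lobes have equal area, so the filter gives no net response to a steady input.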

Probability Summation in Time

Watson and Nachmias extended Rashbass' model beyond quadratic transduction. They adopt Watson's Probability Summation in Time model (1978) which assumes that at each point in time the stimulus representation has some finite probability of exceeding some internal threshold and being detected. At threshold the following relationship holds:

$$ \int_a^b \left| h(t) * f(t) \right|^{\beta} dt = 1 \tag{23} $$

where the integration limits a and b are chosen to span a sufficiently long interval, h(t) is from Eq. 22, f(t) is the contrast function describing both pulses, and * represents convolution. β is a parameter describing the slope of the psychometric function, and typically has a value between 3 and 6.

This model predicts the elliptical relationship observed in Rashbass' original data (see Figure 5), but will also predict thresholds for long-duration stimuli. For instance, the model correctly predicts that when the delay between the two pulses is long, the threshold contour will be a square with rounded corners, not an ellipse (Watson & Nachmias, 1977).
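
A numerical sketch can make the probability summation computation concrete. The Python below assumes the usual threshold criterion, namely that the integral of |h * f| raised to the power β equals 1 at threshold, and uses a delayed-and-inverted low-pass impulse response with hypothetical parameters. Because the response scales linearly with contrast, the threshold contrast is the pooled unit-contrast response raised to the power -1/β. The impulse response is unnormalized, so the absolute threshold values are in arbitrary units and only comparisons between conditions are meaningful.

```python
import math

def lowpass(t, n, tau):
    return t ** (n - 1) * math.exp(-t / tau) if t > 0 else 0.0

def wn_irf(t, n=5, tau=0.005, s=0.03):
    """Biphasic impulse response: low-pass minus a copy delayed by s."""
    return lowpass(t, n, tau) - lowpass(t - s, n, tau)

def two_pulse(signs, delay, duration=0.005, dt=0.001, t_max=0.3):
    """Unit-contrast f(t): two rectangular pulses with the given signs."""
    f = [0.0] * int(round(t_max / dt))
    width = int(round(duration / dt))
    for sign, onset in zip(signs, (0.0, delay)):
        start = int(round(onset / dt))
        for i in range(start, start + width):
            f[i] += sign
    return f

def pst_threshold(signs, delay, beta=4.0, dt=0.001):
    """Contrast at which the pooled response reaches 1 (arbitrary units)."""
    f = two_pulse(signs, delay, dt=dt)
    h = [wn_irf(i * dt) for i in range(len(f))]
    pooled = 0.0
    for i in range(len(f)):
        # Discrete convolution of stimulus and impulse response at sample i.
        r = sum(f[j] * h[i - j] for j in range(i + 1) if f[j] != 0.0) * dt
        pooled += abs(r) ** beta * dt
    if pooled == 0.0:
        return math.inf  # stimulus cancels completely; never detected
    return pooled ** (-1.0 / beta)

if __name__ == "__main__":
    for delay in (0.0, 0.03, 0.06, 0.15):
        print(delay, pst_threshold((1, 1), delay), pst_threshold((1, -1), delay))
```

With these illustrative parameters the opposite-sign pair has the lower threshold at a 30 ms delay, which is the signature of temporal inhibition, while at long delays the two pair types converge because the responses no longer overlap.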

Watson's Working Model

Recently, Watson (1986) has amended his original conceptualization of the impulse response function to make it more flexible in terms of the amount of temporal inhibition. He defines the impulse response function as the difference between two gamma functions, each of which has different temporal parameters n and τ,

$$ h(t) = \xi\,[h_1(t) - \zeta\, h_2(t)] \tag{24} $$

Here h_1(t) and h_2(t) are both monophasic gamma functions with different τ parameters as defined by Eq. 15. The ζ parameter is the "transience factor" that determines the amplitude of the negative lobe of the overall impulse response function. The ξ parameter is a sensitivity factor or gain factor that scales the impulse response in amplitude. The time constant for h_1, τ_1, is always shorter than τ_2, which is the time constant for h_2. This provides an impulse response function that is initially excitatory and then inhibitory.

Adding the transience factor ζ allows the band-pass characteristics of the model to change under different stimulus characteristics. Thus a larger model could be built around Eq. 24 that parametrically controlled ζ according to the stimulus size or adaptation level.
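
The role of the transience factor can be seen in a small Python sketch of the difference-of-gammas form. It assumes each component is a normalized gamma function that integrates to one; the parameter values below are illustrative defaults chosen for this example, not fitted values.

```python
import math

def gamma_irf(t, n, tau):
    """Normalized n-stage gamma impulse response; zero before onset."""
    if t < 0:
        return 0.0
    x = t / tau
    return x ** (n - 1) * math.exp(-x) / (tau * math.factorial(n - 1))

def working_model_irf(t, xi=1.0, zeta=0.9, n1=9, tau1=0.005, n2=10, tau2=0.007):
    """Gain xi times (fast gamma minus zeta times a slower gamma)."""
    return xi * (gamma_irf(t, n1, tau1) - zeta * gamma_irf(t, n2, tau2))

if __name__ == "__main__":
    for zeta in (0.0, 0.5, 0.9):
        samples = [working_model_irf(ms / 1000.0, zeta=zeta) for ms in range(0, 201, 20)]
        print(zeta, [round(v, 2) for v in samples])
```

Under the normalization assumed here, the total area of the impulse response is the gain times (1 - transience factor). A transience factor of 0 gives a purely monophasic, low-pass response, and values near 1 give a strongly biphasic response with almost no response to steady input.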

Uchikawa & Ikeda: Chromatic Double Pulses

Uchikawa and Ikeda (1986) adopted Watson's probability summation in time to model equal-luminance chromatic double pulses. For red and green stimuli, the background was set to a unique yellow of λ = 571 nm. Red and green stimuli were produced via deviations from this background; red stimuli were created with either 20 or 15 nm deviations, and green stimuli were created with -35 or -30 nm deviations. For yellow and blue stimuli, the background was set to a unique green of 518 nm. Blue stimuli were produced via deviations of -35 and -33 nm from this reference wavelength, and yellow stimuli were chosen that deviated 75 and 65 nm from the reference.

Surprisingly no temporal inhibition was detected, although clear evidence for temporal inhibition was found for stimuli that consisted of changes in luminance. Thus it appears that the visual channels that are responsive to only chromatic changes do not show temporal inhibition, and may be characterized by a monophasic impulse response function like that given by Eq. 15. This result is consistent with reports from the spatial domain that the chromatic channels are low-pass (e.g. Mullen, 1985).

Roufs and Blommaert: Signal Perturbation Technique

Characterizing the exact form of the impulse response function has been recognized as an important question by researchers working in temporal summation, and Roufs and Blommaert (1981) suggested a potentially promising technique. They adopt the linear filter approach and assume that a pulse is only detected if the sensory response function exceeds some threshold. Once a threshold contrast has been determined, therefore, the peak of the sensory response function will just reach the observer's internal threshold and the stimulus will be detected.

Methodology

Roufs and Blommaert then set out to perturb the height of the sensory response function, using a test pulse presented at various delays from a probe pulse. The contrast of this test pulse was always less than 30% of the probe pulse's contrast, and both pulses were 2 ms in duration. When two pulses are presented, the responses to each pulse sum, and the stimulus is detected if the resulting sensory response function exceeds the observer's internal detection threshold.

Figure 15 explains the logic of this paradigm. If the test and probe pulses are presented at the same time, then the positive lobe of the test pulse will sum with the positive lobe of the probe pulse, driving the overall curve higher. This results in a lower contrast threshold for the probe, because less probe contrast is required to achieve the same overall sensory response function height.
Figure 15. Perturbation of the probe response by the test pulse response. The peak of the probe response will increase if the test and probe are presented at the same time (test delay = 0), decrease if the test precedes the probe (test delay = -50), and remain unchanged if the test follows the probe (test delay = 50). As the height of the probe response changes, the amount of probe contrast required to achieve a certain detection threshold will also change, and thus the degree of summation between the two pulses can be inferred from the probe contrast threshold data. Adapted from Watson (1982).

If the test pulse precedes the probe by 50 ms, then the negative lobe of the test will subtract from the positive lobe of the probe, resulting in a lower overall sensory response function and a subsequent higher contrast threshold for the probe pulse to bring the sensory response function back up to the height of the detection threshold.

If the test pulse follows the probe pulse by 50 ms, then no change in the contrast threshold would be anticipated, because the test pulse begins after the probe response has already exceeded threshold. In the Roufs and Blommaert conceptualization, the only important component of the model is the height of the sensory response function (which is the sum of the probe and test responses) at its highest point.
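
The three cases above can be reproduced with a short peak-detection sketch in Python. It assumes a biphasic impulse response built as the difference of two gamma functions with hypothetical parameters, treats both pulses as brief enough that each response is a scaled copy of the impulse response, and sets the probe contrast to 1 and the test contrast to 0.3. Negative delays mean that the test precedes the probe.

```python
import math

def gamma_irf(t, n, tau):
    """Monophasic n-stage gamma impulse response; zero before onset."""
    if t < 0:
        return 0.0
    x = t / tau
    return x ** (n - 1) * math.exp(-x) / (tau * math.factorial(n - 1))

def biphasic_irf(t):
    """Hypothetical biphasic response: a fast gamma minus a slower one."""
    return gamma_irf(t, 9, 0.005) - 0.9 * gamma_irf(t, 10, 0.007)

def peak_response(test_delay, test_contrast=0.3, dt=0.001):
    """Peak of the summed probe (onset 0) and test (onset test_delay) responses."""
    times = [i * dt for i in range(-100, 300)]
    return max(biphasic_irf(t) + test_contrast * biphasic_irf(t - test_delay)
               for t in times)

probe_alone = peak_response(0.0, test_contrast=0.0)

if __name__ == "__main__":
    for delay in (-0.05, 0.0, 0.05):
        print(delay, round(peak_response(delay) / probe_alone, 3))
```

Under a pure peak-detection rule the probe threshold moves inversely with this peak: a higher summed peak means less probe contrast is needed, and a lower peak means more is needed.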

From this technique, Roufs and Blommaert derive a rather unusual impulse response function. This impulse response function is triphasic, with an initial inhibitory lobe, a larger excitatory lobe, and then a smaller inhibitory lobe. This initial inhibitory lobe is contrary to much of the literature. Watson (1982), in a commentary on the Roufs and Blommaert methods, suggests that this triphasic impulse response is artifactual, and that the results could be explained by a more conventional biphasic impulse response function if one assumed probability summation.

Watson's Objections

The difference between Roufs and Blommaert's model and Watson's probability summation model is that the former model assumes that only the peak of the sensory response function is important. Alternatively, Watson's model assumes that any sensory response function that deviates from zero gives the observer some probability of detecting the signal. For example, in Figure 15 a test delay of 50 ms will not affect the height of the probe pulse response, but it does affect the overall shape of the sensory response function. Thus under a probability summation assumption, a test delay of 50 ms will affect the overall detectability of the stimulus, even though its effects occur at the lower portions of the probe pulse's response.

Watson concludes that, given the assumption of probability summation, a conventional biphasic impulse response function can account for the Roufs and Blommaert findings.

den Brinker: Fourth-Order Impulse Response

Despite Watson's objections to the peak detection conceptualization of Roufs and Blommaert's model, den Brinker (1989) and others continue to use this methodology. Based on similar techniques, den Brinker derived a fourth-order impulse response function, shown in Figure 16. This response has the same initial inhibitory lobe reported by Roufs and Blommaert.
Figure 16. Impulse response functions for two background adaptation levels, estimated using Roufs and Blommaert's signal perturbation technique. Adapted from den Brinker (1989).

Bowen: Peak Detection and Ratios of Peaks in the Number of Flashes Detected

The impulse response functions derived by Roufs and Blommaert as well as den Brinker are unusual because they contain an initial negative lobe, and are multiphasic. However, they are not the only researchers to propose an impulse response function with more than two lobes. Bowen (1989) used a two-pulse paradigm with very high contrast pulses, and asked naive observers to report the number of flashes they perceived. Often two pulses are perceived as three flashes.

Based on an analysis of the probability of reporting one, two or three flashes, Bowen derived a model that included a multiphasic impulse response function consisting of a damped sinusoid of the form,

$$ g(t) = \sin(20\pi t)\, e^{-t/0.1} \tag{25} $$

This model also assumes that the probability of detecting one, two or three flashes depends upon the ratio of the peaks of each lobe. If two pulses are presented at a short delay, and each engenders a multiphasic impulse response function, then three flashes will be reported if the ratio of the second peak to the first peak is above some criterion value and the ratio of the third peak to the first peak is also above the same criterion value. Thus multiple flashes are perceived to the degree to which the heights of subsequent peaks in the sensory response function approach the height of the first peak.
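
The peak-ratio rule can be illustrated with a small Python sketch. It assumes that t in the damped sinusoid is measured in seconds, finds the positive peaks of a response by sampling, and counts peaks from the first onward until one falls below the criterion ratio. The counting rule for more than three peaks and the criterion values are assumptions made for this example; the text specifies the rule only up to three flashes.

```python
import math

def bowen_irf(t):
    """Damped sinusoid sin(20*pi*t) * exp(-t/0.1), with t assumed in seconds."""
    return math.sin(20 * math.pi * t) * math.exp(-t / 0.1) if t >= 0 else 0.0

def positive_peaks(response, dt=0.0005, t_max=0.5):
    """Return (time, height) for each sampled local maximum above zero."""
    samples = [response(i * dt) for i in range(int(round(t_max / dt)) + 1)]
    peaks = []
    for i in range(1, len(samples) - 1):
        if samples[i] > 0 and samples[i - 1] < samples[i] >= samples[i + 1]:
            peaks.append((i * dt, samples[i]))
    return peaks

def reported_flashes(peaks, criterion):
    """Count peaks, in order, whose ratio to the first peak meets the criterion."""
    first = peaks[0][1]
    count = 1
    for _, height in peaks[1:]:
        if height / first < criterion:
            break
        count += 1
    return count

if __name__ == "__main__":
    single = positive_peaks(bowen_irf)
    print([round(h / single[0][1], 3) for _, h in single])
    # Two pulses 30 ms apart (hypothetical delay): responses simply add.
    pair = positive_peaks(lambda t: bowen_irf(t) + bowen_irf(t - 0.03))
    print(reported_flashes(pair, criterion=0.3))
```

For a single pulse, successive positive peaks of this function are one period (100 ms) apart, so each is smaller than the previous one by a factor of exp(-1).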

Ohtani and Ejima: Sinusoidal Gratings

Perhaps the most complete investigation of the spatial and temporal properties of the inhibitory mechanisms in temporal summation is a study by Ohtani and Ejima (1988). They used sinusoidal gratings in a two-pulse paradigm, and chose gratings that were either in-phase or out of phase for the two pulses. The gratings were 6.9 ms in duration, and separated by a variable stimulus onset asynchrony (SOA). Subjects performed a simple detection task on each trial, and the reciprocal of the detection threshold contrast of the two pulses was taken as a measure of sensitivity to each pulse pair.

Contrast thresholds were measured at four different spatial frequencies, at four different background luminance levels, and for a range of stimulus onset asynchronies. Figure 17 shows the measured sensitivity values for one observer.
Figure 17. Two-pulse sensitivity as a function of SOA for two phase conditions, for different spatial frequencies. The curve parameter value is the mean retinal illuminance in trolands. Solid and open circles represent the data of in-phase and out-of-phase pairs, respectively. The 1100 td conditions are on a true scale, while all other conditions have been displaced upward by 1 log unit to prevent overlap. The solid lines are predictions of a linear-filter model of two-pulse sensitivity, as discussed in the text. Adapted from Ohtani and Ejima (1988).

The in-phase and out-of-phase conditions allow a straightforward investigation of the temporal summation processes. Temporal inhibition is unequivocally indicated if the sensitivity to an out-of-phase stimulus is higher than that of the in-phase stimulus at the same SOA. The observer in Figure 17 clearly shows this effect for low spatial frequencies (0.75 cycles-per-degree) for the three highest background illuminances, for SOA's ranging from 25-100 ms. Temporal inhibition improves out-of-phase sensitivity at these SOA's because the negative (inhibitory) lobe from the first pulse is combining with the positive lobe from the second pulse to improve overall sensitivity. The logic of this improvement is shown in Figure 18.
Figure 18. Response functions engendered by the positive-contrast pulse (top panel), the negative-contrast pulse at a 50 ms delay (middle panel) and the combined sensory response to both pulses presented together (bottom panel). The negative lobe of the first pulse sums with the negative lobe of the second pulse, giving the overall stimulus response function more area. This area makes this stimulus more detectable, and thus it will have a lower contrast threshold according to the Ohtani and Ejima model. A positive-positive pulse pair will be less detectable, because the negative lobe of the first pulse will sum with the positive lobe of the second pulse, giving the overall function less area.

The model adopted by Ohtani and Ejima is based on Watson's probability summation in time model, and includes a rectification and a transduction function. The specific nature of the impulse response that they adopt is similar to Watson's working model, in that it has an overall shape that is a function of excitatory and inhibitory components, each with its own time-constant. They are able to derive estimates of the parameters that govern the facilitatory and inhibitory components of the impulse response function for each set of conditions from Figure 17. From these parameter fits they conclude that increases in retinal illuminance make the visual response faster, irrespective of spatial frequency. However, the development of the inhibitory response with retinal illuminance depends on the grating's spatial frequency.

Summary of Two-Pulse Sensitivity Models

Table 2 summarizes the characteristics of the various models. In general, the two-pulse paradigm seems most effective for determining the nature of temporal inhibition, and experiments that use both positive- and negative-contrast stimuli give the clearest evidence for temporal inhibition. Despite claims to the contrary, no paradigm yet developed has successfully measured or derived the exact nature of the impulse response function. Thus researchers have proposed a number of different candidate impulse response functions that have one, two, three or even more relative peaks. However, no one candidate function has accounted for performance in all tasks.
Table 2. Characteristics of Models of Two-Pulse Performance

| Authors | Stimuli | Model Specifics | Nature of the Impulse Response Function | Predicts Inhibition | Conditions |
| --- | --- | --- | --- | --- | --- |
| Ikeda (1965) | 30' disks | Derived a summation index, from which the temporal response functions for positive and negative stimuli may be inferred. | unknown, but has inhibitory lobe | yes | ++, +-, -+, -- |
| Rashbass (1970) | 17° disks | Assumed a low-pass linear filter whose output is squared and integrated. Predicts elliptical relationship for contrast thresholds. | unspecified | yes | ++, +-, -+, -- |
| Sperling & Sondhi (1968) | Gratings | The nature of the temporal inhibition changes with background luminance level. | biphasic, varies with background level | yes | |
| Watson & Nachmias (1977) | Gratings | Extends Rashbass model to include probability summation in time (PST). | biphasic | yes | ++, +-, -+, -- |
| Uchikawa & Ikeda (1986) | Iso-luminant disks | Adopts Watson's PST model to predict iso-luminant thresholds. No inhibition found in iso-luminant stimuli. | monophasic | no | ++ |
| Roufs & Blommaert (1981) | 1° foveal disk | Derives form of impulse response function by looking at interaction of two responses at peak height. Criticized by Watson for ignoring effects of probability summation in time. | triphasic, with an initial inhibitory lobe | yes | ++ |
| den Brinker (1989) | 1° disk | Adopts Roufs & Blommaert model. | fourth-order multiphasic with an initial inhibitory lobe | yes | ++ |
| Bowen (1989) | high-contrast 25' disk | Looked at ratio of peaks in sensory response function to predict the number of perceived flashes. | multiphasic, damped sinusoid | yes | ++ |
| Ohtani & Ejima (1988) | Gratings | Adopt Watson's PST model. Infer the characteristics of the low-pass filter under different stimulus characteristics. | biphasic | yes | ++, +-, -+, -- |
| Busey & Loftus (1994) | Digits | Assumes a threshold such that information is extracted at some rate once the sensory response function exceeds this threshold. | monophasic | no | ++ |

Notes: Adapted from the original source materials.

Empirical Evidence for Inhibition in Character Detection and

Identification Tasks

The picture that emerges from the two-pulse detection literature is fairly clear: The temporal inhibitory mechanisms depend on the adaptation level, as well as on stimulus size. Stimuli that are shown at adaptation levels higher than about 5 cd/m2 and are larger than about 13 arc min show evidence of inhibitory mechanisms. Delay intervals between the two pulses of about 40-70 ms provide the strongest evidence for temporal inhibition. The two-pulse paradigm with both positive- and negative-contrast pulses is perhaps the best paradigm for demonstrating evidence for temporal inhibition.

These detection tasks differ in a number of ways from the more cognitive character identification tasks that are the focus of the current work. First, stimuli presented in detection tasks are by definition near threshold, while stimuli in character identification tasks are presented at supra-threshold contrasts. Second, characters represent complex stimuli that span a range of spatial frequencies. Third, characters are common, well-learned visual objects. Although the initial visual processing of disks and characters might be the same, the information extraction components of the character identification processes may provide distinct differences between detection and identification tasks.

A final reason that we might expect differences in temporal inhibition with characters was provided in a recent Nature article. Solomon & Pelli (1994) used backward sine-wave grating masks to measure the filter of the mechanism that was responsible for character identification. Characters contain a broad spectrum of spatial frequency information, and presumably if one band of frequencies were masked we could attend to other bands to make a character identification. This is certainly the case in audition, because observers can perform off-band listening to avoid the putative effects of noise masking at a certain frequency band (Patterson & Nimmo-Smith, 1980).

However, this assumption turns out to be incorrect: when characters are masked by a three-cycle-per-character grating, sensitivity suffers. Thus it appears that character identification is mediated primarily by a three-cycle-per-character visual spatial filter. Information outside this band is either unavailable or irrelevant for character identification.

These three pieces of evidence suggest that character identification differs in several ways from other forms of pattern recognition. These differences in the spatial domain suggest that we might also expect differences in the temporal domain. Thus it is not clear a priori that character identification tasks will show the same temporal inhibition that was produced by threshold detection data.

The goal of the two experiments discussed below was to test this question: Do character detection and identification tasks show evidence of temporal inhibition that is consistent with the two-pulse detection literature? If so, can the character identification model of Loftus and Busey (Loftus, Busey & Senders, 1993; Busey & Loftus, 1994) be extended to account for this temporal inhibition? Experiment 1 is a threshold detection task that uses digits to verify that temporal inhibition exists for the detection of characters presented at threshold contrasts. Experiment 2 is a supra-threshold character identification task that examines the temporal inhibition present in a digit recall task under supra-threshold conditions. Experiment 1 focuses more on the initial visual processing stages, because the identity of the digit is not important. Experiment 2 then examines the temporal inhibition present when the task takes on more cognitive demands.

Although Experiment 1 is a detection task and Experiment 2 is an identification task, together the two do not represent a true Identification/Detection paradigm. The identification task in Experiment 2 uses somewhat different stimuli (4 digits versus 1 digit) and a different dependent measure (proportion of correctly recalled digits vs. contrast sensitivity). These differences do not allow comparisons of contrast sensitivity ratios like those that have been reported in the literature for digits (Busey & Loftus, 1994, Experiment 6) and for moving gratings (e.g. Palmer, Mobley & Teller, 1993; Derrington & Henning, 1993; Lindsey & Teller, 1990).

Experiment 1: Threshold Detection of Pulsed Digits

To investigate whether character stimuli show the same patterns of temporal inhibition, a single digit was presented in two pulses, and observers made a two-temporal-interval forced-choice (2AIFC) response indicating which interval contained the stimulus. Four types of conditions were used: ++, +-, -+ and --, where + indicates a positive-contrast pulse and - indicates a negative-contrast pulse. The second pulse was delayed by one of 6 intervals ranging from 0 to 120 ms. Because the pulses were presented in one of two temporal intervals and the observer had only to identify the correct interval, the identity of the digit was irrelevant. The contrast of the digits was manipulated to determine the contrast that yielded 81% correct. We then modeled these contrast thresholds to see whether the linear filter model of Loftus and Busey could account for the obtained same-contrast and different-contrast thresholds.

Method

Stimulus presentation and response collection were carried out on a Macintosh II computer.

Observers

Three observers, the author (TB) and two male graduate students (SD and MB), participated in both Experiments 1 and 2. All observers had participated in a minimum of 1000 trials prior to participating in Experiment 1.

Stimuli and apparatus

The experiment was controlled by a Macintosh II computer, and stimuli were presented on an Apple Monochrome monitor with a Video Attenuator to allow a greater range of luminance values. Observers sat approximately 57 cm away from the screen in a dimly lit room and used the computer keypad to respond.

The background luminance was set to 7.2 cd/m², and the fixation point had a luminance of 3.7 cd/m². Contrast was defined as (Foreground Luminance - Background Luminance)/(Foreground Luminance + Background Luminance).
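The contrast definition above can be written as a small function together with its inverse, which gives the foreground luminance needed for a requested contrast. This is only an illustrative sketch: the function names and the contrast magnitude of 0.05 are hypothetical and are not values used in the experiment.

```python
def contrast(foreground, background):
    """Contrast as defined in the text: (F - B) / (F + B)."""
    return (foreground - background) / (foreground + background)


def foreground_for_contrast(c, background):
    """Invert the definition: foreground luminance that yields contrast c."""
    return background * (1.0 + c) / (1.0 - c)


BACKGROUND = 7.2  # cd/m^2, background luminance given in the text

# Hypothetical contrast magnitude, chosen only for illustration.
c = 0.05
bright = foreground_for_contrast(+c, BACKGROUND)  # positive-contrast pulse
dark = foreground_for_contrast(-c, BACKGROUND)    # negative-contrast pulse
print(bright, dark)
```

Under this definition, a positive-contrast pulse is brighter than the background and a negative-contrast pulse is darker, which is the sense in which the + and - conditions are used.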

The digits were either a 2 or a 5, drawn in Times-Roman 14 point font. The digits were each 0.50° high by 0.40° wide and always appeared centered vertically on the screen. The top portion of the digit was 0.27° below the fixation point.

Design

In all trials, the duration of each of the two pulses was 30 ms (two screen refreshes). The two pulses were separated by one of 6 delays, ranging from 0 ms to 120 ms, during which the screen was at the background luminance. The two pulses were always of the same magnitude, although they may have been in different directions (positive or negative contrast) depending on the stimulus condition.

Observers completed 25 blocks of 96 trials each, which provided 100 observations per condition per observer.
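The figure of 100 observations per condition follows from crossing the four sign conditions with the six delays. The short check below assumes that trials were spread evenly over the resulting cells; the individual delay values are not listed in this section, so only their count is used.

```python
from itertools import product

sign_conditions = ["++", "+-", "-+", "--"]
n_delays = 6  # six delays between 0 and 120 ms; only the count is needed here

# Each cell of the design is one sign condition paired with one delay.
cells = list(product(sign_conditions, range(n_delays)))

total_trials = 25 * 96  # 25 blocks of 96 trials
per_cell = total_trials // len(cells)
print(len(cells), total_trials, per_cell)  # 24 2400 100
```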

Procedure

A trial consisted of two temporal intervals separated by low tones. The stimulus appeared in one of the two intervals, each with 50% probability. After the second interval, the observer indicated which of the two intervals he thought contained the digit, guessing if necessary. The observer then received feedback in the form of a tone.

An extension of this experiment was conducted with Observer TB, in which the identity of the digit, which was either 2 or 5, was the relevant stimulus attribute.

An adaptive threshold finding procedure, Quest (Watson and Pelli, 1983), was used to adjust the contrast of the pulses to find the contrast that provided an 81% detection rate.
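To give a feel for how an adaptive procedure of this kind homes in on a threshold, the following is a simplified Bayesian sketch in the spirit of Quest; it is not the Watson and Pelli implementation. The Weibull form, the slope `beta`, the lapse rate `delta`, the prior, the grid, and the simulated observer are all assumptions made for illustration. With chance performance of 0.5 and these assumed values, the function returns about 0.81 when the stimulus is at threshold.

```python
import numpy as np


def p_correct(x, threshold, beta=3.5, gamma=0.5, delta=0.01):
    """Weibull psychometric function on a log10-contrast axis (chance = gamma)."""
    return gamma + (1.0 - gamma - delta) * (
        1.0 - np.exp(-10.0 ** (beta * (x - threshold)))
    )


def adaptive_threshold(respond, n_trials=100, prior_mean=-1.5, prior_sd=1.0):
    """Return the posterior-mode threshold after n_trials adaptive trials."""
    grid = np.arange(-3.0, 0.0 + 1e-9, 0.01)  # candidate log10 thresholds
    log_post = -0.5 * ((grid - prior_mean) / prior_sd) ** 2  # Gaussian prior
    for _ in range(n_trials):
        x = grid[np.argmax(log_post)]  # test at the current best guess
        correct = respond(x)           # True if the correct interval was chosen
        p = p_correct(x, grid)         # P(correct) under each candidate threshold
        log_post += np.log(p) if correct else np.log(1.0 - p)
    return grid[np.argmax(log_post)]


# Simulated observer with a hypothetical true threshold of -1.0 log10 contrast.
rng = np.random.default_rng(0)
true_threshold = -1.0
estimate = adaptive_threshold(
    lambda x: rng.random() < p_correct(x, true_threshold), n_trials=300
)
print(estimate)
```

Each trial is placed at the most probable threshold given all previous responses, so testing concentrates near the contrast that yields roughly 81% correct.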

Results and Discussion

Figure 19 shows the results of three observers for Experiment 1; the smooth curves through the data points are explained below. For an interstimulus interval (ISI) of zero, observers were more sensitive to the same-sign condition than the opposite-sign condition, which is indicated by the lower threshold for the former condition. However, when a small ISI of 15 to 30 ms was introduced, this pattern reversed and all three observers became more sensitive to the opposite-sign condition than to the same-sign condition. These differences between the two conditions grew largest at 30-45 ms ISI. At the empirical level, these results show that digit stimuli show the same kinds of detection curves that others classically report for simpler stimuli such as disks or gratings. Thus it appears that the detection of digit stimuli is no different than the detection of other stimuli.

At the theoretical level, these findings clearly implicate temporal inhibition, and the results are quite similar to those of Ohtani and Ejima (1988; see Figure 17). For comparison with their results, our background level of 7.272 cd/m² translates to about 90 td, which is in the mesopic range.
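As a rough check on the troland figure: retinal illuminance in trolands is luminance in cd/m² multiplied by pupil area in mm². The pupil size is not stated in this section, so the sketch below only back-computes the pupil area and diameter implied by the two quoted numbers, assuming a circular pupil.

```python
import math

luminance = 7.272  # cd/m^2, background level as quoted
trolands = 90.0    # approximate retinal illuminance as quoted

pupil_area = trolands / luminance  # mm^2 implied by the two values
pupil_diameter = 2.0 * math.sqrt(pupil_area / math.pi)  # mm, circular pupil
print(round(pupil_area, 1), round(pupil_diameter, 1))  # 12.4 4.0
```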

Figure 19. Experiment 1 results for three observers. Conditions are averaged across stimulus type, such that the ++ and -- conditions are shown as the Same-Sign case, and the -+ and +- conditions are shown as the Opposite-Sign case. Error bars represent the standard deviation of the threshold-finding function. The finding of greater sensitivity to opposite-sign conditions at some delay intervals indicates temporal inhibition and disconfirms the monophasic impulse response function of the Linear Filter Model of Character Identification. When this theory is modified to include a biphasic impulse response function, it does a fairly good job of accounting for the data. These predictions are shown as smooth curves through the data points. The exception is at long interstimulus intervals for Observer TB. The lower-right panel shows the data and model fit for Observer TB for an identification task, which is described in the text. These data demonstrate a similar pattern of temporal inhibition, although the overall sensitivity is reduced (note the scale of the ordinate).

Temporal inhibition as revealed by this task begins at delay intervals of 15-30 ms, and extends to 45-70 ms for the three observers. This is consistent with the ranges given in Table 1, which show temporal inhibition for delay intervals of 30-70 ms for a variety of display conditions. The temporal inhibition for Observer TB in the identification task (see Figure 19, lower-right panel) appears to occur at slightly longer delay intervals (45-70 ms) than in the detection task for the same observer (30-70 ms). In addition, the size of the temporal inhibition appears slightly smaller for the identification task.

Differences Across Conditions

The results in Figure 19 are averaged across the sign of the contrast of the stimuli (++ and -- were averaged together, as were +- and -+). Consistent with previous findings (Blackwell, 1963; Boynton, 1972; Rashbass, 1970) no systematic differences were found between the individual conditions. The exception was Observer SD, who had small but systematic deviations between the ++ and -- contrast thresholds. He was more sensitive to the -- condition, which had a dark character on a gray background. However, these deviations were consistent across ISI, and thus the average does not distort the overall pattern of the contrast thresholds.

Identification Task

For direct comparison with the detection task in Experiment 1, identification data were collected for Observer TB. The stimuli were either a 2 or a 5, and the contrast of the digits was systematically adjusted until identification performance reached 81%. The stimuli were presented in a single temporal interval; display conditions were otherwise identical to those of Experiment 1.

The obtained identification contrast sensitivity values are shown in the lower-right panel of Figure 19, along with the corresponding model fit that includes a biphasic impulse-response function. In general these data are not much different than the detection data, although the overall sensitivity values differ (note the different ordinate scales).

Model Predictions

The Linear-Filter Model of Character Identification was applied to the data in Figure 19. Following Busey & Loftus (1994, Experiment 6), the theory may be extended to detection tasks by assuming that detection performance is proportional to the above-threshold area under the sensory response function. This theory, which assumes a monophasic impulse response function and therefore no temporal inhibition, is clearly disconfirmed by the evidence for temporal inhibition seen at ISIs of 45-60 ms in all three observers. However, the model may be extended to account for these data by adopting the characterization of the impulse response function from Watson's working model (see Eq. 24). His model assumes that the impulse response function is the difference of two gamma functions: the second, which has a slightly longer time constant, is subtracted from the first, yielding an impulse response function with a negative-going second lobe. The amount of inhibition is controlled by an inhibitory weight z that scales the second gamma function.

The sensory response function is rectified to eliminate negative-going portions of the functions, and then the sensory threshold is applied. The above-threshold area is then used to predict performance. The curves in Figure 19 were generated by fitting the modified Character Identification Theory of Loftus and Busey to the detection thresholds.

Figure 20 demonstrates how the extended model accounts for the finding of better sensitivity for opposite-signed pulses than for same-signed pulses at 30-60 ms delay intervals.

Figure 20. Response functions for same-sign and opposite-sign stimuli. Left panel: Rectified sensory response function for the same-contrast stimulus at 45 ms ISI. The inhibitory lobe of the first response subtracts from the initial positive lobe of the second response, reducing the above-threshold area. Right panel: Rectified sensory response function for the opposite-contrast stimulus at 45 ms ISI. The inhibitory lobe of the first response adds to the initial negative lobe of the second response, increasing the above-threshold area. Since performance is proportional to the above-threshold area, this model correctly predicts that opposite-contrast sensitivity will be better than same-contrast sensitivity at the 45 ms ISI.
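The logic of the extended model at the 45 ms ISI can be sketched numerically. The code below is an illustration under stated assumptions, not the fitting code used for Figure 19: it assumes that both gamma functions have 9 stages, that the time constants are in milliseconds, that rectification is full-wave (absolute value), and that pulses have unit contrast. The threshold `theta` is left at zero because the units of the fitted threshold are not restated here. The values of tau, tau2 and z are those listed for Observer TB in Table 3; the function names are hypothetical.

```python
import math
import numpy as np


def gamma_fn(t, tau, n):
    """Unit-area gamma function with n stages and time constant tau (ms)."""
    return (t / tau) ** (n - 1) * np.exp(-t / tau) / (tau * math.factorial(n - 1))


def impulse_response(t, tau, tau2, z, n=9):
    """Excitatory gamma minus a slower inhibitory gamma weighted by z."""
    return gamma_fn(t, tau, n) - z * gamma_fn(t, tau2, n)


def above_threshold_area(sign2, isi, tau, tau2, z, theta=0.0, pulse=30, dt=1.0):
    """Area of the rectified two-pulse response that exceeds theta."""
    t = np.arange(0.0, 800.0, dt)
    stim = np.zeros_like(t)
    stim[(t >= 0) & (t < pulse)] = 1.0                   # first pulse
    onset2 = pulse + isi
    stim[(t >= onset2) & (t < onset2 + pulse)] = sign2   # second pulse, signed
    h = impulse_response(t, tau, tau2, z)
    response = np.convolve(stim, h)[: t.size] * dt       # linear filter output
    rectified = np.abs(response)                         # assumed full-wave
    return np.sum(np.clip(rectified - theta, 0.0, None)) * dt


tau, tau2, z = 5.69, 12.09, 0.678  # Observer TB, Table 3 (units assumed ms)
same = above_threshold_area(+1.0, 45, tau, tau2, z)
opposite = above_threshold_area(-1.0, 45, tau, tau2, z)
print(same < opposite)
```

Setting z to 0 recovers a monophasic impulse response, for which the same-sign area can never be smaller than the opposite-sign area. That is why the original no-inhibition model cannot produce the reversal seen in the data.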

The best-fitting predictions for the revised model are shown in Figure 19. These fits are all quite acceptable, except perhaps for Observer TB at a 120 ms ISI. The parameter values for each fit are given in Table 3, and are all in reasonable ranges.

Experiment 1 clearly demonstrates evidence for temporal inhibition in a detection task in which characters are used as stimuli. This is true even when the identity of the digit becomes important, as with Observer TB. This temporal inhibition clearly demonstrates the need for a model that assumes temporal inhibition, and when the character identification theory is modified to include this component it provides good predictions for the empirical data. This demonstrates that while character processing is quite different from detection of simpler stimuli such as disks (see Pelli, 1994), such processing still exhibits the same temporal inhibition.

TABLE 3. Summary of Best-Fitting Model Parameters for Experiments 1-2. RMSE's are in units of log(1/contrast) for Experiment 1 and units of -ln(1-p) for Experiment 2

Experiment 1 (n=9)

Observer             t       cs       Q         inhib. tau: t2    inhib. weight: z    RMSE
TB                   5.69    0.410    0.0201    12.09             0.678               0.0364
SD                   6.43    0.310    0.0059    8.69              1.080               0.0288
MB                   7.04    0.389    0.0138    11.58             0.666               0.0240
TB Identification    4.42    0.576    0.0383    15.44             0.477               0.0387

Experiment 2 (n=9)
Observer
t