Unitization During Category Learning

Robert L. Goldstone

Indiana University



Five experiments explored the question of whether new perceptual units can be developed if they are diagnostic for a category learning task, and if so, what the constraints on this unitization process are. During category learning, participants were required to attend either a single component or a conjunction of five components in order to correctly categorize an object. In Experiments 1-4, some evidence for unitization was found in that the conjunctive task became much easier with practice, and this improvement was not found for the single component task, or for conjunctive tasks where the components could not be unitized. Influences of component order (Experiment 1), component contiguity (Experiment 2), component proximity (Experiment 3), and number of components (Experiment 4) on practice effects were found. Using a Fourier Transformation method for deconvolving response times (Experiment 5), prolonged practice effects yielded responses that were faster than expected by analytic models that integrate evidence from independently perceived components.

Unitization During Category Learning

Interest in the mutual interactions between our perceptual and conceptual systems has resurfaced repeatedly. In anthropology, Sapir and Whorf (Whorf, 1941/1956) posited influences of language on perception, and in psychology, the "New Look" movement (Bruner & Postman, 1949) explored situations where high-level cognitive processes altered people's ability to identify objects, make discriminations between objects, and make accurate judgments of an object's attributes. Both of these historical movements have been resurrected, as new evidence for linguistic relativity (Gumperz & Levinson, 1991) and a "New New Look movement" (Niedenthal & Kitayama, 1994) have emerged.

The research reported here explores one possible influence of conception on perception: the functional unitization of perceptual components due to categorization experience. It is generally acknowledged that our categorization judgments are driven by our perceptions, and this influence is incorporated in almost all successful models of categorization (Nosofsky, 1986). However, it is possible that our categorizations also drive our perceptions. In particular, if a certain stimulus component is diagnostic for a required categorization, people can increase their categorization accuracy by becoming selectively tuned to the component. If the component is already part of the person's perceptual "vocabulary," then categorization demands can be met simply by increasing attention to the component (Kruschke, 1992; Nosofsky, 1986). However, if the component does not already exist in one's vocabulary, then becoming tuned to the component first entails adding it to the vocabulary (Goldstone & Schyns, 1994; Schyns, Goldstone, & Thibaut, in press; Schyns & Murphy, 1994).

"Perceptual vocabulary" refers to the elementary building blocks used to create object representations. Evidence for membership in a perceptual vocabulary is typically based on functional behavior. For example, Treisman and Gelade (1981) argues for a vocabulary of features that includes color, orientation, and closure elements. These feature are empirically identified by: response times to targets that are not influenced by the number of distractors if the targets are distinguished by a feature, fast segregation of textures if the textures have different features (Julesz, 1981), illusory conjunctions involving features from different objects if attention is not focused on the objects, and all-or-none disappearance of features when images are retinally stabilized (Hebb, 1949). Evidence for vocabulary elements can also be obtained by neurophysiological recordings, such as Hubel and Wiesel's (1968) original work indicating primary visual cortex neurons selectively tuned to particular line orientations (for most recent work, see Perrett & Oram, 1993).

The above research is based on the implicit assumption that a fixed vocabulary of features can account for our perceptual experience. In fact, some support for this assumption comes from results showing little influence of practice in augmenting feature vocabularies. For example, Treisman & Gelade (1981) found that prolonged experience at a conjunctive feature search task (searching for a blue "O" among blue "T"s and red "O"s) did not result in a significantly reduced dependency of response times on the number of distractor elements, suggesting that the conjunction of color and form features was not added as a separate vocabulary element over time. However, other research has shown an influence of life-long experience on feature search response times (e.g. Wang, Cavanagh, & Green, 1994). The current research further tests the adequacy of this "fixed vocabulary" hypothesis, not within a feature search paradigm, but in a categorization task. The particular type of vocabulary additions that we will explore are unitizations - situations where a single "chunk" is formed from perceptual components if the chunk is diagnostic.

Perceptual Consequences of Category Learning

The notion that category learning can have perceptual consequences has recently been the source of substantial empirical inquiry. Work on categorical perception has shown that our ability to make perceptual discriminations depends on the categories we possess (see Harnad, 1987). Specifically, discriminations involving pairs of stimuli that straddle a category boundary are more easily made than are discriminations involving stimuli that fall within the same category, equating for physical dissimilarity between the pairs. Further research has indicated that the categories that influence perceptual discriminability need not be innate, but rather may be learned within a laboratory context (Goldstone, 1994; Lane, 1965). Laboratory-acquired categories also can influence perceptually-based similarity judgments; objects that are placed in different categories are given lower similarity judgments than the same items are when they are placed in the same category (Harnad, Hanson, & Lubin, 1994; Kurtz, 1996). Lin and Murphy (in press) have shown that the conceptual knowledge associated with a category, manipulated by assigning the same categories and items with different functional descriptions, can influence subjects' speed at verifying perceptual properties. In short, learned categories can influence the speed and sensitivity with which perceptual properties are processed.

In addition, category learning can affect how an object is segmented into parts. Schyns and Murphy (1993, 1994) formulated a Functionality Principle whereby if a fragment of a stimulus categorizes objects (distinguishes members from nonmembers), the fragment is instantiated as a unit in the representational code of object concepts. Consistent with this principle, they found that objects were more likely to be segmented into parts that were useful for categorizing objects. Subjective segmentations were obtained by asking participants to draw outlines around the parts of objects. A similar influence of categorization on the segmentation of objects was shown by Pevtzow and Goldstone (1994). Stick figures composed out of six lines were categorized in one of two ways. Different arbitrary combinations of three contiguous lines were diagnostic for the different categorizations. After categorization training, participants made part/whole judgments, responding as to whether a particular set of three lines (a part) was present in a whole stick figure. Participants were significantly faster to determine that a part was present in a whole when the part was previously diagnostic during categorization. Whereas previous research has focused on objective properties of an object (e.g., the proximities, similarities, and shapes of the line segments) that determine how people will decompose it into segments (Palmer, 1978), the above results indicate that the person's experience also influences how they will segment an object into parts.

A final related source of evidence indicating an influence of concept learning on the perceptual units that are formed comes from the "inversion effect" (Diamond & Carey, 1986; Tanaka & Gauthier, in press; Tanaka & Farah, 1993; Yin, 1969). According to this effect, the recognition cost of rotating a stimulus 180 degrees in the picture plane is much greater for specialized, highly practiced stimuli than for less specialized stimuli. For example, recognition of faces is substantially slower and less accurate when the faces are inverted. This large difference between upright and inverted recognition efficiency is not found for other everyday objects. Diamond and Carey (1986) found a large inversion cost for dog breed recognition, but only for dog experts. Similarly, Gauthier and Tarr (in press) found that large inversion costs for a particular nonsense object can be created in the laboratory by giving participants prolonged exposure to the object. They conclude that prolonged experience with an object leads to a configural representation of it that combines all of its parts into a single, viewpoint specific, functional unit.


The previously reviewed literature indicates that one result of category learning is to create perceptual units that combine stimulus components that are useful for the categorization. In the field of attention, a similar process of unitization has been explored. Using a task where participants decided whether two stimuli were identical, LaBerge (1973) found that when attention was not placed on the stimuli, participants were faster at responding to actual letters than to letter-like controls. Furthermore, this difference was attenuated as the unfamiliar letter-like stimuli became more familiar over practice. He argued that the components of often-presented stimuli become processed as a single functional unit with practice.

More recently, Czerwinski, Lightfoot, and Shiffrin (1992) have referred to a process of perceptual unitization in which conjunctions of stimulus features are "chunked" together so that they become perceived as a single unit. Shiffrin and Lightfoot (in press) argued that separated line segments can become unitized following prolonged (over 15 hours) practice with the materials. Their evidence comes from the slopes relating the number of distractor elements to response time in a feature search task. When participants learned a conjunctive search task in which three line segments were needed to distinguish the target from distractors, impressive and prolonged decreases in search slopes were observed over 20 sessions. These prolonged decreases were not observed for a simple search task requiring attention to only one component. In addition, when participants were switched from a conjunctive task to a simple feature search task, there was initially little improvement in search times, suggesting that participants were still processing the stimuli at the level of the unitized chunk that they formed during conjunctive training. The authors concluded that conjunctive training leads to the unitization of the set of diagnostic line segments, resulting in fewer required comparisons.

Other evidence for unitization comes from word perception. Researchers have argued that words are perceived as single units due to people's life-long experience with them. These word units can be processed automatically and interfere with other processes less than do nonwords (LaBerge & Samuels, 1974; O'Hara, 1980; Smith & Haviland, 1972). Results have shown that the advantages attributable to words over nonwords cannot be explained by the greater informational redundancy of letters within words (Smith & Haviland, 1972). Instead, these researchers argue for recognition processes that respond to information at levels higher than the individual letters.

A New Source of Evidence for Unitization

The purpose of the current set of experiments is to test whether category learning can lead to stimulus unitization, and to explore the boundary conditions on unitization related to stimulus characteristics and amount of training. Our experiments explore unitization, but from a somewhat different perspective from the work in attention. First, our experiments are primarily interested in the influence of category learning on unitization, under the hypothesis that a unit will tend to be created if A) the parts that comprise the unit frequently co-occur, and B) the unit is useful for determining a categorization. Second, a new technique for analyzing response times is developed that compares response time distributions from a conjunctive categorization task to the expected distribution based on analytic models that do not incorporate unitization. Evidence for unitization is obtained if the observed response time distribution for a conjunctive categorization task contains response times that are faster than predicted by analytic models that base conjunctive responses on several individual component judgments.

Whenever the claim for the construction of genuinely new units is made, two objections must be addressed. First, couldn't the unit have existed in people's vocabulary before categorization training? Our stimuli are designed to make this explanation unlikely. Each unit to be sensitized is constructed by connecting 5 randomly chosen curves. There are 10 curves that can be sampled, yielding 10 × 9 × 8 × 7 × 6 = 30,240 possible different units when the curves are sampled without replacement. As such, if it can be shown that any randomly selected unit can be sensitized, then an implausibly large number of vocabulary items would be required under the constraint that all vocabulary items are fixed and a priori. The second objection is that no units need be formed; instead, people analytically integrate evidence from the five separate curves to make their categorizations. However, this objection will be untenable if subjects, at the end of extended training, are faster at categorizing the units than would be expected by the analytic approach. Quantifying what "faster than expected" means is the main business at hand, and will not fully be addressed until Experiment 5.
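The size of the candidate unit space can be checked with a few lines of arithmetic. The sketch below assumes that the order of curves within a unit matters, and it reports the count both without replacement (the sampling scheme described in the Materials for Experiment 1) and with replacement. The function name `count_units` is illustrative only.

```python
import math


def count_units(n_curves: int = 10, unit_length: int = 5,
                replacement: bool = False) -> int:
    """Count ordered units of `unit_length` curves drawn from `n_curves` curves."""
    if replacement:
        # Each of the positions can hold any of the curves.
        return n_curves ** unit_length
    # Ordered selection without repeats: n! / (n - k)!
    return math.perm(n_curves, unit_length)


print(count_units())                  # 30240
print(count_units(replacement=True))  # 100000
```

Under either sampling assumption, the space is far too large for every unit to be a plausible a priori vocabulary element.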

Experiment 1

Experiment 1 explores the unitization of visual components to attain greater efficiency in a categorization task. The categorization task is designed so that evidence from five components must be received before certain categorization responses are made. For this reason, the critical categorization case is a conjunctive task. The stimuli and their categorizations are shown in Figure 1. Each letter refers to a particular segment of the stimulus. Each stimulus is composed of five segments, augmented by a broad U-shape in order to create a closed object. To correctly place the stimulus labeled "ABCDE" into Category 1, all five components, "A," "B," "C," "D," and "E," must be processed. For example, if the right-most component is not attended, then "ABCDE" cannot be distinguished from "ABCDZ" which belongs in Category 2. Not only does no single component suffice for accurate categorization of "ABCDE," but two-way, three-way, and four-way conjunctions of components (posited by researchers such as Gluck & Bower, 1988, and Hayes-Roth & Hayes-Roth, 1977) also do not suffice. For example, the three-way conjunction "C and D and E" is possessed by the stimulus "ABCDE," but this conjunction does not discriminate "ABCDE" from "AWCDE" or "VBCDE." Only the complete five-way conjunction suffices to reliably categorize "ABCDE."
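The claim that only the full five-way conjunction suffices can be verified by brute force. The sketch below treats each stimulus as a five-letter string, as in the abstract descriptions above, and checks every subset of positions. The helper names are hypothetical and are not part of the experimental software.

```python
from itertools import combinations

TARGET = "ABCDE"
CATEGORY_2 = ["ABCDZ", "ABCYE", "ABXDE", "AWCDE", "VBCDE"]


def conjunction_suffices(positions):
    """True if attending only `positions` separates TARGET from all Category 2 items."""
    for item in CATEGORY_2:
        if all(item[p] == TARGET[p] for p in positions):
            return False  # this Category 2 item shares the entire conjunction
    return True


def sufficient_conjunctions():
    """List every subset of positions that reliably identifies TARGET."""
    found = []
    for size in range(1, len(TARGET) + 1):
        for positions in combinations(range(len(TARGET)), size):
            if conjunction_suffices(positions):
                found.append(positions)
    return found


print(sufficient_conjunctions())  # [(0, 1, 2, 3, 4)]
```

Every proper subset omits at least one position, and the Category 2 item that differs from "ABCDE" only at that position matches the subset exactly, so no smaller conjunction works.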

If unitization occurs during categorization, then it is possible that the stimulus "ABCDE" becomes treated functionally like a single component with training. If this occurs, then participants should be able to quickly respond that this stimulus belongs to Category 1. Instead of responding by integrating the results from separate sections of each component, the categorization response may eventually be made by consulting a single detector. As this detector unit becomes established, response times to categorize the "ABCDE" pattern should decrease. In the current experiment, a pronounced decrease in the time required to categorize the conjunctively defined stimulus "ABCDE" will be taken as evidence of unitization.

In order for the conjunctive speed up to be taken as evidence for unitization, two important control conditions are necessary. First, it is important to show that tasks that do not require unitization do not show comparable speed ups. To this end, a control task is included that allows participants to categorize the item "ABCDE" by attending only a single component rather than a five-way conjunction. This "One" (component) condition should not result in as much speed up as the "All" (components) condition where all components must be attended. If it does, then the speed up can be attributed to a simple practice effect rather than unitization. Second, it is important to show that stimuli that cannot be unitized also do not show comparable speed ups. For this control task, a five-way conjunction of components must be attended, but the ordering of the components within the stimulus is randomized. As such, a single template cannot serve to categorize the "ABCDE" stimulus. Assuming that unitization requires that the components to be chunked appear in a consistent or template-like manner (Gauthier & Tarr, in press; Schneider & Shiffrin, 1977), these randomly ordered stimuli should not afford unitization, and thus are not expected to yield significant speed ups.

In sum, Experiment 1 explores the possibility that five line segments, when combined together to create a contiguous, coherent object, may become treated as a single unit with practice. It is highly unlikely that the unit existed in the participants' perceptual vocabulary prior to training because the unit is randomly selected from an extremely large set of other units (the set of all possible ways of ordering five segments, each chosen from a set of ten segments). The potential evidence from this experiment for unitization would come from substantial improvement in a conjunctive task that involves unitizeable stimuli, but not in a conjunctive task involving hard-to-unitize stimuli, and not in a simple component detection task that does not require unitization.

Method

Participants. Seventy-two undergraduate students from Indiana University served as participants in order to fulfill a course requirement. The students were evenly split into the four between-subject conditions.

Materials. Stimuli were formed by selecting five line segments without replacement from a set of 10 segments. The displayed objects consisted of five curved lines combined together, and joined by a bowl shape. Each horizontally arranged segment was 0.8 cm long and 0.5 cm high. The entire length of a stimulus was 4 cm, and the height was 1.7 cm. The viewing distance was approximately 40 cm, yielding a visual angle of 5.7 degrees for each stimulus. The starting and ending points of each segment were located at the vertical midline, and consequently, all segments could be joined with each other to create a seamless stimulus. Sample stimuli are shown in Figure 1. Each of the ten segments can be associated with a different letter of the alphabet for descriptive purposes. The assignment of segments to letters was randomized for each participant. As such, the object that is abstractly represented by "ABCDE" was composed of different segments for different participants.
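The reported visual angle follows from the stimulus length and viewing distance. A minimal sketch of the standard formula, 2·arctan(size / (2·distance)), is given below; the function name is illustrative.

```python
import math


def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle (in degrees) subtended by an object of a given size."""
    return math.degrees(2 * math.atan(size_cm / (2 * distance_cm)))


# Whole stimulus: 4 cm long, viewed from approximately 40 cm.
print(round(visual_angle_deg(4.0, 40.0), 1))  # 5.7
```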

Design. In the "All" task, the objects that are abstractly described as "ABCDE" and "VWXYZ" belonged to Category 1, and the objects "ABCDZ," "ABCYE," "ABXDE," "AWCDE," and "VBCDE" belonged to Category 2. This stimulus structure is shown in Figure 1. Given these category memberships, all five curves need to be attended in order to reliably place the object "ABCDE" into Category 1. The five members of Category 2 were created by replacing one of "ABCDE"'s segments with a new segment. The object "ABCDE" was presented four times more frequently than any of the other six objects, which were all presented equally often. Given this, the category validities (the probability of a segment, given a category) for every segment and category were equal, as were the cue validities (the probability of a category, given a segment). The item "VWXYZ" was included in Category 1 so that no segment, by itself, would provide probabilistic evidence in favor of Category 1 or 2.
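The equality of category validities and cue validities can be checked directly from the presentation frequencies. The sketch below encodes the All task with relative frequencies ("ABCDE" weighted 4, every other object weighted 1) and treats each letter as one segment. The data structure and function name are assumptions made for illustration.

```python
from collections import defaultdict

# Relative presentation frequencies in the All task.
ALL_TASK = {
    1: {"ABCDE": 4, "VWXYZ": 1},
    2: {"ABCDZ": 1, "ABCYE": 1, "ABXDE": 1, "AWCDE": 1, "VBCDE": 1},
}


def validities(task):
    """Return (category validity, cue validity) keyed by (segment, category)."""
    seg_cat = defaultdict(float)    # weighted count of segment within category
    cat_total = defaultdict(float)  # weighted count of objects per category
    for cat, objects in task.items():
        for obj, freq in objects.items():
            cat_total[cat] += freq
            for seg in obj:
                seg_cat[(seg, cat)] += freq
    segments = sorted({seg for seg, _ in seg_cat})
    # Category validity: P(segment | category).
    category_validity = {
        (s, c): seg_cat[(s, c)] / cat_total[c] for s in segments for c in task
    }
    # Cue validity: P(category | segment).
    cue_validity = {
        (s, c): seg_cat[(s, c)] / sum(seg_cat[(s, k)] for k in task)
        for s in segments for c in task
    }
    return category_validity, cue_validity


cat_val, cue_val = validities(ALL_TASK)
print(cat_val[("A", 1)], cat_val[("A", 2)])  # 0.8 0.8
print(cue_val[("V", 1)], cue_val[("V", 2)])  # 0.5 0.5
```

Each segment is equally probable in the two categories, and every cue validity is 0.5, so no single segment favors either category.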

In the "One" task, "ABCDE" and "VWXYZ" were in Category 1, but only one of the five objects in Figure 1's second category was presented. Thus Category 2 only contained one object, randomly selected from: "ABCDZ," "ABCYE," "ABXDE," "AWCDE," or "VBCDE." The selected object was presented five times more frequently than it was in the All task, so that both categories were presented equally often. Given these categories, to categorize "ABCDE" as belonging to Category 1, it is only necessary to attend to a single segment. For example, if Category 2 contained the object "ABCDZ," then noticing the presence of segment "E" provides sufficient evidence for a Category 1 judgment.

Procedure. Participants were given 320 categorization trials. On each trial, an object appeared on the screen, and participants pressed one of two keys to indicate the object's category membership. Participants received feedback indicating whether their guess was correct, and if incorrect, were shown the object's correct categorization. Participants were told that both accuracy and speed were important and that they should make their responses as quickly as possible without sacrificing accuracy. Participants who made more than 5% errors on a particular block were asked to try to increase their accuracy. An equal number of Category 1 and 2 objects were displayed, randomly ordered. Each participant was given a different random order.

Half of the participants were given objects generated according to the All task, and the other half were given objects generated according to the One task. Four groups of participants were obtained by combining this All-vs-One split with a second orthogonal split: Ordered vs Random. For the Random group, any ordering of the same components counted as the same object. As shown in Figure 2, any one of the 120 orderings of the five components "A," "B," "C," "D," and "E" acted as an example of "ABCDE." In this condition, when an object was selected to be displayed, the spatial ordering of its five segments was randomized. In the Ordered condition, the spatial positions of the five segments were fixed. As such, the object "ABCDE" had the same exact appearance every time it was displayed in the Ordered condition.

Insert Figure 2 about here

Which object was displayed on a given trial was randomized, subject to the constraint that objects and categories were presented with their stipulated frequency over the course of the experiment. The spatial position of the object on the screen was randomized. The 320 trials were broken down into 4 blocks of 80 trials. Rest breaks were provided between blocks. During the break, the average accuracy and response time for the preceding block of trials were displayed. At the end of the experiment, fifty-eight of the 72 participants (those participants who completed the experiment with sufficient time remaining to answer questions) were asked what strategies they used to categorize items as quickly as possible. In particular, they were asked which of the following two strategies best characterized their behavior: A) "I tried to look for each of the parts of the doodle that was important for categorizing it," or B) "I tried to remember an overall image of what some of the doodles looked like, and used this to categorize them." The experiment took about 55 minutes to complete.
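The trial structure described above can be sketched as a small generator. This is an illustration under stated assumptions, not the experimental software: it enforces the stipulated frequencies only across the full 320 trials (any per-block constraint is not specified in the text), it omits the randomization of screen position, and the names `build_trials`, `task`, and `display` are hypothetical.

```python
import random

CATEGORY_2_ITEMS = ["ABCDZ", "ABCYE", "ABXDE", "AWCDE", "VBCDE"]


def build_trials(task="All", display="Ordered", n_trials=320, seed=None):
    """Return a shuffled list of (displayed, abstract_object, category) trials."""
    rng = random.Random(seed)
    unit = n_trials // 10  # 10 frequency units: 4 + 1 (Category 1), 5 (Category 2)
    counts = {("ABCDE", 1): 4 * unit, ("VWXYZ", 1): unit}
    if task == "All":
        for item in CATEGORY_2_ITEMS:
            counts[(item, 2)] = unit
    else:
        # One task: a single Category 2 object, shown five times as often.
        counts[(rng.choice(CATEGORY_2_ITEMS), 2)] = 5 * unit

    trials = []
    for (obj, cat), n in counts.items():
        trials.extend([(obj, cat)] * n)
    rng.shuffle(trials)

    shown = []
    for obj, cat in trials:
        segments = list(obj)
        if display == "Random":
            rng.shuffle(segments)  # one of the 120 orderings of the five segments
        shown.append(("".join(segments), obj, cat))
    return shown


trials = build_trials(task="All", display="Random", seed=1)
print(len(trials))                              # 320
print(sum(1 for _, _, c in trials if c == 1))   # 160
```

With these counts, "ABCDE" appears on 128 trials and each category on 160 trials in both the All and One tasks.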

Results

Figure 3 presents the results of most interest. A 2 (Random vs Ordered display) X 2 (All vs One component necessary) X 4 (blocks) ANOVA was conducted with correct response time as the dependent measure. For this primary ANOVA, only response times for the object "ABCDE" were considered, because it is only this object that requires the full set of five segments to be attended in the All condition. A main effect of display was found, F(1, 68) = 10.4, mse = 15.2, p < .01, such that Ordered displays were more quickly categorized than Random displays. A main effect of number of necessary components was found, F(1, 68) = 18.3, mse = 13.0, p < .01, with the One condition significantly faster than the All condition for the object "ABCDE." A main effect of blocks was found, F(3, 204) = 8.9, mse = 18.2, p < .01, indicating that practice increased speed of categorization.

However, these main effects were modulated by a predicted three-way interaction between the ANOVA variables, F(3, 204) = 9.1, mse = 13.7, p < .01. As shown in Figure 3, the practice effect due to blocks was most pronounced for the All, Ordered condition. There was also a substantial practice effect for the All, Random condition, but it was not even half as large as it was for the All, Ordered condition. Practice effects for the two One conditions were quite small.

Insert Figure 3 about here

In the All condition, response times for the other items were faster than for "ABCDE." Response times for the five items belonging to Category 2 were combined because these items were logically equivalent. Overall, collapsing across the Ordered and Random groups of participants, the Category 2 items were responded to in 1195 msec, which is significantly faster than the 1354 msec required to respond to "ABCDE," paired t(35) = 5.4, p < .01. Similarly, although "ABCDE" occurred more frequently than "VWXYZ," the average correct response time for categorizing the latter object, 1121 msec, was significantly faster, paired t(35) = 4.9, p < .01. Presumably, the relatively fast response times for these two types of items arise because any two-way conjunction of segments suffices to make the categorization, whereas "ABCDE" requires a full five-way conjunction (in the All condition). Relevant to the unitization hypothesis, the speed advantage of Category 2 items over "ABCDE" in the All condition decreases over blocks of practice, as shown by a significant Item X Blocks interaction, F(3, 102) = 4.6, mse = 13.2, p < .01.

The response time effects were mirrored by error analyses, although the latter were generally not significant because of the overall high level of accuracy. In no case was there any indication of a speed-accuracy tradeoff in which one condition yielded faster response times but lower accuracy than another condition.

Participants in the four different conditions differed in their self-report of strategy use. The percentages of participants claiming that they used the holistic strategy of memorizing entire objects, rather than looking for particular segments, were 25%, 7%, 75%, and 7% for the Random All, Random One, Ordered All, Ordered One conditions respectively, Chi-square (3) = 23.65, p < .05.

Discussion

Experiment 1 provided suggestive, although not definitive, evidence for unitization of a consistent, multi-segment stimulus in a categorization task. Unitization was suggested by the pronounced and gradual speed-up in categorizing objects. This speed-up was particularly striking when the object to be categorized required attention to be placed on all of its parts, and when the parts could be easily combined into a coherent image.

The first of these provisions is indicated by the far greater improvement in categorization for the All task than for the One task. In fact, after the first block of training, there was no significant improvement in the One task. This is predicted if the All task benefits from the construction of functional units that span across multiple segments, and if the construction of these units takes time. The gradual nature of unitization is supported by Czerwinski et al.'s (1992) observation that improvements in a conjunctive feature search task were found after even as much as 20 hours of practice.

The second provision, emphasizing the value of practice for unitizeable stimuli, is suggested by the comparison of the Ordered and Random conditions. The All task for both of these conditions requires attention to all five of the segments that compose a stimulus. However, when the five segments were randomly located within the stimulus, practice effects were greatly attenuated. One account for this difference is that a single unit that represents an entire stimulus can only be formed when it is presented in a visually constant form. Expressed in a slightly different way, unitized representations may be photograph-like images in that they preserve the stimulus information in a relatively raw, unprocessed, spatially constrained form. This conjecture gains support from other work showing that units that are acquired over practice preserve many of the properties of actual images (Gauthier & Tarr, in press; Shiffrin & Lightfoot, in press). If units are constrained to be image-like representations, then randomization of the segments within an image would preclude unitization because it prevents a stimulus from being associated with a single image.

The finding that practice effects were more pronounced for the Ordered All condition than for the Random All condition serves to eliminate alternative accounts of what cognitive process improved with practice. It might be thought that the stronger practice effects for the All, relative to One, task were due to segment comparison times that were facilitated by practice. By this reasoning, the All task required five times as many segment comparisons as did the One task; if practice made each comparison faster, then the All task would be expected to show five times the improvement of the One task. One problem with this account is that the practice effect for the Ordered All task was more than five times greater than it was for the Ordered One task. Also, the explanation cannot be this simple because the Random All task required the same number of segment comparisons as the Ordered All task did, but did not show a comparable practice effect. In fact, the Random All task required at least as many segment comparisons as the Ordered All task, and required more comparisons if some segments that were consistently placed together in the Ordered condition came to act as a single segment. Furthermore, the Random task required cognitive processes not required by the Ordered task; namely, it required a process of segment localization, in which the relative location of a particular segment within an object was determined. In sum, the pronounced practice effects that were found in the Ordered All condition were probably not solely due to developing faster processes for identifying/comparing specific segments or integrating the results from separate segment comparisons; if they were, then the Random All condition would have shown comparable practice effects. Further, the pronounced practice effect was not due to developing faster processes for locating a segment within an object; if it were, then the Random All condition would have shown larger practice effects. This analysis further suggests that the practice effects observed for the Ordered All condition were due to a process that created a single, image-like unit. This conclusion is also supported by the participants' reports of their own strategies; the holistic strategy of comparing an item to an entire stored object was heavily used only by the Ordered All participants.

The inclusion of the Random All task also allows another account for the Ordered All practice effect to be rejected. The differential practice effects between the One and All tasks cannot simply be due to a floor effect in the One task. Although the Ordered All task might be expected to demonstrate large response time improvements with practice because it had considerable room for improvement at first (response times on this task start at 1817 msec), the Random All task had equivalent room for improvement but did not show a comparable practice effect.

Two replications of Experiment 1 provided evidence concerning the boundary conditions on the unitization process that appears to have occurred during the Ordered All task. In the first replication, the size of the objects was increased from 4 cm to 10 cm. This size increase resulted in a greatly reduced practice effect for the Ordered All condition. The other three conditions were hardly affected. As such, there may well be a physical size constraint on unitization, such that only segments that can be viewed clearly without saccades can be unitized. In the second replication, the U-shaped figure below the five segments was removed, with the result that the stimuli were no longer closed figures. This alteration had little effect on the response times or practice effects for the four conditions. Thus, unitization can proceed even when the unit is a line rather than a closed object.

One possible objection to Experiment 1 is that the participants' subjective segments may not correspond to the five segments that were used to construct the stimuli. This is certainly possible; in fact, efforts were made so that it would be difficult to determine the experimental segmentation of an object simply by viewing it. The current experimental logic does not require that participants would naturally decompose objects into the five segments that were used to create the objects. The five segments themselves are fairly complex forms, and are likely to be composed out of segments themselves. The experimental logic simply requires that whatever composition process is required to identify a single segment, more composition is required to make a reliably accurate response in the All task. The assumption underlying the All task is that it cannot be reliably performed using a single functional unit that the participant possessed before the start of the experiment. The task either requires the separate identification of several components (that themselves may be built from smaller components), or the construction of a single component that did not exist prior to the experiment. For the present purposes it is not necessary to determine the actual segmentations participants used to break the whole objects into components.

In sum, Experiment 1 provides suggestive evidence that unitization of a complex stimulus occurs, but only when an image-like representation can be formed for it. The results are only suggestive because direct evidence has not yet been presented that response times are faster for the conjunctively defined object than would be predicted by an analytic model that integrates evidence from separately detected components. Unitization could produce responses that are faster than predicted by such an analytic model by requiring only a single component detection. This type of evidence will be considered in Experiment 5.

Experiment 2

Experiment 2 explores boundary conditions on the unitization process suggested by Experiment 1. Specifically, Experiment 2 addresses the question of whether unitization requires stimulus components to be spatially connected. Whereas Experiment 1 manipulated unitizeability by presenting stimulus components in ordered versus random positions, Experiment 2 tests whether unitizeability is manipulated by presenting components in a contiguous versus separated fashion.

Previous research indicates that contiguity (spatial connectedness) plays a large role in perceptual parsing and organization. Palmer (1992; Palmer & Rock, 1994) has argued that one of the basic laws of perceptual organization is that stimulus parts tend to be grouped together if they are connected to one another. Bayliss and Driver (1993) found that judging the relation between two points is facilitated if they come from the same object rather than from separate objects.

Connectedness has also been found to influence concept learning more specifically. Shepp and Barrett (1991) found that children are better able to acquire a conjunctive categorization when the conjoined dimensions are connected together in the same object than when they are physically separated. Nahinsky et al. (1973) similarly found that physical proximity of stimulus parts facilitated acquiring concepts that involved several of the parts. Finally, Saiki and Hummel (in press) found that concepts that were based on the conjunction of the shape of an object and its spatial location relative to a second object were much more easily acquired when the two objects were physically connected than when they were separated. All three of these results indicate that conjunctions of stimulus parts are more efficiently used for category learning when the parts are contiguous. Given that Experiment 1 also involved the acquisition of a concept based on the conjunction of several parts, these previous studies might be interpreted as indicating that unitization should be greater for contiguous than separated components.

The prediction derived from the literature on contiguous and separated stimuli differs from the prediction made by the "imageability" hypothesis. According to this latter hypothesis, unitization can proceed as long as a single image-like representation can be formed for the set of components, and if the components are not separated by too great a distance. In fact, some previous research has indicated that contiguity is not a strict constraint on unitization. Czerwinski et al. (1992) found unitization of sets of three disconnected line segments. Although disconnected, a single image could be formed for the set of line segments because of their consistent arrangement and the small visual angle they subtended.


Participants. Seventy-six undergraduate students from Indiana University served as participants in order to fulfill a course requirement. The students were evenly split into the four between-subject conditions.

Materials. The design and construction of the "Contiguous" materials were identical to those of Experiment 1, except that the lower bowl-shape was removed so as to provide a cleaner comparison to the Separated condition items. The "Separated" materials were constructed as in Experiment 1, except that the five segments within a stimulus were disconnected, and arranged vertically, as shown in Figure 4. The overall extent of the stimuli was equated in the separated and contiguous conditions: the distance from the left tip of the left-most segment to the right tip of the right-most segment of stimuli in the contiguous condition was equal to the distance from the top of the top-most doodle to the bottom of the bottom-most doodle in the separated condition.

Insert Figure 4 about here

Procedure. The procedure closely followed that used in Experiment 1, with the following exceptions. The participants were evenly divided into four groups created by factorially combining the two levels of stimulus appearance (Contiguous or Separated) with two levels of task type (All or One). In all cases, the stimuli were configured as they were in the Ordered condition of Experiment 1. That is, the Item ABCDE always had the same components A, B, C, D, and E in the same relative positions on each trial.


The results of primary interest concern the correct response times for categorizing the Item ABCDE into Category 1. These response times are shown in Figure 5. A 2 (Separated vs Contiguous display) X 2 (All vs One component necessary) X 4 (blocks) ANOVA was conducted on these data. A main effect of display was found, F(1, 69) = 7.2, mse = 14.4, p < .01, such that Contiguous displays were more quickly categorized than Separated displays. A main effect of task was found, F(1, 69) = 18.2, mse = 12.9, p < .01, with the One condition significantly faster than the All condition. Finally, a main effect of blocks was found, F(3, 207) = 11.9, mse = 168.2, p < .01, indicating that practice increased speed of categorization.

Insert Figure 5 about here

There was also a significant three-way interaction among these three variables, F(3, 207) = 3.3, mse = 9.5, p < .05. Although not obviously apparent in Figure 5, this interaction was due to a particularly strong practice effect in the All, Contiguous condition. The two All conditions became more separated with practice than did the two One conditions. The magnitude of this three-way interaction was several times smaller than it was in Experiment 1. If we restrict our attention only to the two All conditions, a significant interaction between blocks and display was found, F(3, 71) = 3.9, mse = 11.2, p < .05. This interaction indicates that although the two All conditions appear roughly parallel after the first block of practice, the contiguous condition showed greater improvement with practice than did the separated condition.

Although cross-experiment comparisons are not strictly appropriate, it is notable that the practice effect for the Separated displays of Experiment 2 was much more pronounced than the practice effect for the Random displays of Experiment 1, F(3, 207) = 6.8, mse = 15.9, p < .01. As such, it appears that randomizing the order of components within a stimulus interfered with learning more than did separating the components but retaining their positions.

In the All condition, response times for the other items were faster than for "ABCDE." Overall, collapsing across the contiguous and separated groups, the Category 2 items were responded to in 1269 msec, which is significantly different from the 1387 msec required to respond to "ABCDE," paired T (75) = 8.2, p < .01. Similarly, although "ABCDE" occurred more frequently than "VWXYZ," the average correct response time for categorizing the latter object, 1196 msec, was significantly faster, paired T (75) = 6.4, p < .01. Consistent with the unitization hypothesis, the speed advantage of Category 2 items over "ABCDE" in the All condition decreased over blocks of practice, as shown by a significant Item X Blocks interaction, F(3, 103) = 8.0, mse = 11.8, p < .01. That is, over practice, "ABCDE" becomes relatively quickly categorized when compared to the other items.

As with Experiment 1, the response time effects were mirrored by error analyses, although the latter are generally not significant because of the overall high level of accuracy. In no case was there any indication of a speed-accuracy tradeoff.


The results indicated an intermediate degree of unitization for displays from the All condition that had separated components. On the one hand, the contiguous condition showed greater improvement than did the separated condition. The effect size of this interaction was not large, but was statistically reliable. On the other hand, the separated condition showed much more improvement with practice than did the random order condition from Experiment 1. The clearest result from Experiment 2 is that physically separating components that must be integrated for a conjunctive judgment does not interfere with the unitization process to nearly as great an extent as does randomly positioning these components.

The strong practice effects that were found in the separated condition support the notion that practice effects in a conjunctive task predominantly depend on establishing a consistent image-like representation for the unit to be conjoined. Consistent with Czerwinski et al.'s (1992) results, the current results suggest the occurrence of unitization despite the lack of physical contiguity. Unitization seems to strongly depend on the consistency with which a conjunctively defined item is rendered, at least in conditions where the physical separation between the ends of the stimuli is equated across the contiguous and separated conditions.

The separated condition is helpful in assessing explanations of the observed practice effects that do not posit unitization. One such explanation is that participants learn, with time, to attend junctions between components. For example, the small junction between components "A" and "B" provides information about both components. Given that the five components join seamlessly in the contiguous condition, it can be plausibly argued that participants are not integrating evidence from precisely the five components as defined by the experimenter, but are rather attending to different regions of the curve that segment the curve in a different fashion. Participants could learn to correctly classify "ABCDE" by identifying and integrating three such regions: the junction between "A" and "B," the junction between "C" and "D," and the junction between "D" and "E." Performance on the "ABCDE" stimulus may improve gradually because participants identify useful junctions that provide as much information as two experimenter-defined components.

The separated condition places constraints on the importance of this type of learning. Given that the separated condition does not create diagnostic junctions between components, if identifying junctions were the main cause of practice effects, then we would expect large differences between the separated and contiguous conditions. Junction identification may still be occurring, and may explain the small but significant difference between the two display conditions. Still, robust practice effects do not depend on diagnostic junctions being found. As such, the results support unitization of single coherent images rather than the analytic integration of optimally diagnostic stimulus regions as the primary explanation for the strong practice effects that were observed.

Experiment 3

In Experiment 2, the cohesiveness of the visual components to be integrated was manipulated by physically concatenating or separating them. In Experiment 3 a second method for potentially manipulating the ease of integrating components was developed. In this experiment, reliable categorization required attending to information from two experimenter-defined components. These components were always found within a five-segment curved contour. For one group, the two diagnostic components were next to and touching each other. For the other group, the two components were separated by a third component. Categorization times and practice effects were thus compared between situations where components that must be attended were together or apart.

The Together and Apart conditions in Experiment 3 controlled for different stimulus aspects than were controlled in Experiment 2. Experiment 2's separated and contiguous groups were equated for entire stimulus length, and maximal length between neighboring components. Experiment 3's groups were equated for curve contiguity but not length. As such, Experiment 3 tested whether the physical separation of components influences their unitizeability even when the components were joined together within a physically contiguous object. Two components were separated but part of a contiguous object by inserting a nondiagnostic component between them.

If the Together and Apart conditions differ, there are reasons to think that they may differ either by an absolute amount or that the degree of difference will be modulated by practice. The Together condition may show a persistent advantage across blocks of practice given that the Apart condition requires integration of information over a wider region, and may require greater eye movements. However, the Apart condition may also yield larger practice effects than the Together condition. One way of accomplishing the Apart task is to look for the conjunction of three contiguous segments - the two diagnostic components and the nondiagnostic component that lies between them. If this occurs, then greater improvement may be found in this task than in the Together task, under the assumption that unitizing a larger number of components requires a greater amount of practice. Of course, these hypotheses are not mutually exclusive; the Together task may continue to show an advantage over the Apart condition despite a greater influence of practice on the latter task.


Participants. Forty-eight undergraduate students from Indiana University served as participants in order to fulfill a course requirement.

Materials. The design and construction of the materials were highly similar to those of the previous experiments. Like Experiment 1, but unlike Experiment 2, the five components that comprised a curve were joined together by a U-shape to create a closed figure. Examples of the two conditions, Together and Apart, are shown in Figure 6. For both conditions, the object belonging to Category 1 can be expressed as "ABCDE," signifying that five components were in the stimulus, and they were constrained so that each was unique, chosen randomly from a set of 10 components. Unlike the previous experiments, only this one item belonged in Category 1; there was no "VWXYZ" item in Category 1. Two objects belonged in Category 2, and each of these objects differed from "ABCDE" along a single component. The components along which the Category 1 item differed from the Category 2 items can be called "diagnostic" because they distinguished between the categories. In the Together condition, the two diagnostic components were adjacent. In the Apart condition, the two diagnostic components were separated by a third nondiagnostic component that lay between them.

Figure 6 shows an example of the category items from the Together and Apart conditions. As shown in this figure, in the Together condition, components "D" and "E" must be attended (or the junction between them) in order to reliably categorize "ABCDE" into Category 1. In the Apart condition, components "C" and "E" must be attended.

Insert Figure 6 about here

Figure 6 shows only one possible set of diagnostic components. Each participant in the Together condition was randomly assigned one of the four possible sets of adjacent segments to be the diagnostic segments, and each participant in the Apart condition was randomly assigned one of the three sets of segments that were separated by one other segment: A and C, B and D, or C and E.

Procedure. The procedure closely followed the procedure used in the previous experiments. In all cases, the stimuli were configured as they were in the Ordered condition of Experiment 1. That is, the Item "ABCDE" always had the same components A, B, C, D, and E in the same relative positions on each trial. At the beginning of the experiment, participants were shown the objects that belonged to Category 1 and Category 2. Participants were given 320 trials in all. On each trial, one of the three objects was randomly selected (one in Category 1, and two in Category 2), and displayed in a random location on the screen. Participants' speed and accuracy at categorizing the object were recorded. The experiment required about 55 minutes.


As with Experiments 1 and 2, response time rather than accuracy was the more sensitive measure of participants' performance, and analyses focused on responses to the conjunctively defined object that belonged to Category 1. Figure 7 shows response times for correct categorizations of "ABCDE." These results indicated a significant main effect of display type, F(1, 47) = 18.4, mse = 14.0, p < .01. The average response time in the Apart condition was 991 msec, compared to 757 msec in the Together condition.

A significant Block X Display Type interaction was also found, F(3, 141) = 8.3, mse = 10.4, p < .01, and is depicted in Figure 7. As shown in this figure, the Apart condition yielded a greater practice effect than did the Together condition. During the first block, the response time difference between the Apart and Together conditions was 290 msec. This difference was reduced to 165 msec by the final block.

Insert Figure 7 about here

Response times were faster when the left-most or right-most component was diagnostic than when only middle components were diagnostic, F(1, 47) = 4.5, mse = 14.3, p < .05. If anything, this effect should have benefited the Apart condition more than the Together condition because 2 out of 3 of the configurations in the Apart condition had a diagnostic extreme component, whereas only 2 out of 4 of the Together configurations did.
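The configuration counts used in this comparison can be checked by enumerating the possible pairs of diagnostic components. The following Python sketch is illustrative only; the function and variable names are hypothetical and are not part of the original materials.

```python
from itertools import combinations

COMPONENTS = "ABCDE"  # five component positions, left to right


def diagnostic_pairs(gap):
    """Return all pairs of components whose positions differ by exactly `gap`."""
    return [
        (COMPONENTS[i], COMPONENTS[j])
        for i, j in combinations(range(len(COMPONENTS)), 2)
        if j - i == gap
    ]


def has_extreme(pair):
    """True if the pair includes the left-most or right-most component."""
    return COMPONENTS[0] in pair or COMPONENTS[-1] in pair


together = diagnostic_pairs(gap=1)  # adjacent components
apart = diagnostic_pairs(gap=2)     # separated by one intervening component

together_extreme = sum(has_extreme(p) for p in together)
apart_extreme = sum(has_extreme(p) for p in apart)

print(len(together), together_extreme)
print(len(apart), apart_extreme)
```

Under these assumptions, the enumeration yields four Together configurations and three Apart configurations, with two configurations in each condition containing an extreme component.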


The results indicated both that a greater practice effect was found for the Apart than Together condition, and that this practice effect was not sufficient to make these tasks equated by the end of the experiment. That is, there was no evidence in the data that the practice effects were sufficiently large to eliminate the difference between display types, at least not after an hour of training.

The first result, that practice has a stronger influence on the Apart than Together task, is consistent with the hypothesis that participants approach the Apart task by developing units that span three components. While the Together task can be accomplished by conjoining two contiguous components, the Apart task must be accomplished either by integrating evidence from two separated components, by creating a single unit that is not contiguous, or by creating a single contiguous unit that spans three contiguous components. The first possibility for the Apart task is eliminated by the pronounced practice effects that were observed. These practice effects are on the same order as those produced in Experiments 1 and 2 when unitization was possible, and are much larger than those found when responses required the integration of separate pieces of evidence. The second account for the Apart task is at odds with the results from Experiment 2. Experiment 2 did show that units could be formed from disconnected segments; however, it also showed that the disconnected condition benefited less from practice than did the contiguous condition. Presumably, the attenuated practice effect was due to less coherent units being employed in the disconnected than contiguous condition. In contrast to these results, in Experiment 3, the Apart condition showed greater practice effects than did the Together condition. Thus, participants in Experiment 3 were probably not treating the Apart condition by unitizing disconnected segments; if they had, weaker, not stronger, practice effects would have been predicted for this condition.

As such, the remaining hypothesis, that participants come to treat the Apart task by creating a three-component unit, seems to capture the current results and their relation to previous experiments most parsimoniously. By this account, the larger practice effect in the Apart than Together condition is due to the extended practice required to fuse a greater number of components together. The results are, in one sense, opposite to those produced in Experiment 2. In Experiment 2, the easier Contiguous task shows the greatest improvement with practice, while in Experiment 3, the easier Together task improves least. These contradictory results are reconciled under the hypothesis that practice effects reflect the course along which unitization benefits performance. Unitization can facilitate Contiguous more than Separated stimuli (Experiment 2) because somewhat stronger units can be formed if the components create a contiguous whole. The course of facilitation is more protracted for Apart than Together stimuli (Experiment 3) because units take longer to be formed in the Apart condition. At the very least, the comparison of Experiments 2 and 3 shows that differential practice effects are not simply due to ceiling/floor effects. Easier tasks may be either more or less strongly affected by practice.

Our account of Experiment 3 implies that people have a bias to form coherent, contiguous units. Experiment 2 shows that units can be formed when components are not contiguous, but Experiment 3 suggests that when components can be made contiguous by including extraneous, nondiagnostic information, then participants do so. This conclusion from Experiment 3 is consistent with work on attention suggesting that there is a strong bias for attention to be allocated to spatially contiguous regions (LaBerge & Brown, 1989; Eriksen & Murphy, 1987). It is difficult or impossible for people to attend to two separate regions without also attending the region in between.

Although there was a trend for the Apart and Together conditions to converge, it appears that even at asymptote (not reached by the end of the experiment), there would still likely be a difference between the conditions. In fact, after the first block, the two conditions show approximately parallel influences due to practice. Consequently, the experiment does not imply that identifying a three-component unit will ever be as fast as is identifying a two-component unit. It might be argued that if a functional unit is really created, then it should not matter how many actual components are involved. This strong version of a unitization hypothesis is probably false. Just as all primitive features may not be equally quickly processed, constructed features (units) may also differ in how quickly they are processed, and it would be surprising if the actual physical space subtended by a unit did not affect its processing time.

Experiment 4

Experiment 4 provides a parametric investigation of the relation between response time and the number of components that must be integrated to make a categorization. In essence, the intermediary cases between Experiment 1's All and One task were included, wherein participants must integrate information from 2, 3, or 4 out of the 5 components present.

The first goal of the experiment was to provide boundary conditions on the unitization process. Given components of a particular level of complexity, how many components can be unitized, and how long does it take to effectively unitize them? Although any results obtained will depend critically on the level of complexity of the primitive components, useful information can be obtained by observing whether linear changes in the number of segments to be combined give rise to linear changes in response time and practice effects. An upper limit in the number of segments that can be unitized should be revealed by a flattened practice effect. That is, if units that combine N segments cannot be created because they surpass a capacity limitation, then the practice effect across blocks should resemble that found in conditions where unitization is difficult (e.g., Experiment 1, random order).

The second goal of the experiment is to explore the hypothesis, recruited to explain the difference between Together and Apart conditions in Experiment 3, that the time course of improvement for a conjunctive categorization task is positively related to the number of components that must be conjoined. By this hypothesis, the fewer components that must be attended, the earlier and shallower practice effects should be.


Participants. Eighty-five undergraduate students from Indiana University served as participants in order to fulfill a course requirement, and were equally divided into the five levels of the between-participants factor "number of components."

Materials. The materials were similar to those used in previous experiments. All stimuli were composed out of five segments. Five conditions were included, in which the number of components that were required to make a reliably accurate Category 1 response were 1, 2, 3, 4, or 5. Thus, the "1" and "5" conditions were similar to the ordered One and All conditions from Experiment 1. The single item that belonged to Category 1 can be represented as "ABCDE" for all five conditions. For the 1 condition, only a single item belonged in Category 2: "ABCDZ." For the 2 condition, two items belonged in Category 2: "ABCDZ" and "ABCYE." For the 3 condition, three items belonged in Category 2: "ABCDZ," "ABCYE," and "ABXDE." For the 4 condition, four items belonged in Category 2: "ABCDZ," "ABCYE," "ABWDE," and "AVCDE." For the 5 condition, all five items shown under Category 2 in Figure 1 were used. Thus, unlike previous experiments, the locations of the diagnostic segments were not randomized. Whenever a segment was diagnostic for Category 1, all of the segments to the right of the segment were also diagnostic. In addition, the diagnostic segments always formed a contiguous curve, without any nondiagnostic intervening segments.
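The construction rule for the Category 2 items can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the original stimulus-generation code: the function name is hypothetical, and the replacement labels in REPLACEMENTS are placeholders, so the letters produced need not match the labels used in the text or in Figure 1.

```python
TARGET = "ABCDE"  # the single Category 1 item

# Placeholder replacement labels, one per position (an assumption for illustration).
REPLACEMENTS = "VWXYZ"


def category2_items(n_diagnostic):
    """Build Category 2 items for the condition with `n_diagnostic` diagnostic components.

    Each item differs from TARGET in exactly one component, and the altered
    positions are the right-most `n_diagnostic` positions, so the diagnostic
    components always form a contiguous run ending at the right edge.
    """
    last = len(TARGET) - 1
    items = []
    for pos in range(last, last - n_diagnostic, -1):
        items.append(TARGET[:pos] + REPLACEMENTS[pos] + TARGET[pos + 1:])
    return items


for n in range(1, 6):
    print(n, category2_items(n))
```

Because each Category 2 item matches "ABCDE" on all but one of the diagnostic positions, a reliably accurate Category 1 response requires checking every one of the n diagnostic components.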

Procedure. As before, the participants' task was to categorize objects into Category 1 or Category 2 as quickly as possible, while maintaining category accuracy of at least 95%. Prior to the experimental trials, participants were presented with the objects that belonged to the two categories and their category memberships. Participants were presented with 400 trials in all, with equal numbers of Category 1 and Category 2 trials randomly intermixed.


Once again, response accuracies generally mirrored response times, but were less statistically sensitive to differences between conditions because of their restricted variability. For the primary analyses, only correct response times for the single Category 1 item were included. The response times for the five conditions, broken down by block, are shown in Figure 8. A strong main effect of condition was found, F(4, 81) = 132, mse = 14.2, p < .01. Post-hoc tests revealed that all five conditions differed significantly from each other, p < .01. One particularly surprising result was that the 5 condition yielded significantly faster response times than did the 4 condition. Although this effect was not predicted, one explanation for it may be that participants in the 5 condition switched to the strategy of looking for the whole item "ABCDE" sooner than did participants in the 4 condition; from the instruction page itself, it would have been evident to these participants that all components were relevant for the categorization, and consequently they may not have initially attempted an analytic component-by-component integration. In addition, as Experiment 3 showed, participants were faster to respond when diagnostic components were positioned on the left- or right-most edges of the stimulus. Given that the left-most edge was diagnostic for the 5 condition but not the 4 condition, the diagnosticity of this highly salient component may have served as an additional impetus to create a unit that spanned the entire stimulus.

Insert Figure 8 about here

A strong main effect of block was also found, F(3, 243) = 87, mse = 9.5, p < .01, with all four of the blocks differing significantly from each other according to post-hoc tests. In addition to the two main effects, a significant interaction between block and condition was also found, F(12, 243) = 8.3, mse = 16.3, p < .01. As shown in Figure 8, this interaction can generally be characterized by a convergence of the five conditions over practice. There is a 946 msec difference between the conditions during the first block, and this reduces to a 490 msec difference at the final block. This difference cannot be explained away by using a ratio instead of an interval scale for evaluating practice effects. Over the course of practice, the 4 condition loses 38% of its initial value, while the 1 condition loses only 26% of its value.
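The ratio-scale comparison can be made concrete with a small calculation. The Python sketch below is illustrative only: the function name is hypothetical, and the response times in the usage lines are placeholder values chosen to reproduce the reported percentages, not the observed block means.

```python
def proportional_loss(first_block_rt, final_block_rt):
    """Return the proportion of the first-block response time lost by the final block."""
    return (first_block_rt - final_block_rt) / first_block_rt


# Placeholder response times in msec (assumed values, not data from the experiment).
slow_task = proportional_loss(2000, 1240)
fast_task = proportional_loss(1000, 740)

print(round(slow_task, 2), round(fast_task, 2))
```

If the slower task loses a larger proportion of its initial response time, as well as a larger absolute amount, then the convergence of conditions is not merely a by-product of measuring practice effects on an interval scale.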

Tasks that required the integration of a larger number of components showed more prolonged practice effects than those that required fewer components. The 1 condition showed little practice effect after the first block. The 2 condition continued to show a practice effect after the first block, but there was a distinct elbow in the curve after the first block. The elbow in the 4 and 5 conditions was much more attenuated. For example, the response time differences between Blocks 1 and 2 were roughly comparable for the 2 and 4 conditions, at 223 and 247 msec respectively. However, the difference was much larger for the 4 condition (207 msec) than the 2 condition (64 msec) going from the second to third block. This interaction also speaks against the claim that the differential practice effects were simply due to slower tasks having farther to fall in terms of response time. In fact, when the slower tasks have the farthest to fall, at the beginning, they showed comparable falling rates to the faster tasks. The differences in falling rates between conditions increased with ongoing practice. The dependent variable "falling rate" was derived by finding response time differences between adjacent blocks. Submitting falling rate to a block X condition ANOVA revealed a significant interaction, F(8, 162) = 7.1, mse = 6.2, p < .05, indicating that falling rate for many-component tasks was particularly large, relative to few-component tasks, as blocks increased.
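The derivation of the "falling rate" measure can be sketched in Python as follows. This is a minimal illustration with a hypothetical function name. The starting response times in the usage lines are placeholders; only the block-to-block differences (223 and 64 msec for the 2 condition, 247 and 207 msec for the 4 condition) are taken from the text.

```python
def falling_rates(block_means):
    """Return response time drops between adjacent blocks (earlier minus later)."""
    return [earlier - later for earlier, later in zip(block_means, block_means[1:])]


# Block means in msec for Blocks 1-3. The first-block values are assumed
# placeholders; later values are set so that the drops match the reported ones.
two_condition = [1000, 777, 713]
four_condition = [1500, 1253, 1046]

print(falling_rates(two_condition))
print(falling_rates(four_condition))
```

With four blocks per participant, this procedure gives three falling rates per condition, which is the block factor entered into the block X condition analysis.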


Experiment 4 confirmed the hypothesis that was introduced to explain the results of Experiment 3 -- as the number of segments that must be integrated increases, so does the size of the practice effect. This was tested parametrically in this experiment, and a nearly monotonic relation was found between the magnitude of the practice effect and the number of segments to be conjoined. The one unpredicted and surprising violation of this generalization was that the four segment task showed more facilitation with time, and longer initial response times, than did the five segment task. A post-hoc explanation for this violation may be that when a highly salient (e.g. see the results from Experiment 3) end segment is diagnostic for the conjunctive response, and all of the other segments are also diagnostic, participants have an early disposition to accomplish the task by matching each stimulus to their "ABCDE" image. This strategy, initiated early, may be more effective than a component-by-component integration strategy, which participants in the four-segment condition are more likely to adopt.

In addition to showing generally larger practice effects as the number of conjoined components increased, Experiment 4 also showed more protracted practice effects as the number of components increased. As such, learning is more gradual as the unitization task involves more components. This suggests one type of boundary condition/bottleneck on the unitization process. How quickly unitization can proceed depends on how much there is to unitize. This constraint argues against a radical version of the hypothesis that unitization simply involves creating a photograph-like image of the unit, and matching incoming stimuli to this image. Photographs do not take longer to develop if they involve more complex scenes, but the unitization process observed in Experiment 4 does depend on the complexity of the unit to be formed. A unit, once formed, may have image-like properties, but during its formation, it probably requires componential processing. The eventual image-like property that is observed in the units is the relative independence of categorization time on complexity, although complexity always exerts some influence. However, this end-state property seems to depend on a learning process that is quite sensitive to stimulus complexity.

There are strong boundary conditions in terms of how much unitization can occur within a given period of time. Given components of the complexity used in Experiment 4, very roughly, one component can be unitized in 100 trials (the 1 condition), two components can be unitized in about 200 trials (the 2 condition), and unitization of four or five components requires about 400 trials. These numbers are very rough, given the difficulty in finding a singular elbow in practice effect curves, and the lack of a definitive criterion for when unitization has started/finished. This boundary condition on the number of components that can be integrated per trial stands in contrast to the lack of constraint that was observed in terms of the number of components that can be unitized. A failure of unitization would be shown by decreasing practice effects at some point, as number of components increased. Reduced practice effects were observed earlier when unitization was disrupted by randomly ordering components. The nearly monotonic trend for practice effects to increase as number of components increased suggests that if there is a limit to how many components can be unitized, it is greater than five. This result indicates an impressive unitization proficiency, given the complex nature of the components themselves.

Experiment 5A

The purpose of Experiment 5 was to compare a unitization account of categorization to specific analytic models. Thus far, large and gradual improvements in response times have been taken as evidence of unitization. This is also the type of evidence that has been generally used to argue for unitization in attention research (Czerwinski et al., 1992; LaBerge & Brown, 1989). However, it is possible to develop more specific analytic models of categorization, and test whether violations of these models are found. Thus, the two types of models that will be compared in Experiment 5 are a unitization model that assumes that a single functional unit is created for the set of five components composing a stimulus, and an analytic model that assumes that categorization decisions are based on the integration of five separate judgments about the components. The analytic model differs from the unitization account in that it does not assume the construction of functionally new features to subserve categorization. The analytic account can certainly predict improvements at responding to a conjunctively defined object, but these improvements would have to be due to improvements in integrating, identifying, or registering components.

The general strategy in testing the analytic model is to charitably interpret the analytic model, giving it several information processing advantages. Given these advantages, if categorization responses are still faster than predicted by the analytic model, then the analytic model would be discredited. The advantages that will be given to the analytic model are unlimited capacity processing and complete parallelism (Townsend, 1990). Unlimited capacity means that the time required to identify one component is not slowed by the simultaneous requirement to look for another component. Complete parallelism means that any number of components can be identified simultaneously. Both of these assumptions will serve to facilitate processing of the analytic model. Thus, even if they are unrealistic, the assumptions will err on the side of making the analytic model predict conjunctive categorizations that are too fast rather than too slow.

Intuition might tell us that an analytic model endowed with parallel, unlimited capacity processing would predict equal response times in the One and All tasks after practice. Surprisingly this is not the case, by the following logic: A) there is natural variability in response times even in the One task, and B) the All task is an intrinsically conjunctive task. The prediction from the analytic model can be obtained by deriving the empirically obtained response time distribution for the One task. Then, five randomly sampled times from this distribution can be selected, and the maximum of these times will be determined. The maximum, rather than the mean, is used because a response in the All task cannot be made until all five curves are detected in the object. The predicted response time distribution for the All task can be established by taking multiple random samples of five response times. Thus, we have a formal way of predicting the response time distribution in the conjunctive task, based on the response time distribution in the task that requires identification of only a single component. If this is done, then despite the charitable processing assumptions, this analytic model predicts response times in the conjunctive task to be slower than response times in the simple task, for the simple reason that the maximum of several samples of a random variable will be larger, on average, than the average of the samples.
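The sampling procedure just described can be illustrated with a minimal sketch. The One-task response times below are hypothetical values generated for illustration, not data from the experiment, and the function name and parameters are assumptions introduced here.

```python
import random

def predict_all_task_rts(one_task_rts, n_components=5, n_samples=10000, seed=0):
    """Simulate the analytic model: each predicted All-task time is the
    maximum of n_components times drawn independently (with replacement)
    from the empirical One-task response times."""
    rng = random.Random(seed)
    return [
        max(rng.choice(one_task_rts) for _ in range(n_components))
        for _ in range(n_samples)
    ]

# Hypothetical One-task response times in msec (illustrative values only).
data_rng = random.Random(1)
one_task_rts = [data_rng.gauss(400, 50) for _ in range(1000)]

predicted = predict_all_task_rts(one_task_rts)
mean_one = sum(one_task_rts) / len(one_task_rts)
mean_predicted = sum(predicted) / len(predicted)
print(round(mean_one), round(mean_predicted))
```

Because each predicted time is a maximum of five draws, the mean of the simulated All-task times exceeds the mean of the One-task times even though every component is processed in parallel and without capacity limits.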

Although it would be possible to derive the analytic model's prediction for the All task by randomly sampling sets of five times from the One task's distribution (as described above), fortunately there is an easier and more precise way to derive predictions. The analytic model's prediction for the All task can be found by computing the cumulative response time distribution for the One task, and then raising each value on this curve to the fifth power. The resulting curve represents the analytic model's predictions for the All task, assuming that there are five components to identify. An example can provide the intuition behind this logic. If the probability of a response time being less than 300 msec in the One task is .25, then the probability of five independently sampled response times from the same distribution all being less than 300 msec is .00098 (.25^5). The conjunctive All task requires that all five components be identified before categorizing the Item "ABCDE" into Category 1. As such, the analytic model predicts that the reliably correct categorization of "ABCDE" should be made within 300 msec less than 1 time out of a thousand.
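The fifth-power rule can be checked with a short sketch. The four response times are hypothetical and chosen only so that one quarter of them fall at or below 300 msec; the function names are assumptions introduced for illustration.

```python
def empirical_cdf(rts, t):
    """Proportion of response times at or below t."""
    return sum(1 for rt in rts if rt <= t) / len(rts)

def analytic_all_cdf(one_task_rts, t, n_components=5):
    """Analytic-model prediction for the All task: the One-task cumulative
    probability raised to the number of components."""
    return empirical_cdf(one_task_rts, t) ** n_components

# Hypothetical One-task times (msec); one of the four is at or below 300.
one_task_rts = [280, 350, 420, 510]

p_one = empirical_cdf(one_task_rts, 300)     # 0.25
p_all = analytic_all_cdf(one_task_rts, 300)  # 0.25 ** 5
print(p_one, p_all)
```

Evaluating `analytic_all_cdf` at every time point traces out the full predicted cumulative distribution without any random sampling.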

The primary method of testing the analytic model will be to compare its prediction for the All task's RT distribution (derived from the One task) to the empirically observed distribution in the All task. A non-parametric Kolmogorov-Smirnov test for equal distributions can be used to see whether the analytic model is significantly violated.

In order to potentially find violations of the charitably interpreted analytic model, far more practice will be needed than was used in the previous experiments. The formal analytic model was tested on the results from the previous experiments, and no violations in the direction of unitization were found. In fact, the analytic model predicted far faster response times for the All task than were obtained. However, this is not surprising, given the liberally interpreted (allowing for complete parallelism and unlimited capacity) nature of the analytic model tested. To provide a situation in which the RT distributions could possibly be faster than predicted by the analytic model, participants in Experiment 5 were given extended practice over more than 15 hours and 8000 trials.


Participants. Four undergraduate research assistants from Indiana University were used as participants in this experiment. All four participants were naive with regard to the hypothesis being tested. The research assistants ran themselves in the experiments.

Materials. The materials were similar to those used in previous experiments. For the All tasks, one category contained "ABCDE" and the other category contained all five of the one-component distortions from "ABCDE." In the One task, one category contained "ABCDE" and the other category contained only one of the five one-component distortions from "ABCDE." Given the small number of participants used in the experiment, a different critical segment was assigned to each participant in the One task. As with the previous experiments, the actual instantiation of the components was randomized, under the constraint that none of the components used in the All task were used in the One task. To meet this constraint six new components were created. Once a random assignment of physical components to experimental letters was created, the assignment was retained for all of the sessions that a participant completed.

Procedure. The basic procedure from Experiment 1 was used. Participants categorized objects as quickly as possible. Unlike previous experiments, the type of task (One vs All) was treated as a within-subject variable. Each session consisted of 500 trials, and lasted approximately one hour. Participants completed two sessions in one day, separated by a break of at least 10 minutes. The two sessions within one day were always of the same task type. Participants were instructed to alternate task types on successive days. Two of the participants began with the One task, and the other two began with the All task.

Three of the four participants completed 8 2-hour days of experimentation, and the fourth completed 10 days. The interval between days of experimentation varied, but the entire experiment was completed within 18 days for all participants.

Within each session, participants saw an equal number of Category 1 and Category 2 trials. In the All task, on trials where a Category 2 item was to be selected, it was randomly selected from one of the five alternatives. Participants received trial-by-trial and block-by-block feedback on their categorization accuracy and response time. Participants were told to respond as quickly as possible while maintaining an error rate less than 5%.


Unless stated otherwise, all of the results include only correct Category 1 trials because these are the only trials that require a five-way conjunction in the All task. Figure 9 shows each of the four participants' average response times across the sessions. The results from all four participants showed large practice effects, and a significant interaction between sessions and type of task, F(3, 2997) > 25, p < .01. Replicating the previous experiments, the All task showed much larger practice effects on RT than did the One task. Two of the participants showed no systematic practice effect on the One task at all.

Insert Figure 9 about here

On the final sessions of training, a statistically reliable advantage for the One task over the All task was still found for all four participants, F(1,999)>4.5, p < .05. However, despite the faster average response times in the One task than in the All task, the analytic model may still be violated. To test the analytic model for the All task, the cumulative response time distributions for the final sessions of the All and One task were calculated for each participant. These cumulative distributions are shown in Figure 10. Given assumptions of unlimited capacity and complete parallelism, the analytic model's prediction for the cumulative distribution of All task times is equal to the One task's cumulative distribution raised to the fifth power. These predictions are also shown in Figure 10. Inspection of Figure 10 reveals that for two of the participants (A.H. and N.T.), the actual cumulative distribution for the All task was shifted to the left of the analytic model's prediction. This dominance relation indicates a violation of the analytic model in that the All task was faster than predicted by the analytic model. For the other two participants (C.H. and D.M.), the cumulative distributions for the All task and the analytic model cross at an intermediary response time. For these two participants, responses in the All task were faster than predicted by the analytic model, but only if attention is restricted to the fastest set of response times. This range-dependent violation of the analytic model could potentially be accounted for by fast guesses in the All task, but this account is made less plausible by the overall high categorization accuracies. D.M., N.T., C.H., and A.H. attained 96.4%, 94.6%, 97.5%, and 96.3% accuracy rates respectively on the final sessions of the experiment.

Insert Figure 10 about here

The results from all four participants indicated faster than predicted response times if we restrict our attention to response times that are faster than average. To test whether this violation is statistically reliable, a Kolmogorov-Smirnov test was conducted (Massey, 1951). This test finds the maximum discrepancy, D, between two cumulative distributions. The D statistic can be converted to an approximate chi-square variable by calculating

$$\chi^2 = 4D^2 \frac{N_1 N_2}{N_1 + N_2}$$

with 2 degrees of freedom, where N_i is the sample size of distribution i. For the two participants whose All task dominated the analytic prediction derived from the One task, the chi-squared values were 10.2 and 13.8 for A.H. and N.T. respectively, p < .01. For the other two participants, the Kolmogorov-Smirnov test was range restricted to cumulative probabilities less than 0.5, as described by Birnbaum and Lientz (1972). With this restriction in place, the test statistic remained significant at the p < .01 level for D.M. and at the p = .08 level for C.H. As such, the fastest half of All task response times were significantly faster than is predicted by the analytic model for three participants, and marginally faster for the fourth participant.
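The conversion from D to a chi-square value can be sketched as follows. This sketch assumes the standard large-sample approximation for the one-sided two-sample test, chi-square = 4 D^2 N1 N2 / (N1 + N2) with 2 degrees of freedom, and it treats the analytic prediction as if it were a second sample; the response times and function names are hypothetical.

```python
import math

def ecdf(sample, t):
    """Proportion of the sample at or below t."""
    return sum(1 for x in sample if x <= t) / len(sample)

def one_sided_d(faster, slower):
    """Largest amount by which the cumulative distribution of `faster`
    lies above that of `slower` (a one-sided discrepancy)."""
    points = sorted(set(faster) | set(slower))
    return max(ecdf(faster, t) - ecdf(slower, t) for t in points)

def ks_chi_square(d, n1, n2):
    """Assumed approximation: 4 * D^2 * N1 * N2 / (N1 + N2), with 2 df."""
    return 4 * d ** 2 * (n1 * n2) / (n1 + n2)

def chi_square_p_2df(chi2):
    """Upper-tail probability of a chi-square variable with 2 df."""
    return math.exp(-chi2 / 2)

# Hypothetical response times (msec), for illustration only.
observed_all = [410, 430, 450, 470]
predicted_all = [480, 500, 520, 540]

d = one_sided_d(observed_all, predicted_all)
chi2 = ks_chi_square(d, len(observed_all), len(predicted_all))
p = chi_square_p_2df(chi2)
print(d, chi2, round(p, 4))
```

With two degrees of freedom the chi-square upper tail has the closed form exp(-chi-square / 2), so no statistical library is needed for the p value.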

Comparisons of the All task's response time distribution to the analytic model were also made for the earlier sessions. For the first four sessions (two All task sessions and two One task sessions), the analytic model predicted far faster response times than were observed in the All task distribution for all four subjects, as measured by a Kolmogorov-Smirnov test. The earliest violation of the analytic model was found for D.M. on the third session of the All and One tasks. For all of the other participants, the only reliable violations were found starting on the fourth sessions.


Despite the charitably interpreted analytic model, responses in the All task were reliably faster than predicted by the analytic model in certain circumstances. In particular, for the fastest half of the response times in the final sessions, the All task's cumulative response time distribution was reliably shifted to the left of the analytic model based on the corresponding One task's distribution. The two conditional restrictions will be discussed separately.

First, why were the violations of the analytic model restricted to, or at least maximized at, the fast response deadlines? One likely possibility is that a range of strategies was used for placing "ABCDE" into Category 1 in the All task. On some trials, an analytic strategy of combining evidence from separately detected components might have been used. On other trials, participants may have detected a single constructed unit. On trials where a participant used the analytic strategy, the charitably interpreted analytic model would be expected to underestimate observed response times, given the implausibility of pure parallel, unlimited capacity processing. However, on trials where participants used the single constructed unit to categorize "ABCDE," violations of the analytic model are predicted. Participants would be expected to respond as soon as either the analytic or unit-based process produced a categorization (Logan, 1988). On average, the unit-based trials will be faster than the analytic trials. That is, if a participant successfully uses a single unit to categorize "ABCDE," then they will tend to do so quickly. If they cannot use this route, then their response time will tend to be slower. Thus, if the fast and slow response times tend to be based on single units and analytic integration respectively, then we would predict violations of the analytic model to be limited to, or more pronounced for, the fast response times.

Second, why were the violations of the analytic model restricted to the final sessions of training? By our unitization account, the analytic strategy would be the only strategy available to participants early in training, but as training continues participants would be expected to categorize according to the single unit with increasing frequency.

It is important to remember that underestimation and overestimation of All task response times by the analytic model are not equally problematic for the analytic model. Underestimations can simply be explained by lack of pure parallel, unlimited capacity processing. Underestimations are consistent with participants adopting an analytic strategy of combining evidence from five separately detected components, but with limited capacity or imperfect parallelism. However, overestimations are problematic for an analytic account, and suggest that a different process leads participants to categorize "ABCDE" quickly. This alternative process seems to involve the construction of a single functional unit that can be registered in a fashion that does not involve combining five independent detections of one component.

Experiment 5B

The analytic model tested in Experiment 5A predicted All task performance by combining five separate One task judgments. Algebraically, analytic predictions for the All task were obtained by raising individual points on the One task's cumulative response time distribution to the fifth power (because there are five components to identify). The psychological assumptions underlying this algebraic treatment are: pure parallelism, unlimited capacity, and independently sampled response times. The first two assumptions are not problematic; they cannot be responsible for the analytic model predicting response times that are too slow because, if anything, they produce response times that are faster than they would otherwise be. The goal of Experiment 5B and its associated mathematical analyses is to relax the third assumption of independent sampling.

It is first necessary to understand what the assumption of independent sampling means. According to the analytic model, five separate component identifications occur on every All task trial. According to independent sampling, the time required to identify one component is independent of the other identification times. Independence may be violated by either positive or negative dependencies. Negative dependencies would be expected if there were competition between components to be identified. If there were limited resources available for identifying components, then one component might be identified quickly if substantial resources were devoted to it, but this fast identification would come at the expense of the other components. Such negative contingencies are not problematic for the conclusions raised in Experiment 5A. If negative contingencies occur then the algebraic formulation of the analytic model underpredicts actual analytic response times. If there are negative contingencies between sampled response times then the maximum of the five response times will be large relative to when there are no contingencies. For example, the maximums of the sets {1, 5, 10} and {10, 5, 1} are each 10 - a large number because of negative contingency between the first and third number. Thus, negative contingencies would yield even slower analytic predictions than those shown previously, entailing even more dramatic empirical violations of the analytic model.

However, if there is a positive correlation between identification times, then the analytic model could be faster than presented earlier. In fact, if there were perfect correlation between the sampled times, then the analytic model predicts that the All task would be performed as quickly as the One task. There are two important situations where positive contingencies are expected. Positive contingencies have been empirically observed when configural properties can be used instead of individual components (Townsend, Hu, & Kadlec, 1988). If this type of positive dependency is at play in Experiment 5A, then the conclusions are not in jeopardy; this type of positive dependency relies on an explanation very similar to the one presented; both argue for processing at a configural level above the individual component.
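The effect of dependency on the analytic prediction can be illustrated with a small simulation. It contrasts only the two extreme cases discussed here, independent sampling and perfect positive correlation; the response times and the function name are hypothetical.

```python
import random

def mean_max(rts, k, n_trials, correlated, seed=0):
    """Average of the maximum of k identification times per trial.
    If correlated is True, all k components share one sampled time
    (perfect positive dependency); otherwise they are independent."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        if correlated:
            draws = [rng.choice(rts)] * k
        else:
            draws = [rng.choice(rts) for _ in range(k)]
        total += max(draws)
    return total / n_trials

# Hypothetical One-task times (msec), for illustration only.
rts = [300, 400, 500, 600, 700]

independent_mean = mean_max(rts, k=5, n_trials=5000, correlated=False)
correlated_mean = mean_max(rts, k=5, n_trials=5000, correlated=True)
print(round(independent_mean), round(correlated_mean))
```

Under perfect correlation the maximum of five times is just the single shared time, so the predicted All-task mean stays near the One-task mean, whereas independent sampling pushes it toward the slow end of the distribution.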

A second mechanism that produces positive dependencies is through shared input-output processes. Imagine, for example, that on half of the One and All task trials, the response keyboard was moved ten feet away from participants. The One task's response time distribution would probably be distinctly bimodal, with half of the responses clustered around 1 second (the average detection time) and the other half clustered around 10 seconds (the average time to detect a component and walk toward the keyboard). In developing the analytic model's prediction for the All task, it would be inappropriate to sample five response times from the entire distribution. If this were done, then the maximum of the five times would be around 10 seconds about 97% of the time (1 - .5^5). This is inappropriate because it takes 5 samples from an underlying distribution that includes input and output processes as well as detection times. A more appropriate analytic model would sample 5 times from the distribution of the detection times and take the maximum, but then combine this time with a single sampled response time from the input and output distribution.

Fortunately, recent advances have provided a method for separating out various processes that make up a whole response time. The details of this technique are provided by Smith (1990). The technique has been assessed, applied, and critiqued by Sheu and Ratcliff (1995). If a task requires two processes, A and B, and these processes each take a constant amount of time, and the time required for A is known, then to determine the time required for B one simply subtracts the time required for the A task by itself from the time required for the A+B task, assuming factor additivity. However, if the processes are stochastic, then each process will be associated with a distribution of response times rather than a constant. In this case, in order to determine the distribution of the unknown B process, the distribution of times in the A task can be deconvolved from the A+B task distribution. Just as the blur can be removed from a picture by deconvolving a Gaussian curve from the blurred image, so the influence of a particular process can be removed by deconvolving the distribution associated with the process from the whole distribution. The standard technique for this deconvolution is to calculate a Fourier transform for the whole and component distributions. The basic mathematical property of the Fourier transformation is that it converts the convolution (or deconvolution) operation in the time domain into multiplication (or division) in the frequency domain. Using this property, the distribution associated with Task B can be determined as long as the A and A+B distributions are known, assuming factor additivity, by three steps: 1) take the Fourier transform of the A and A+B distributions, 2) divide the Fourier transform of the A+B distribution by the Fourier transform of the A distribution, and 3) take the inverse Fourier transform of the resulting quotient to derive the B distribution.
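The three steps can be sketched on a toy example. The sketch assumes discrete, noise-free distributions over equal-width time bins and uses a naive discrete Fourier transform; the probability values and function names are hypothetical. With real, noisy response time data the quotient would need to be filtered before inversion, which this sketch omits.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out

def convolve(a, b):
    """Distribution of the sum of two independent discrete durations."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def deconvolve(whole, part):
    """Recover B from the A+B distribution (whole) and the A distribution (part)."""
    n = len(whole)
    padded = list(part) + [0.0] * (n - len(part))
    f_whole, f_part = dft(whole), dft(padded)           # step 1
    quotient = [w / p for w, p in zip(f_whole, f_part)]  # step 2
    return [v.real for v in dft(quotient, inverse=True)] # step 3

# Hypothetical probability mass over successive time bins.
dist_a = [0.5, 0.3, 0.2]
dist_b = [0.1, 0.6, 0.3]
dist_ab = convolve(dist_a, dist_b)

recovered_b = deconvolve(dist_ab, dist_a)
print([round(v, 6) for v in recovered_b])
```

The division in step 2 is only safe when the transform of the A distribution has no values at or near zero, which is one reason filtering is needed in practice.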

How is this technique used in eliminating certain types of positive dependencies between sampled response times? Positive dependencies due to shared input-output processes can be eliminated by obtaining an empirical distribution of input-output times for each of the four participants in Experiment 5A. This distribution is obtained by having these participants complete many trials of a simple detection task; the participant presses a specified key as soon as any curve is displayed. This task requires perceptual registration of the pattern and motor output, but does not require any decisions to be made based on the appearance of segments within the curve. This will be called the "simple detection" task. The general strategy, then, is to divide the One task into two processes: the input and output processes that should only be sampled once for each All task judgment, and the comparison/identification process which must be sampled five times according to the analytic model. To fully instantiate the analytic model, the following steps are taken: 1) Fourier transforms of the simple detection and One tasks are calculated, 2) both Fourier transforms are filtered, 3) the Fourier transform of the One task is divided by the Fourier transform of the detection task, yielding the Fourier transform of the comparison distribution, 4) an inverse Fourier transform is calculated to convert the comparison distribution back to the time domain, 5) this new distribution is converted to a cumulative distribution, 6) every point on this cumulative distribution is raised to the fifth power to derive the analytic model's prediction, 7) the resulting distribution is convolved with the cumulative distribution for the simple detection task, and 8) the resulting cumulative distribution is compared to the empirically obtained All task distribution from Experiment 5A.
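Steps 5 through 7 above can be sketched as follows, taking already-separated comparison and simple detection distributions as given. The sketch assumes discrete distributions over equal-width time bins, and it combines the two stages by convolving the probability mass of the slowest comparison with the detection probability mass before accumulating, which is one way of obtaining the cumulative distribution of their sum. All values and function names are hypothetical.

```python
from itertools import accumulate

def max_of_k_pmf(pmf, k=5):
    """Probability mass of the maximum of k independent draws:
    raise the cumulative distribution to the k-th power, then difference it."""
    cdf_max = [c ** k for c in accumulate(pmf)]
    return [cdf_max[0]] + [cdf_max[i] - cdf_max[i - 1]
                           for i in range(1, len(cdf_max))]

def convolve(a, b):
    """Distribution of the sum of two independent discrete durations."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def predict_all_cdf(comparison_pmf, detection_pmf, k=5):
    """Cumulative prediction for the All task: the slowest of k comparison
    times plus a single input-output (simple detection) time."""
    total_pmf = convolve(max_of_k_pmf(comparison_pmf, k), detection_pmf)
    return list(accumulate(total_pmf))

# Hypothetical distributions over successive time bins.
comparison_pmf = [0.5, 0.5]
detection_pmf = [0.5, 0.5]

predicted_cdf = predict_all_cdf(comparison_pmf, detection_pmf)
print(predicted_cdf)
```

Bin i of the result corresponds to the sum of the comparison and detection bin indices, so the predicted curve must be aligned to the same time axis as the observed All task distribution before the two are compared.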


Participants. The four undergraduate research assistants who participated in Experiment 5 were used again. This experiment was conducted after each participant completed Experiment 5A.

Materials. The only stimulus in the experiment was the particular "ABCDE" curve that was used in the All task of Experiment 5 for a participant.

Procedure. At the beginning of each trial, participants saw either a '1' or '2' displayed on the screen, and they were instructed to press the key designated by the '1' or '2' regardless of the appearance of the stimulus. The curve's location on the screen was randomized as it was in the previous experiments. Whether '1' or '2' was the designated response was also randomized.

Each session consisted of 2000 trials. Although ideally participants would have completed 8 hours of this simple detection task, it was clear after the second session that performance was not changing with practice. As such, in obtaining participants' distributions, only the final session of 2000 trials was used.


The average response times for participants A. H., C. H., D. M., and N. T. were 223, 186, 203, and 245 msec. respectively. The average standard deviations for these response times were 54, 34, 41, and 59 msec respectively. In general, the distributions had a slight positive skew.

An example of the Fourier deconvolution technique is shown for participant C.H. in Figure 11. C.H.'s simple detection response time distribution is shown superimposed with his One task response time distribution from Experiment 5A. Fourier transformations were applied to each of these curves, and a parabolic filter with a cutoff of 20 was applied to each of the resulting frequency-domain representations. The One task transform was then divided by the simple detection transform, and an inverse Fourier transform was applied to the result. The result of these steps is shown by the curve labeled "deconvolution" in Figure 11. This curve represents the portion of the One task that is due to the actual comparison processes involved in deciding whether a particular curve segment provides evidence for a Category 1 response. If the comparison process distribution were convolved with the simple detection distribution, the original One task distribution would be reconstructed closely. The mild sinusoidal component present in the comparison process distribution is an artifact of the Fourier transformation process, but introduces only a small amount of error in the reconstruction.

Insert Figure 11 about here

For each of the participants, the One task distribution was deconvolved into simple detection and comparison distributions. With these separated distributions, the analytic model was implemented by raising each point on the cumulative comparison distribution to the fifth power, and convolving this resulting distribution with the cumulative simple task distribution. This resulting distribution represents the analytic model's predictions for the All task, assuming 1) five times are sampled independently from the comparison distribution and the maximum time is selected, 2) only one time is selected randomly from the simple detection task distribution, and 3) these two selected times are additively combined to make the prediction for the All task. The analytic model's predictions are shown in Figure 12, and the observed results from the All task from Experiment 5A are superimposed. As with the earlier analysis, violations of the analytic model were found for the fastest half of response times for three of the four participants, Kolmogorov-Smirnov p < .05. There was also a nonsignificant trend toward a violation of the analytic model for the fourth participant.

Insert Figure 12 about here


The conclusions from Experiment 5A remained unchanged despite a loosening of one of the assumptions of the analytic model. Responses in the All task were faster than predicted by an unlimited capacity, purely parallel, model that integrated responses from five separately registered components. Violations of this model were found assuming independence between the registration times for the five components, negative dependencies, and one class of positive dependencies -- namely, positive dependencies due to shared input-output processes.

The analytic model predictions derived from Experiment 5B are highly similar to those generated previously. This is hardly surprising given the relatively small amount of variance in participants' simple detection times. Although the deconvolution technique was necessary in order to eliminate the possibility of strong dependencies due to shared input and output processes, the results also suggest that for this type of task these dependencies can be safely ignored.

In applying formal models to tasks, assumptions must be made. At a broad level, the purpose of Experiment 5B was to eliminate the need for the assumption of independently sampled times. However, to eliminate this assumption, other assumptions were required. Sheu and Ratcliff (1995) have discussed the dependence of the Fourier deconvolution technique on factor additivity. Furthermore, only one variety of positive dependency was tested. The results from the All task can still be accommodated by an analytic model if the model is allowed to have positive dependencies between sampling times. In fact, Townsend et al. (1988) have observed positive dependencies in some perceptual judgment situations. If one curve component is detected more quickly because another component has already been detected, then positive dependencies can arise. However, at this point, the analytic model becomes similar to the unitization account. Both assume that segments are not registered independently.

General Discussion

The five reported experiments explored speed increases in a conjunctively defined categorization task. These speed increases were hypothesized to be due to the development of functional units over practice. Pronounced improvements in categorization were found when, and only when, unitization was possible and advantageous. Pronounced improvements were found in a categorization task that required all components of an object to be identified, but not when the task could be solved by attending to only a single component (Experiment 1). Greater improvements were found when the relevant components were joined together to form a coherent image rather than randomly ordered (Experiment 1). Larger improvements were found when the components were physically connected than when they were disconnected, although large improvements were found in both cases, consistent with the hypothesis that unitization depends critically on being able to form a single image-like representation (Experiment 2). Although Experiment 2 showed that units can be developed that span disconnected components, Experiment 3 showed that participants have a bias to create contiguous units if possible. When a category was defined by two diagnostic components with a third nondiagnostic component in between them, the relatively protracted course of improvement suggested that participants were creating units that incorporated all three components. This experiment also showed that unitization effects were not well explained by identifying junction points between diagnostic components.

The amount and gradualness of improvement observed in a conjunctive categorization task was positively related to the number of components that needed to be attended (Experiment 4). Although this experiment did not find evidence for a capacity limit in the number of components that could be unitized (in the range from 1-5), it did find evidence for a boundary condition on the unitization process itself. In particular, the experiment suggested that there was a limit on the number of components that could be unitized within a given number of trials. Finally, Experiment 5 explored response times on a conjunctively defined categorization task following 16 hours of categorization training. The distribution of response times was compared to an analytic model derived from the distribution of times from a task requiring participants to attend to only one component. Responses were faster than predicted by this analytic model, particularly for the fastest half of responses. The violations of the analytic model were consistent with a unitization account that based categorizations on a single comparison to a developed unit rather than the integration of evidence from independently detected components.

Constraints on Unitization

The five experiments offer conclusions about the particular constraints or boundary conditions on the unitization process. Most clearly, from Experiment 1, pronounced improvements in speed were not found when the five components to be noticed were randomly ordered within an object from trial to trial. This suggests that whatever process is getting faster in the well ordered condition, the same process is not at work (to the same degree) in the randomly ordered condition. For example, the speed up in the well ordered condition is probably not due to speed ups in the time required to identify individual segments, to localize segments, or to integrate evidence from separate segments. These processes are all equally required in the ordered and random conditions, or are required more in the random condition. The primary candidate for a process that is involved in the ordered, but not random, displays is the construction of a unitary image for the conjunction of five components. The difference in speed up in the two conditions is consistent with unitization processes being restricted to situations where a single image can be formed.

The results with respect to image coherency are somewhat mixed. Experiment 2 showed large improvements for a conjunction defined by disconnected segments. This result argues against the speed ups observed in other experiments simply being due to increased efficiency at locating junctions between components. The disconnected stimuli have no junctions but still yield large improvements. These results are also consistent with Czerwinski et al.'s (1992) results suggesting that unitization occurs in a conjunctive feature search task even when disjointed line segments are used as features. On the other hand, in Experiment 2 the connected condition led to greater improvements than did the disconnected condition, and the results from Experiment 3 indicated that participants created units that included irrelevant intervening segments. A reconciliation between these sets of findings is that the most important constraint on the development of units is that they be representable by a single image. Only secondarily does the contiguity of the unit affect its creation.

A dynamic constraint on unit formation was suggested by Experiment 4 -- it takes a roughly constant number of trials to integrate a component of particular complexity into a unit. As the number of components that needed to be combined for a conjunctive categorization increased, so did the gradualness of improvement. Thus, higher-order conjunctions were not only categorized more slowly, but their course of improvement was also more protracted. This constraint indicates a way in which unit formation is not equivalent to the development of a photograph-like image. Essentially, unit formation is sensitive to the complexity of the unit. If unit formation were simply a matter of developing a photograph, then complexity would have little influence. Once formed, units may be relatively unaffected by complexity effects (Experiment 5), but the formation of the units is adversely affected by increasing the number of components to be combined. In short, unit formation seems to require much greater componential processing than does unit deployment.

Interpreting large improvements in a conjunctive task as evidence for unitization, Experiment 4 did not find capacity limits on unitization. Clearly, this may simply be due to an insufficiently large range of complexities tested. Capacity limits may exist, but may only be found with more than five components of the complexity used in these experiments. Still, the result is reminiscent of findings suggesting indefinitely large short term memories as long as chunks can be formed (Ericsson, Chase, & Faloon, 1980).

Mechanisms for Improvement in a Conjunctive Task

Experiment 5 was able to eliminate some broad classes of analytic models of categorization following prolonged practice. In particular, analytic models that integrated evidence from five separately detected components to make a conjunctive response were shown to predict response times that were slower than those obtained. These violations were found only after several hours of practice, and were largest for the fastest response times. Despite the impressive violations, there are still two qualitatively different mechanisms that could account for the pronounced speed up of the conjunctive categorization: a genuinely holistic match process to a constructed unit, or an analytic model that assumes interactive facilitation among the component detectors.

According to a holistic match process, a conjunctive categorization is made by comparing the image of the presented item to an image that has been stored over prolonged practice. The stored image may have parts, but either these parts are arbitrarily small or do not play a functional role in the recognition of the image. There is considerable evidence supporting the gradual development of configural features. Neurophysiological findings suggest that some individual neurons can represent conjunctions of features (Perrett & Oram, 1993), and that prolonged training can produce neurons that respond to trained configural patterns (Logothetis, Pauls, & Poggio, 1995). Evidence from the recognition of faces (Tanaka & Gauthier, in press), objects following prolonged training (Gauthier & Tarr, in press), trained letter-like stimuli (Shiffrin & Lightfoot, in press), and words (LaBerge & Samuels, 1974) supports the idea that entire configurations are learned with practice, and that detection of whole configurations is dissociable from detection of parts of the configurations.

However, despite the impressive improvements over time observed in Experiment 5, the results are also compatible with analytic models that assume interactions between the components (Mordkoff & Egeth, 1993; Townsend et al., 1988). For example, if detecting one component of "ABCDE" facilitates detection of other components, then positive dependencies among the detection times can result, violating the assumptions of the analytic model tested in Experiment 5A. Experiment 5B tested an analytic model that permits positive dependencies, and still found violations of this model. However, in this latter model, the only positive dependencies allowed were those due to shared input-output processes, rather than mutual facilitations. Although the experiments cannot rule out all analytic models, the experiments do place general constraints on models of conjunctive categorization following extended practice. Models that integrate independently registered components can be rejected. Well practiced categorizations either depend on strong mutual facilitations between component processors, or do not depend on component processing at all. In either case, the process is appropriately labeled "unitization" in that the percepts associated with different components are closely coupled together. In fact, an interactive facilitation mechanism could be seen as the mechanism that implements holistic unit detection at a higher functional level of description.


Massey, F. J. (1951). The Kolmogorov-Smirnov test for goodness of fit. Journal of the American Statistical Association, 46, 405-409.

Baylis, G. C., & Driver, J. (1993). Visual attention and objects: Evidence for hierarchical coding of location. Journal of Experimental Psychology: Human Perception and Performance, 19, 451-470.

Birnbaum, Z. W., & Lientz, B. P. (1972). Tables of critical values of some Renyi type statistics for finite sample sizes. American Statistical Association Journal, 42, 870-877.

Bruner, J. S., & Postman, L. (1949). Perception, conception, and behavior. Journal of Personality, 18, 14-31.

Czerwinski, M., Lightfoot, N., & Shiffrin, R.M. (1992). Automatization and training in visual search. The American Journal of Psychology, 105, 271-315.

Diamond, R., & Carey, S. (1986). Why faces are and are not special: An effect of expertise. Journal of Experimental Psychology: General, 115, 107-117.

Ericsson, K., Chase, W. G., & Faloon, S. (1980). Acquisition of a memory skill. Science, 208, 1181-1182.

Eriksen, C. W., & Murphy, T. D. (1987). Movement of attentional focus across the visual field: A critical look at the evidence. Perception & Psychophysics, 14, 255-260.

Gauthier, I. & Tarr, M. J. (in press). Becoming a "Greeble" expert: Exploring mechanisms for face recognition.

Gluck, M. A., & Bower, G. H. (1988). Evaluating an adaptive network model of human learning. Journal of Memory and Language, 27, 166-195.

Goldstone, R. L. (1994). Influences of categorization on perceptual discrimination. Journal of Experimental Psychology: General, 123, 178-200.

Goldstone, R. L., & Schyns, P. (1994). Learning new features of representation. Proceedings of the Sixteenth Annual Conference of the Cognitive Science Society. (pp. 974-978). Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Gumperz, J. J., & Levinson, S. C. (1991). Rethinking linguistic relativity. Current Anthropology, 32, 613-623.

Harnad, S. (1987). Categorical perception. Cambridge University Press: Cambridge.

Harnad, S., Hanson, S. J., & Lubin, J. (1994). Learned categorical perception in neural nets: Implications for symbol grounding. In V. Honavar & L. Uhr (Eds.) Artificial intelligence and neural networks: Steps toward principled integration. Academic Press: Boston. (pp. 191-206).

Hayes-Roth, B., & Hayes-Roth, F. (1977). Concept learning and the recognition and classification of exemplars. Journal of Verbal Learning and Verbal Behavior, 16, 321-338.

Hebb, D. O. (1949). The organization of behavior. New York: Wiley.

Hubel, D. H., & Wiesel, T. N. (1968). Receptive fields and functional architecture of monkey striate cortex. Journal of Physiology, 195, 215-243.

Kurtz, K. J. (1996). Category-based similarity. In G. W. Cottrell (Ed.) Proceedings of the Eighteenth Annual Conference of the Cognitive Science Society, 290.

Julesz, B. (1981). Textons, the elements of texture perception, and their interaction. Nature, 290, 91-97.

Kruschke, J. K. (1992). ALCOVE: An exemplar-based connectionist model of category learning. Psychological Review, 99, 22-44.

LaBerge, D. (1973). Attention and the measurement of perceptual learning. Memory & Cognition, 1, 268-276.

LaBerge, D., & Brown, V. (1989). Theory of attentional operations in shape identification. Psychological Review, 96, 101-124.

LaBerge, D., & Samuels, S. J. (1974). Toward a theory of automatic information processing in reading. Cognitive Psychology, 6, 293-323.

Lane, H. (1965). The motor theory of speech perception: A critical review. Psychological Review, 72, 275-309.

Logan, G. D. (1988). Toward an instance theory of automatization. Psychological Review, 95, 492-527.

Logothetis, N. K., Pauls, J., & Poggio, T. (1995). Shape representation in the inferior temporal cortex of monkeys. Current Biology, 5, 552-563.

Mordkoff, J. T., & Egeth, H. E. (1993). Response time and accuracy revisited: Converging support for the interactive race model. Journal of Experimental Psychology: Human Perception and Performance, 19, 981-991.

Nahinsky, I. D., Slaymaker, F. L., Aamiry, A., & O'Brien, C. J. (1973). The concreteness of attributes in concept learning strategies. Memory & Cognition, 1, 307-318.

O'Hara, W. (1980). Evidence in support of word unitization. Perception & Psychophysics, 27, 390-402.

Palmer, S. E. (1978). Structural aspects of visual similarity. Memory & Cognition, 6, 91-97.

Palmer, S. E. (1992). Common region: A new principle of perceptual grouping. Cognitive Psychology, 24, 436-447.

Palmer, S. E., & Rock, I. (1994). Rethinking perceptual organization: The role of uniform connectedness. Psychonomic Bulletin & Review

Perrett, D. I., & Oram, M. W. (1993). Neurophysiology of shape processing. Image and Vision Computing, 11, 317-333.

Pevtzow, R., & Goldstone, R. L. (1994). Categorization and the parsing of objects. Proceedings of the Sixteenth Annual Conference of the Cognitive Science Society. (pp. 717-722). Hillsdale, New Jersey: Lawrence Erlbaum Associates.

Saiki, J., & Hummel, J. E. (in press). Connectedness and part-relation conjunctions in object category learning.

Schneider, W., & Shiffrin, R. M. (1977). Controlled and automatic human information processing: 1. Detecting, search, and attention. Psychological Review, 84, 1-66.

Schyns, P., Goldstone, R. L., & Thibaut, J-P. (1995). The development of features in object concepts. Indiana University Technical Report #106. Bloomington, Indiana.

Schyns, P. G., & Murphy, G. L. (1994). The ontogeny of part representation in object concepts. In Medin (Ed.). The Psychology of Learning and Motivation, 31, 305-354. Academic Press: San Diego, CA.

Schyns, P. G., & Murphy, G. L. (1993). The ontogeny of transformable part representations in object concepts. In Proceedings of the Fourteenth Annual Conference of the Cognitive Science Society, (pp. 197-202). Hillsdale, NJ: Erlbaum.

Shepp, B. E., & Barrett, S. E. (1991). The development of perceived structure and attention: Evidence from divided and selective attention tasks. Journal of Experimental Child Psychology, 51, 434-458.

Sheu, C-F., & Ratcliff, R. (1995). The application of Fourier deconvolution to reaction time data: A cautionary note. Psychological Bulletin, 118, 234-251.

Shiffrin, R. M., & Lightfoot, N. (in press). Perceptual learning of alphanumeric-like characters. In R. L. Goldstone, P. Schyns, & D. Medin (Eds.) Mechanisms of Perceptual Learning. New York: Academic Press.

Smith, E. E., & Haviland, S. E. (1972). Why words are perceived more accurately than nonwords: Inference versus unitization. Journal of Experimental Psychology, 92, 59-64.

Smith, P. L. (1990). Obtaining meaningful results from Fourier deconvolution of reaction time data. Psychological Bulletin, 108, 533-550.

Tanaka, J. W., & Farah, M. J. (1993). Parts and wholes in face recognition. Quarterly Journal of Experimental Psychology, 46A, 225-245.

Tanaka, J., & Gauthier, I. (in press). Expertise in object and face recognition. In R. L. Goldstone, P. Schyns, & D. Medin (Eds.) Mechanisms of Perceptual Learning. New York: Academic Press.

Townsend, J. T. (1990). Serial vs. parallel processing: Sometimes they look like Tweedledum and Tweedledee but they can (and should) be distinguished. Psychological Science, 1, 46-54.

Townsend, J. T., Hu, G. G., & Kadlec, H. (1988). Feature sensitivity, bias, and interdependencies as a function of energy and payoffs. Perception & Psychophysics, 43, 575-591.

Treisman, A., & Gelade, G. (1980). A feature-integration theory of attention. Cognitive Psychology, 12, 97-136.

Wang, Q., Cavanagh, P., & Green, M. (1994). Familiarity and pop-out in visual search. Perception & Psychophysics, 56, 495-500.

Whorf, B. L. (1941). Languages and logic. In J. B. Carroll (Ed.) Language, Thought, and Reality: Selected papers of Benjamin Lee Whorf. MIT Press (1956), Cambridge, Mass. (pp. 233-245).

Yin, R. K. (1969). Looking at upside-down faces. Journal of Experimental Psychology, 81, 141-145.

Author Notes

Many useful comments and suggestions were provided by John Kruschke, Douglas Medin, Robert Nosofsky, Roger Ratcliff, Philippe Schyns, Richard Shiffrin, Linda Smith, and James Townsend. Special thanks go to Ching-fan Sheu for providing code for Fourier deconvolution of response time distributions. I thank Chris Howard, Nicole Turcotte, Amy Hess, and Dan Manco for their assistance in conducting the experiments. This research was funded by National Science Foundation Grant SBR-9409232. Correspondence concerning this article should be addressed to Robert Goldstone, Psychology Department, Indiana University, Bloomington, Indiana 47405, or by electronic mail to rgoldsto@indiana.edu. Further information about the laboratory can be found at http://cognitrn.psych.indiana.edu/.

Figure Captions

Figure 1. Stimuli used in Experiment 1. Each letter represents a particular stimulus segment, and each stimulus is composed of five segments. To categorize the item represented by "ABCDE" as belonging to Category 1, it is necessary to process information associated with each of the segments.

Figure 2. In the Random condition, any ordering of the same components counts as the same object. Thus, when object "ABCDE" is presented, it can be presented in 120 different ways, 3 of which are shown above. In the Ordered condition, the spatial positions of the five segments are fixed.

Figure 3. Results from Experiment 1. The most pronounced practice effects were observed for the conjunctive All task with constantly ordered components.

Figure 4. Contiguous and Separated stimuli from Experiment 2.

Figure 5. Results from Experiment 2. Although both conjunctive All tasks show large practice effects, the practice effects are slightly larger for the Contiguous displays.

Figure 6. Stimuli from the Together and Apart conditions of Experiment 3. In both conditions, two segments were relevant for categorization. In the Together condition, these segments were adjacent. In the Apart condition, they were separated by a third, nondiagnostic segment.

Figure 7. Results from Experiment 3. Larger and more protracted practice effects were observed in the Apart than in the Together condition.

Figure 8. Results from Experiment 4. With the exception of the 4-way and 5-way conjunctions, practice effects were larger and more protracted as more components were required for the conjunctive categorization.

Figure 9. Results from Experiment 5, for each of the four participants. As with earlier experiments, the All task shows a much greater practice effect than does the One task. Practice effects in the All task were observed throughout the 16 hours of training.

Figure 10. The cumulative response time distributions for the four participants taken from the last session. The One and All distributions were empirically obtained. The One^5 distribution is obtained by raising each point along the One distribution to the fifth power, and represents the analytic model's predicted cumulative distribution for the All task. Violations of this analytic model occur when the All task distribution is shifted to the left of the analytic model's distribution. Such violations occur for the fastest half of response times for all four participants.

Figure 11. A demonstration of the Fourier transformation method of response time decomposition for C.H.'s data. The One task distribution is taken from Experiment 5A, and the simple detection task distribution is taken from Experiment 5B. By deconvolving the latter from the former, the distribution labeled "deconvolution" is obtained. This represents the distribution of times required for the comparison process of the One task, removing processes associated with the simple detection task. This decomposition of the One task allows us to sample five times selectively from the comparison processes, and combine (convolve) these times with those associated with simple input and output processes.

Figure 12. The newly derived analytic model's predictions are superimposed with the participants' performances on the last experimental sessions. Despite relaxing the assumption of independent sampling, violations of the analytic model are still found for three of the four participants.