Visual Perception Theory
by Saul McLeod published 2007
In order to receive information from the environment we are equipped with sense organs e.g. eye, ear, nose. Each sense organ is part of a sensory system which receives sensory inputs and transmits sensory information to the brain.
A particular problem for psychologists is to explain the process by which the physical energy received by sense organs forms the basis of perceptual experience. Sensory inputs are somehow converted into perceptions of desks and computers, flowers and buildings, cars and planes; into sights, sounds, smells, taste and touch experiences.
A major theoretical issue on which psychologists are divided is the extent to which perception relies directly on the information present in the stimulus. Some argue that perceptual processes are not direct, but depend on the perceiver's expectations and previous knowledge as well as the information available in the stimulus itself.
This controversy is discussed with respect to Gibson (1966) who has proposed a direct theory of perception which is a 'bottom-up' theory, and Gregory (1970) who has proposed a constructivist (indirect) theory of perception which is a 'top-down' theory.
Psychologists distinguish between two types of processes in perception: bottom-up processing and top-down processing.
Bottom-up processing is also known as data-driven processing, because perception begins with the stimulus itself. Processing is carried out in one direction from the retina to the visual cortex, with each successive stage in the visual pathway carrying out ever more complex analysis of the input.
Top-down processing refers to the use of contextual information in pattern recognition. For example, understanding difficult handwriting is easier when reading complete sentences than when reading single, isolated words. This is because the meaning of the surrounding words provides a context to aid understanding.
Gregory (1970) and Top Down Processing
Psychologist Richard Gregory argued that perception is a constructive process which relies on top-down processing. For Gregory (1970) perception is a hypothesis.
For Gregory, perception involves making inferences about what we see and trying to make a best guess. Prior knowledge and past experience, he argued, are crucial in perception.
When we look at something, we develop a perceptual hypothesis, which is based on prior knowledge. The hypotheses we develop are nearly always correct. However, on rare occasions, perceptual hypotheses can be disconfirmed by the data we perceive.
A lot of information reaches the eye, but much is lost by the time it reaches the brain (Gregory estimates about 90% is lost).
Therefore, the brain has to guess what a person sees based on past experiences. We actively construct our perception of reality.
Richard Gregory proposed that perception involves a lot of hypothesis testing to make sense of the information presented to the sense organs.
Our perceptions of the world are hypotheses based on past experiences and stored information.
Sensory receptors receive information from the environment, which is then combined with previously stored information about the world which we have built up as a result of experience.
The formation of incorrect hypotheses will lead to errors of perception (e.g. visual illusions like the Necker cube).
Evidence to Support Gregory's Theory
1. 'Highly unlikely objects tend to be mistaken for likely objects'.
Gregory has demonstrated this with a hollow mask of a face. Such a mask is generally seen as a normal (convex) face, even when one knows, and can feel, that the mask is hollow.
There seems to be an overwhelming need to reconstruct the face, similar to Helmholtz's description of 'unconscious inference': an assumption based on past experience.
2. 'Perceptions can be ambiguous'
The Necker cube is a good example of this. When you stare at the crosses on the cube, the orientation can suddenly change, or 'flip'. The percept becomes unstable, and a single physical pattern can produce two perceptions.
Gregory argued that this object appears to flip between orientations because the brain develops two equally plausible hypotheses and is unable to decide between them.
When the perception changes even though there is no change in the sensory input, the change of appearance cannot be due to bottom-up processing. It must be set downwards by the prevailing perceptual hypothesis of what is near and what is far.
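One way to make the idea of competing perceptual hypotheses concrete is a small Bayesian sketch. This is an illustrative reading, not Gregory's own formalism: the function `posterior`, the hypothesis names, and every probability below are assumptions chosen only to show the pattern. Prior knowledge is represented as a prior probability, and the fit between the image and each hypothesis as a likelihood.

```python
def posterior(priors, likelihoods):
    """Combine prior belief and fit-to-image into normalized posterior probabilities."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

# Necker cube: the drawing fits both orientations equally well, and past
# experience favours neither (all values are assumed for illustration).
necker = posterior(
    priors={"orientation_a": 0.5, "orientation_b": 0.5},
    likelihoods={"orientation_a": 1.0, "orientation_b": 1.0},
)

# Hollow mask: the image fits both readings about equally, but experience
# with faces strongly favours a convex face (assumed values).
mask = posterior(
    priors={"convex_face": 0.99, "hollow_face": 0.01},
    likelihoods={"convex_face": 0.5, "hollow_face": 0.5},
)

print("Necker cube:", necker)
print("Hollow mask:", mask)
```

Under these assumed numbers the two cube hypotheses end up tied, which is consistent with a percept that can flip, while the convex-face hypothesis dominates for the mask.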
3. 'Perception allows behavior to be generally appropriate to non-sensed object characteristics'.
For example, we respond to certain objects as though they are doors even though we can only see a long narrow rectangle as the door is ajar.
What we have seen so far would seem to confirm that indeed we do interpret the information that we receive, in other words, perception is a top down process.
Critical Evaluation of Gregory's Theory
1. The Nature of Perceptual Hypotheses
If perceptions make use of hypothesis testing the question can be asked 'what kind of hypotheses are they?' Scientists modify a hypothesis according to the support they find for it so are we as perceivers also able to modify our hypotheses? In some cases it would seem the answer is yes. For example, look at the figure below:
This probably looks like a random arrangement of black shapes. In fact, there is a hidden face in there. Can you see it? The face is looking straight ahead and is in the top half of the picture, in the center. Now can you see it? The figure is strongly lit from the side and has long hair and a beard.
Once the face is discovered, very rapid perceptual learning takes place and the ambiguous picture now obviously contains a face each time we look at it. We have learned to perceive the stimulus in a different way.
Although in some cases, as in the ambiguous face picture, there is a direct relationship between modifying hypotheses and perception, in other cases this is not so evident. For example, illusions persist even when we have full knowledge of them (e.g. the inverted face, Gregory 1974). One would expect that the knowledge we have learned (from, say, touching the face and confirming that it is not 'normal') would modify our hypotheses in an adaptive manner. The current hypothesis testing theories cannot explain this lack of a relationship between learning and perception.
2. Perceptual Development
A perplexing question for the constructivists who propose perception is essentially top-down in nature is 'how can the neonate ever perceive?' If we all have to construct our own worlds based on past experiences why are our perceptions so similar, even across cultures? Relying on individual constructs for making sense of the world makes perception a very individual and chancy process.
The constructivist approach stresses the role of knowledge in perception and is therefore opposed to the nativist approach to perceptual development. However, a substantial body of evidence has accrued favoring the nativist approach. For example, newborn infants show shape constancy (Slater & Morison, 1985); they prefer their mother's voice to other voices (DeCasper & Fifer, 1980); and it has been established that they prefer normal features to scrambled features as early as 5 minutes after birth.
3. Sensory Evidence
Perhaps the major criticism of the constructivists is that they have underestimated the richness of sensory evidence available to perceivers in the real world (as opposed to the laboratory where much of the constructivists' evidence has come from).
Constructivists like Gregory frequently use the example of size constancy to support their explanations. That is, we correctly perceive the size of an object even though the retinal image of an object shrinks as the object recedes. They propose that sensory evidence from other sources must be available for us to be able to do this.
However, in the real world, retinal images are rarely seen in isolation (as is possible in the laboratory). There is a rich array of sensory information including other objects, background, the distant horizon and movement. This rich source of sensory information is important to the second approach to explaining perception that we will examine, namely the direct approach to perception as proposed by Gibson.
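The size-constancy problem mentioned above can be sketched numerically. The example assumes simple visual-angle geometry (an object of size s at distance d subtends an angle of 2·atan(s / 2d)); the function names and the 1.8 m object are hypothetical, chosen only for illustration. It shows that the retinal angle alone shrinks with distance, and that the physical size can only be recovered if some estimate of distance is also available.

```python
import math

def visual_angle_deg(object_size_m, distance_m):
    """Visual angle (degrees) subtended by an object of given size at given distance."""
    return math.degrees(2 * math.atan(object_size_m / (2 * distance_m)))

def inferred_size_m(angle_deg, distance_m):
    """Recover physical size from the visual angle plus a distance estimate."""
    return 2 * distance_m * math.tan(math.radians(angle_deg) / 2)

person_height = 1.8  # metres, assumed for illustration
for d in (2.0, 4.0, 8.0):
    angle = visual_angle_deg(person_height, d)
    size = inferred_size_m(angle, d)
    print(f"distance {d:4.1f} m: visual angle {angle:5.1f} deg, inferred size {size:.2f} m")
```

Whether the distance estimate comes from stored knowledge (the constructivist view) or from information already present in the optic array (Gibson's view) is exactly the point in dispute; the sketch is neutral on that question.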
Gibson (1966) and Bottom Up Processing
Gibson argued strongly against the idea that perception involves top-down processing and criticized Gregory's discussion of visual illusions on the grounds that they are artificial examples and not images found in our normal visual environments. This is crucial because Gregory accepts that misperceptions are the exception rather than the norm. Illusions may be interesting phenomena, but they might not be that informative about the debate.
James Gibson (1966) argues that perception is direct, and not subject to hypothesis testing as Gregory proposed. There is enough information in our environment to make sense of the world in a direct way. For Gibson, sensation is perception: what you see is what you get. There is no need for processing (interpretation), as the information we receive about size, shape, distance and so on is sufficiently detailed for us to interact directly with the environment.
Motion parallax, for example, supports the argument that perception is direct. As we move through our environment, objects which are close to us pass us by faster than those further away. The relative speed of these objects indicates their distance from us. This is evident when we are travelling on a fast-moving train.
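Motion parallax can be illustrated with a little geometry. As a minimal sketch, assume an observer moving in a straight line at a constant speed v past a stationary object; at the moment the object is directly to the side at distance d, its angular speed across the visual field is v / d radians per second. The function name, the train speed, and the example distances are all hypothetical.

```python
import math

def angular_speed_deg_per_s(observer_speed_m_s, distance_m):
    """Angular speed of a stationary object when it is directly beside the moving observer."""
    return math.degrees(observer_speed_m_s / distance_m)

train_speed = 30.0  # metres per second, assumed for illustration
scene = [("trackside post", 5.0), ("farmhouse", 200.0), ("distant hill", 5000.0)]

for label, distance in scene:
    speed = angular_speed_deg_per_s(train_speed, distance)
    print(f"{label:>15} at {distance:7.1f} m sweeps past at {speed:8.2f} deg/s")
```

Because angular speed falls off in proportion to distance, the relative speeds in the optic array carry distance information without any further interpretation, which is the point Gibson draws from this cue.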
Gibson (1972) argued that perception is a bottom-up process, which means that sensory information is analyzed in one direction: from simple analysis of raw sensory data to ever increasing complexity of analysis through the visual system. Gibson attempted to give pilots training in depth perception during the Second World War, and this work led him to the view that our perception of surfaces was more important than depth/space perception. Surfaces contain features sufficient to distinguish different objects from each other. In addition, perception involves identifying the function of the object: whether it can be thrown or grasped, or whether it can be sat on, and so on.
Gibson claimed that perception is, in an important sense, direct. He worked during World War II on problems of pilot selection and testing, and in this early work on aviation he discovered what he called 'optic flow patterns'. When pilots approach a landing strip, the point towards which the pilot is moving appears motionless, with the rest of the visual environment apparently moving away from that point.
The outflow of the optic array in a landing glide.
According to Gibson such optic flow patterns can provide pilots with unambiguous information about their direction, speed and altitude.
Three important components of Gibson's Theory are
1. Optic Flow Patterns;
2. Invariant Features; and
3. Affordances.
These are now discussed.
1. Light and the Environment - Optic Flow Patterns
Changes in the flow of the optic array contain important information about what type of movement is taking place. For example:
i) Any flow in the optic array means that the perceiver is moving; if there is no flow, the perceiver is static.
ii) The flow of the optic array will either be coming from a particular point or moving towards one. The center of that movement indicates the direction in which the perceiver is moving. If a flow seems to be coming out from a particular point, this means the perceiver is moving towards that point; but if the flow seems to be moving towards that point, then the perceiver is moving away. See above for moving towards an object, below is moving away:
The Optic Flow pattern for a person looking out of the back of a train.
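The relationship between flow direction and direction of travel can be sketched with a simple pinhole-camera model. This is an illustrative assumption, not Gibson's own formulation: a static scene point at image position (x, y) and depth Z, seen by an observer moving along the line of sight at speed T, has image velocity (x·T/Z, y·T/Z). The function names and sample values are hypothetical.

```python
def image_flow(x, y, depth, forward_speed):
    """Image velocity of a static point for an observer moving along the line of sight.

    forward_speed > 0 means moving towards the scene, < 0 means moving away.
    """
    scale = forward_speed / depth
    return (x * scale, y * scale)

def radial_component(x, y, flow):
    """Positive when the flow points away from the image centre, negative when towards it."""
    return x * flow[0] + y * flow[1]

sample_points = [(-0.4, 0.1), (0.3, -0.2), (0.0, 0.5)]  # image positions, assumed
depth = 20.0  # metres, assumed equal for all points for simplicity

for speed, label in [(10.0, "moving forward"), (-10.0, "moving backward")]:
    signs = [radial_component(x, y, image_flow(x, y, depth, speed)) for x, y in sample_points]
    direction = "outward" if all(s > 0 for s in signs) else "inward"
    print(f"{label}: flow is {direction} relative to the central point")

# The point the observer is heading towards has zero flow, so it appears motionless.
print("flow at the centre:", image_flow(0.0, 0.0, depth, 10.0))
```

In this model the motionless central point corresponds to the landing point described for pilots, and the sign of the radial flow distinguishes approaching from receding.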
2. The Role of Invariants in Perception
We rarely see a static view of an object or scene. When we move our head and eyes or walk around our environment, things move in and out of our viewing fields. Textures expand as you approach an object and contract as you move away.
There is a pattern or structure available in such texture gradients which provides a source of information about the environment. This flow of texture is INVARIANT, i.e. it always occurs in the same way as we move around our environment and, according to Gibson, is an important direct cue to depth. Two good examples of invariants are texture and linear perspective.
3. Affordances
Affordances are, in short, cues in the environment that aid perception. Important cues in the environment include:
OPTICAL ARRAY: The patterns of light that reach the eye from the environment.
RELATIVE BRIGHTNESS: Objects with brighter, clearer images are perceived as closer.
TEXTURE GRADIENT: The grain of texture gets smaller as the object recedes. Gives the impression of surfaces receding into the distance.
RELATIVE SIZE: When an object moves further away from the eye the image gets smaller. Objects with smaller images are seen as more distant.
SUPERIMPOSITION: If the image of one object blocks the image of another, the first object is seen as closer.
HEIGHT IN THE VISUAL FIELD: Objects further away are generally higher in the visual field.
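Two of the cues listed above, texture gradient and height in the visual field, follow from the same simple geometry and can be sketched together. The example assumes an observer standing on flat ground with the eye at a fixed height, looking at evenly spaced markings on the ground; a marking at ground distance d then appears at an angle of atan(h / d) below the horizon. The function name, eye height, and distances are hypothetical.

```python
import math

def angle_below_horizon_deg(eye_height_m, ground_distance_m):
    """Angle below the horizon at which a point on flat ground appears."""
    return math.degrees(math.atan(eye_height_m / ground_distance_m))

eye_height = 1.6  # metres, assumed
distances = [2.0, 4.0, 6.0, 8.0, 10.0]  # evenly spaced ground markings, assumed

angles = [angle_below_horizon_deg(eye_height, d) for d in distances]
# Angular gap between each marking and the next one further away.
spacings = [near - far for near, far in zip(angles, angles[1:])]

for d, a in zip(distances, angles):
    print(f"marking at {d:4.1f} m appears {a:5.2f} deg below the horizon")
print("angular gaps between neighbouring markings:", [round(s, 2) for s in spacings])
```

Further markings sit closer to the horizon, that is, higher in the visual field, and the angular gaps between equally spaced markings shrink with distance, which is the compression of texture that signals a receding surface.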
Evaluation of Gibson's (1966) Direct Theory of Perception
Visual Illusions
Gibson's emphasis on DIRECT perception provides an explanation for the (generally) fast and accurate perception of the environment. However, his theory cannot explain why perceptions are sometimes inaccurate, e.g. in illusions. He claimed the illusions used in experimental work constituted extremely artificial perceptual situations unlikely to be encountered in the real world; however, this dismissal cannot realistically be applied to all illusions.
For example, Gibson's theory cannot account for perceptual errors like the general tendency for people to overestimate vertical extents relative to horizontal ones.
Neither can Gibson's theory explain naturally occurring illusions. For example, if you stare for some time at a waterfall and then transfer your gaze to a stationary object, the object appears to move in the opposite direction.
Bottom-up or Top-down Processing?
Neither direct nor constructivist theories of perception seem capable of explaining all perception all of the time. Gibson's theory appears to be based on perceivers operating under ideal viewing conditions, where stimulus information is plentiful and is available for a suitable length of time. Constructivist theories, like Gregory's, have typically involved viewing under less than ideal conditions.
Research by Tulving et al. manipulated both the clarity of the stimulus input and the impact of the perceptual context in a word identification task. As the clarity of the stimulus (through exposure duration) and the amount of context increased, so did the likelihood of correct identification. However, as the exposure duration increased, the impact of context was reduced, suggesting that if stimulus information is high, then the need to use other sources of information is reduced. One theory that explains how top-down and bottom-up processes may be seen as interacting with each other to produce the best interpretation of the stimulus was proposed by Neisser (1976), and is known as the 'Perceptual Cycle'.
References
DeCasper, A. J., & Fifer, W. P. (1980). Of human bonding: Newborns prefer their mothers' voices. Science, 208(4448), 1174-1176.
Gibson, J. J. (1966). The Senses Considered as Perceptual Systems. Boston: Houghton Mifflin.
Gibson, J. J. (1972). A Theory of Direct Visual Perception. In J. Royce & W. Rozeboom (Eds.), The Psychology of Knowing. New York: Gordon & Breach.
Gregory, R. (1970). The Intelligent Eye. London: Weidenfeld and Nicolson.
Gregory, R. (1974). Concepts and Mechanisms of Perception. London: Duckworth.
Slater, A., Morison, V., Somers, M., Mattock, A., Brown, E., & Taylor, D. (1990). Newborn and older infants' perception of partly occluded objects. Infant Behavior and Development, 13(1), 33-49.
How to cite this article:
McLeod, S. A. (2007). . Retrieved from