
INTERPRETATION, PERCEPTION AND COGNITION IN VISION



PROCESSES IN BIOLOGICAL VISION

by JAMES T. FULTON


Updated: July 2009           To abstract or reference this material, see Citation Page

This is a summary page relying upon a large amount of other material. The webpages associated with the overall VISION PROCESS, the facility for READING and the development of the underlying mechanism of the NEURON have provided the groundwork for a detailed development of the MECHANISM employed in INTERPRETATION and PERCEPTION leading to COGNITION via the visual system. A more detailed foundation and discussion appears in the text PROCESSES IN BIOLOGICAL VISION.

The above documentation has led to a large compendium of the properties of the human VISUAL SYSTEM. However, that document has not developed the mechanism of perception and interpretation. That is the subject of this webpage. To follow the development, the reader should be familiar with the above references. This page is subdivided into:

  1. POSTULATES DRAWN FROM THE EARLIER DISCUSSIONS
  2. THE APPLICABILITY OF THE ANALYTICAL PATH
  3. THE ANALOG NATURE OF THE ANALYTICAL SIGNALING REGIME
  4. THE ORGANIZATION OF THE ANALYTICAL SIGNAL CHANNEL
  5. THE INFORMATION BANDWIDTH OF THE ANALYTICAL PATH
  6. THE DEFINITION OF INTERPRETATION, PERCEPTION and RECOGNITION

POSTULATES DRAWN FROM THE EARLIER DISCUSSIONS

This work has highlighted many features of the visual system that have not been clearly defined in the scientific literature. These features have led to a number of paradigm shifts in the proposed operating methodology of the system. Some of the shifts have resulted in major extensions of our understanding of how the system operates. This is particularly true in the area of the mechanism of perception and interpretation. The literature does not contain any other concrete discussions or proposals related to this mechanism.

The following discussion will rely on many architectural features, processes, mechanisms and facilities developed earlier. These have been tabulated in a section on MAJOR THEMES of this work. They include:

  1. The eye employs an immersed optical system.
  2. The eye is fundamentally a change detector, not an imager.
  3. The neuron is an electrolytic semiconductor device.
  4. The vast majority of the neurons in a given animal are operated in the analog mode.
  5. Light adaptation occurs prior to any other signal processing.
  6. The signaling within the visual system involves at least four distinct stages and four distinct signaling paths.
  7. The signal processing within the brain is concentrated in a large number of individual processing engines.
  8. The bandwidth of the signal paths within an engine is very high.
  9. The signaling bandwidth between engines is relatively low.

THE APPLICABILITY OF THE ANALYTICAL PATH

The visual system is normally asked to process three fundamental types of images.

All visual systems are designed to accommodate the first two forms of images. Only the human species is known to be able to accommodate the last, particularly when the information is at a fine level of detail.

All initial images are processed by the awareness signal channel that relies upon the path from the retina to the so-called primary visual cortex via the lateral geniculate nuclei (LGN). Part of the awareness channel is used in an alarm mode to notify the pretectum of the analytical channel of threats to the subject in the environment. The major analytical channel of vision, as well as the volition channel, involves the pretectum and the analytical visual cortex rather than the LGN and primary visual cortex.

The limited angular size of the foveola (1.2 degrees in object space, only 0.9 degrees in image space measured from the exit pupil of the lens group) constrains the analytical processing channel. For scene material falling outside of this cone, multiple saccades are required to perceive and interpret the entire object of attention. Scene components external to this cone can still be perceived at an awareness level, but they cannot be fully perceived and interpreted without at least one major saccade.

Only about three or four characters of ten point type are included within the boundary of the foveola when viewed at fifteen inches (a nominal height of 30 arc minutes in object space). These characters constitute a character group in terms of perception and interpretation. Such groups typically form individual words or syllables of longer words.
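The geometry behind these figures can be checked with a short calculation. The following is a minimal sketch, not part of the original analysis. It assumes 72 typographic points per inch and an average character advance of 0.6 of the point size; the advance is a hypothetical value chosen for illustration, and the function names are likewise illustrative.

```python
import math

POINTS_PER_INCH = 72.0  # typographic points per inch


def angular_size_arcmin(size_inches, distance_inches):
    """Full angle subtended by an object of the given size, in arc minutes."""
    angle_rad = 2.0 * math.atan(size_inches / (2.0 * distance_inches))
    return math.degrees(angle_rad) * 60.0


def characters_in_foveola(point_size, distance_inches,
                          foveola_deg=1.2, advance_em=0.6):
    """Rough count of characters spanning the foveola in object space.

    advance_em is an ASSUMED average character advance, expressed as a
    fraction of the point size. It is not a figure taken from the text.
    """
    char_width = advance_em * point_size / POINTS_PER_INCH
    span = 2.0 * distance_inches * math.tan(math.radians(foveola_deg) / 2.0)
    return span / char_width


height_arcmin = angular_size_arcmin(10.0 / POINTS_PER_INCH, 15.0)
n_chars = characters_in_foveola(10.0, 15.0)
print(f"10 point type at 15 inches subtends about {height_arcmin:.0f} arc minutes")
print(f"about {n_chars:.1f} characters span the 1.2 degree foveola")
```

Under these assumptions the type height comes out close to the nominal 30 arc minutes, and the character count falls in the three-to-four range quoted above. A narrower or wider typeface would shift the count accordingly.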

The ways in which pictographs and symbolic text are perceived and interpreted differ sufficiently to require individual attention at the detail level. These differences primarily relate to their effective utilization of the area of the foveola and the arrangements of the oculomotor muscles. However, at the mechanistic level associated with perception and interpretation, there is no significant difference.

THE ANALOG NATURE OF THE ANALYTICAL SIGNALING REGIME

The neural system has historically been considered to operate almost exclusively in the phasic signaling mode, a view that has been due to inadequate instrumentation. The vast majority of the neurons in the system are operated in the analog mode as components of the signal detection, signal processing and information cognition stages of the system. Only the interconnections via the signal projection stages employ action potentials in the phasic mode for their operation. Less than one percent of the neurons of a given animal system operate in the phasic or tonic mode. All of the neurons of the retina before the ganglion cells operate in the analog mode, including the photoreceptor cells. Similarly, all of the signal processing neurons of the midbrain, cerebellum and cerebrum operate in the analog mode. Only the interconnection, or association, signal paths operate in the phasic mode.

The analog signal processing channels are readily capable of signal summation, differencing, correlation, thresholding and some compression. They are not generally capable of multiplication (except at the output of the photoreceptor cells).
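As a purely numerical analogy, the operations named above can be written out on short lists of sampled amplitudes. This is an illustrative sketch only and makes no claim about the underlying neural circuits. The logarithmic compression curve, the threshold level and the sample values are all assumptions. Correlation is expressed here as agreement between thresholded signals, so that no signal is multiplied by another.

```python
import math


def summation(a, b):
    """Pointwise sum of two sampled signals."""
    return [x + y for x, y in zip(a, b)]


def differencing(a, b):
    """Pointwise difference of two sampled signals."""
    return [x - y for x, y in zip(a, b)]


def thresholding(signal, level):
    """True wherever the signal exceeds the given level."""
    return [x > level for x in signal]


def compression(signal):
    """Logarithmic compression of non-negative amplitudes (an ASSUMED curve)."""
    return [math.log1p(x) for x in signal]


def correlation(a, b, level):
    """Fraction of samples on which two thresholded signals agree.

    Only comparison and counting are used; the two signals are never
    multiplied together.
    """
    matches = [p == q for p, q in zip(thresholding(a, level),
                                      thresholding(b, level))]
    return sum(matches) / len(matches)


# Hypothetical sample values chosen for illustration.
left = [0.1, 0.9, 0.8, 0.2, 0.7]
right = [0.2, 0.8, 0.9, 0.1, 0.1]

print("sum:        ", summation(left, right))
print("difference: ", differencing(left, right))
print("threshold:  ", thresholding(left, 0.5))
print("compressed: ", compression(left))
print("correlation:", correlation(left, right, 0.5))
```

The two sample signals agree on four of their five thresholded samples, which is what the correlation value reports.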

THE ORGANIZATION OF THE ANALYTICAL SIGNAL CHANNEL

The analytical signal path extends from the foveola of the retina to area 7 of the analytical visual cortex by way of the thalamus of the midbrain and in cooperation with the cerebellum. The major elements of the thalamus associated with the analytical channel include the thalamic reticular nucleus (TRN), the perigeniculate nucleus (PGN) and the pulvinar. The TRN acts as the master control for all visual functions. The PGN acts as the primary feature extraction engine of the analytical mode of vision. The pulvinar is a major random access memory supporting the interpretation and perception of features from a scene. These paths and circuits are discussed in detail in Chapter 15 and Section 7.5 of PROCESSES IN BIOLOGICAL VISION.

The Critical Role of the Perigeniculate nucleus (Pretectum in lower animals)

The perigeniculate nucleus (PGN) plays several key roles in the analysis task as a key part of the Precision Optical Servomechanism Subsystem (POS).

The roles of the pretectum and the cerebellum are crucially important in reducing the volume of the signal information received from the retina (particularly the foveola).

The Character of the Information Presented to the Perigeniculate nucleus

The foveola of each human retina contains a nominal 23,000 photoreceptor cells. The special morphology of the fovea suggests that each of these cells is connected directly to the PGN of the midbrain via the signal projection stage. There is little or no signal processing within the retina related to the foveola.

It is important to note that while the signal projection stage employs phasic signaling via action potentials, the signals from the foveola are returned to analog form at, and before processing by, the PGN. Therefore, in the first order, phasic signaling can be ignored when discussing signal processing. All signal processing in the visual system and within the brain occurs in the analog or tonic domain.

The signals delivered to the PGN have been normalized in analog amplitude by the adaptation amplifiers of the photoreceptor cells before they are encoded by the ganglion cells and decoded by the stellate cells upon reaching the PGN.

There are no digital circuits within the neural system as distinguished from phasic circuits.
There are also no circuits designed specifically as binary circuits.
However, there are limiting and correlation circuits that when combined produce an output signal that has been optimized for best signal to noise ratio. Some of these signals exhibit a nearly binary amplitude characteristic but they are not encoded in serial binary form.

The performance of the analog signal processing circuits of the analytical channel is sensitive to the contrast of the image presented to the foveola. Poor image contrast limits the performance of the perception and interpretation mechanisms.

As shown in the webpage displaying the image of an object projected on the foveola,

  1. Only a single object within an overall scene or a single character group is projected onto the foveola at a given instant.
  2. The tremor causes the image to repeatedly, and at least partially, obscure individual photoreceptors.
  3. A single microsaccade, either horizontal or vertical, generates signals related to all of the edges of the object or character group that are perpendicular to the saccade.

Because of the operation of the adaptation amplifiers, this process generates an analog signal at the output of each affected photoreceptor cell even if only a fraction of the area of the cell is obscured. With each horizontal or vertical microsaccade, a large number of parallel signals are created in the individual photoreceptor circuits.
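The edge selectivity of a single microsaccade can be illustrated with a toy model. The sketch below is a deliberate simplification: it treats the photoreceptor mosaic as a square grid, the image as binary, and the microsaccade as a displacement of exactly one cell, whereas the text notes that a partial obscuration is sufficient to generate a signal. The grid, the "L"-shaped figure and the function names are all hypothetical.

```python
def shift_right(image, pixels=1, background=0):
    """Return a copy of the image displaced to the right by `pixels` columns."""
    return [[background] * pixels + row[:len(row) - pixels] for row in image]


def change_signals(before, after):
    """Per-cell change in illumination (after minus before); 0 means no change."""
    return [[b - a for a, b in zip(row_before, row_after)]
            for row_before, row_after in zip(before, after)]


# Hypothetical 6 x 7 patch: an "L" made of one vertical and one horizontal stroke.
image = [
    [0, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

# One horizontal "microsaccade" of a single cell.
signals = change_signals(image, shift_right(image))
active_per_row = [sum(1 for s in row if s != 0) for row in signals]

for row in signals:
    print(" ".join(f"{s:+d}" if s else " 0" for s in row))
print("active cells per row:", active_per_row)
```

In this toy model the vertical stroke, which is perpendicular to the motion, produces a pair of signals on every row it occupies. The horizontal stroke produces signals only at its two ends, and its interior cells remain silent.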

The scanning pattern employed by the POS to perceive and interpret a typical image is not known at the current time.

The goal of the perception and interpretation process is to process this large number of originally spatially correlated, but now also temporally correlated, analog signals into a single vectorial sample in time indicative of the scene.

The creation of a single vectorial sample in time representing a scene, without taking an inordinate amount of time, requires a massive amount of parallel signal processing.

Typically, when reading western languages, only about one-third of the photoreceptors of the foveola are directly involved at one time. The blank space between lines of text relieves the processing load related to the whole foveola. The bolder the type face, the more photoreceptors remain covered throughout a microsaccade and the lower the volume of signals requiring processing within one time interval.

The signal processing involved in perception and interpretation can only be sketched. It includes a large number of individual correlation circuits employing Boolean algebra in their algorithms. When perceiving text, the circuits include:

  1. Circuits to determine the longest individual edge and thereby determine the factor required to normalize the size of the character set.
  2. Circuits to determine the location of the median of all finite width strokes in each character.
  3. Circuits to isolate all of the strokes that form a continuous stroke pattern.
  4. Circuits to compare each stroke pattern with a section of the saliency map in order to interpret the meaning of each standalone character (phoneme).
  5. Circuits to compare the sequence of phonemes perceived during a single gaze against a portion of the saliency map in order to interpret the meaning of the character group.
  6. Circuits to prepare a timely vectorial output signal defining the meaning of the imaged portion of the scene or text.

The MASSIVELY PARALLEL nature of the processing

The amount of processing required to correlate the stroke patterns, phonemes and character groups against the stored saliency map suggests that much of this activity involves the cerebellum as the master lookup table of the brain. This correlation activity takes time. However, the total time delay can be compared to the time delay associated with the signal projection circuits. These delays are typically three milliseconds between the retina and the pretectum and between the pretectum and the analytical visual cortex.

  • As in the other related neural processes, it appears the pretectum coordinates the correlation activity and controls the transmission of the vectorial signal representing the concept imaged on the foveola to the analytical visual cortex.
  • It is not known how many individual neural paths are required to transmit the vectorial output signal to the analytical visual cortex. The output of the interpretation process can be considered to be a parallel encoded sampled data signal stream with the tonic signals on each path exhibiting a binary-like amplitude characteristic under high contrast image input conditions.

The above sequence highlights the fact that the vectorial signals traveling between the pretectum and the analytical visual cortex are completely uncorrelated in time and image space. They are also encrypted as far as the human investigator is concerned. For very simple images, virtually no information is transmitted over the Pulvinar Pathway. For very complex images, the individual data samples show nearly zero apparent correlation over intervals of more than ten milliseconds. Only if the vectorial code is understood will the actual correlation in time be discernable.

THE INFORMATION BANDWIDTH OF THE ANALYTICAL PATH

The goal of the analytical signal path is to accept analog signals generated by an area image projected onto the 23,000 photoreceptor cells of the foveola and to output a single interpretation of their conceptual content in vectorial form following a relatively constant time delay. It appears this time delay should be short relative to the other time delays of the analytical signal path, about three milliseconds for both the optic nerve and the Pulvinar Pathway.

The Condensation of the Information along the Analytical Signal Path when Reading.

As seen in the READING pages of this site, each of the 23,000 photoreceptor cells of the foveola can output a signal modulated by the tremor, with temporal spectral energy in the 30-150 Hz range. If all of these signals arrived simultaneously at the pretectum, a significant information volume would have to be processed (roughly 2.3 million samples per second). Fortunately, much of the image projected on the foveola is blank space, or the interior of solid strokes, at any given time. This reduces the information rate passing over the optic nerve by about an order of magnitude relative to the possible sample rate.
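The arithmetic behind these estimates can be written out explicitly. This is a back-of-the-envelope sketch rather than a measurement. The per-cell rate of 100 samples per second is an assumption, chosen because it lies within the 30-150 Hz band and reproduces the 2.3 million figure quoted above. The factor of ten stands in for "about an order of magnitude".

```python
FOVEOLA_CELLS = 23_000            # photoreceptor cells in the foveola (from the text)
SAMPLES_PER_CELL_PER_SEC = 100    # ASSUMED value within the 30-150 Hz band
REDUCTION_FACTOR = 10             # "about an order of magnitude" (from the text)

# Upper bound: every cell delivers a sample stream at the same time.
peak_rate = FOVEOLA_CELLS * SAMPLES_PER_CELL_PER_SEC

# Typical case when reading: blank space and stroke interiors stay silent.
typical_rate = peak_rate // REDUCTION_FACTOR

print(f"peak rate:    {peak_rate:,} samples per second")
print(f"typical rate: {typical_rate:,} samples per second")
```

The typical rate obtained this way is on the order of one quarter million samples per second along each optic nerve.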

During processing within the midbrain (and/or cerebellum), each data sample is probably manipulated between 50 and 200 times during the various correlation processes. The total internal sample rate within the perception and interpretation engine of the midbrain probably exceeds one to five million samples per second. The capacity of the memory required to support the saliency map associated with this correlation process has not been estimated. However, the output of the process is probably represented by a single vector containing on the order of five to ten individual components in parallel.

The Summation of the Signal Processing Capability of the Overall Analytical Signal Path when Reading.

If one combines the signal processing accomplished within the midbrain with the additional signal processing associated with the analytical visual cortex, the total sample rate is clearly over ten million samples per second. If that processing leads to any cognition using engines of the frontal lobe of the cortex, cumulative sampling rates of over 20-50 million samples per second are easily obtained. These processing rates are required and achieved even though the signal throughput of the analytical channel along each optic nerve is only about one quarter million samples per second.

References in the literature to total throughput rates for the entire human brain on the order of 10 bits/second appear far fetched by any method of calculation.

THE DEFINITION OF INTERPRETATION, PERCEPTION and RECOGNITION

This section supports a number of other pages of this site, including

INTERPRETATION, PERCEPTION and RECOGNITION need to be carefully defined if they are to be fully differentiated and understood. In this work,

INTERPRETATION is the function of analyzing the small instantaneous image projected onto the foveola. It is a primary function of the Precision Optical System (POS). The POS has historically been referred to as the auxiliary optical system by anatomists. The actual extraction of the features of the image occurs in the perigeniculate nucleus of the midbrain. The output of the process is a vector that is largely independent of image space and is representative of the graphical features of the object scene. It says "the image contains the following strokes interconnected in the following way < LIST >. The color within each enclosed series of strokes is defined as < LIST >." This initial vector can be described as an initial interp, a minimal piece of information about only the instantaneous image presented to the foveola. This vector is stored in an initial interp map.

The initial interps from a series of saccades result in a more complex vector that is assembled by a portion of the pulvinar. The pulvinar relies upon its random access memory to perform this initial simplification process. The resulting vector is stored in the complete interp map associated with the scene.


PERCEPTION is the function of reducing the relatively complex vector signal, the interp, produced by the INTERPRETATION process to a simpler vector representative of the object imaged onto the foveola. This interpretation occurs largely within the midbrain and probably involves the short term memory of the pulvinar and of the cerebellum. The resulting vector signal is called a percept. It says "it is the face of a woman with this set of auxiliary features." The vector signal is passed to Area 7 of the cortex.

RECOGNITION is the function of placing a vector received by Area 7 of the cortex from the interpretation facility of the midbrain in proper context by comparing it to previously stored features of the saliency map (in vector space). This is a cognitive process associated with the forebrain (the forward areas of the cerebral cortex). The term "map" is used here to describe a general database of largely unknown content and arrangement used by the cortex and shared with all sensory information [15.2.1 through 15.2.5]. The result of this process is an even simpler vector that says "it is grandma. She is smiling, six feet away and turning to the left." In the absence of recognition based on the historical saliency map, a tentative new entry is made in the saliency map and the POS is generally requested to study the image in more detail and provide more information for perception, interpretation and recognition before adding a permanent entry to the saliency map.