Pattern Recognition - perception - identifying stimuli - constructing the meaning of a sensory experience - naming; basic processes - analysis, comparison, and decision; matching information extracted from the stimulus to information stored in LTM
Theories of Pattern Recognition (objects)
 Gestalt - the whole is greater than the sum of its parts; principles of perceptual organization

 Template theory - sensory information is compared directly to miniature copies stored in LTM; inflexible and uneconomical; pre-comparison processor (clean-up)
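The inflexibility of direct template comparison can be illustrated with a minimal sketch. The 5x5 grids, the `match` and `recognize` helpers, and the 0.9 criterion are all hypothetical choices made for illustration; they are not part of any published template model.

```python
# Hypothetical 5x5 binary templates; "1" = ink, "0" = blank.
TEMPLATES = {
    "T": ["11111", "00100", "00100", "00100", "00100"],
    "L": ["10000", "10000", "10000", "10000", "11111"],
}

def match(stimulus, template):
    """Fraction of grid cells on which stimulus and template agree."""
    cells = [s == t
             for s_row, t_row in zip(stimulus, template)
             for s, t in zip(s_row, t_row)]
    return sum(cells) / len(cells)

def recognize(stimulus, criterion=0.9):
    """Return the best-matching template name, or None if below criterion."""
    best = max(TEMPLATES, key=lambda name: match(stimulus, TEMPLATES[name]))
    return best if match(stimulus, TEMPLATES[best]) >= criterion else None

# The same T, shifted down by one row.
SHIFTED_T = ["00000", "11111", "00100", "00100", "00100"]

print(recognize(TEMPLATES["T"]))  # exact copy matches the stored T
print(recognize(SHIFTED_T))       # shifted copy falls below the criterion
```

A one-row shift is enough to break the match, which is why a template account needs either a separate template for every position, size, and orientation (uneconomical) or a clean-up stage that normalizes the input before comparison.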

 Prototype theory -  prototypes are abstract forms representing the basic or most important elements of a set of stimuli; each stimulus is a member of a class of stimuli and shares key attributes of that class.

 Prototypes are formed on the basis of frequently experienced features; any stimulus can be encoded as a prototype and a list of variations and incoming stimuli are compared to these prototypes.
 Franks & Bransford (1971) showed Ss geometric type forms which could be formed into structured groupings; recognition test - Ss recognized prototypes even though they had never seen them.
 Posner, Goldsmith, & Welton (1967) - Ss saw patterns of dots; at test Ss saw old distortions, new distortions, and prototypes - they falsely recognized the never-before-seen prototypes.
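One way to see why a never-presented prototype can feel familiar is to treat the stored summary as an average of the studied distortions. The sketch below is only an illustration under that assumption: the three-dot pattern, the offsets, and the `distort`, `average`, and `distance` helpers are hypothetical and are not the stimuli or analysis used in the dot-pattern studies.

```python
import math

# Hypothetical three-dot prototype; it is never shown at study.
PROTOTYPE = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]

def distort(pattern, offsets):
    """Move each dot by its own (dx, dy) offset."""
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(pattern, offsets)]

# Studied items are distortions of the prototype.
STUDIED = [
    distort(PROTOTYPE, [(1, 0), (0, 1), (-1, 0)]),
    distort(PROTOTYPE, [(-1, 0), (0, -1), (1, 0)]),
    distort(PROTOTYPE, [(0, 1), (1, 0), (0, -1)]),
    distort(PROTOTYPE, [(0, -1), (-1, 0), (0, 1)]),
]
NEW_DISTORTION = distort(PROTOTYPE, [(1, 1), (-1, 1), (0, -1)])

def average(patterns):
    """Dot-by-dot mean of several patterns (the abstracted summary)."""
    n = len(patterns)
    return [(sum(p[i][0] for p in patterns) / n,
             sum(p[i][1] for p in patterns) / n)
            for i in range(len(patterns[0]))]

def distance(a, b):
    """Summed dot-to-dot Euclidean distance between two patterns."""
    return sum(math.dist(p, q) for p, q in zip(a, b))

abstracted = average(STUDIED)
print("prototype:      ", distance(abstracted, PROTOTYPE))
print("old distortion: ", distance(abstracted, STUDIED[0]))
print("new distortion: ", distance(abstracted, NEW_DISTORTION))
```

Under these assumptions the unseen prototype lies closer to the abstracted summary than any studied item does, so a comparison against that summary would favor it.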
 Rosch's work on concept formation:
  categories  are based on prototypes - we decide if an object is a member of a category by comparing it to a prototype or best example of the category (fuzzy sets - imprecise)
  object naming occurs at levels in a hierarchy (superordinate, basic, and subordinate); basic level names reflect prototypes (objects can be classified at many levels but at the basic level the category carries the most information and categories are most different from each other)

  characteristics of the basic level: 1) members have attributes in common; 2) have motor movements in common; 3) shapes in common; 4) names are used to identify objects

  prototypes are: 1) learned quickly; 2) serve as reference points; 3) have common attributes in a family resemblance; 4) are supplied as examples of the category; 5) are judged more quickly after priming
 

 Exemplars (Smith & Medin, 1981) - one remembers separate instances (exemplars) rather than an averaged concept abstracted from experience; several exemplars are stored for a category, and we categorize an object by seeing whether it resembles many remembered exemplars from a single category - a reliance on concrete examples. People know a lot about the possible variation of members within the category; the exemplar view accounts for correlations of attributes within a category (a single summary prototype doesn't; e.g., cash registers and formica tables in certain restaurants). It is easy to modify existing categories with new instances.
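The point about correlated attributes can be made concrete with a small sketch. Everything here is an assumption for illustration: the four stored restaurants, the two binary attributes, the multiplicative similarity rule, and the `MISMATCH` value of 0.2. It is not the model specified by Smith & Medin.

```python
# Hypothetical stored instances of "restaurant":
# each tuple is (has_cash_register, has_formica_tables).
EXEMPLARS = [
    (1, 1),  # a diner
    (1, 1),  # another diner
    (0, 0),  # a formal dining room
    (0, 0),  # another formal dining room
]
MISMATCH = 0.2  # assumed similarity contributed by a mismatching attribute

def similarity(a, b):
    """Multiplicative similarity: 1 per matching attribute, MISMATCH otherwise."""
    result = 1.0
    for x, y in zip(a, b):
        result *= 1.0 if x == y else MISMATCH
    return result

def exemplar_evidence(item):
    """Summed similarity of the item to every stored exemplar."""
    return sum(similarity(item, e) for e in EXEMPLARS)

def prototype_distance(item):
    """City-block distance from the item to the averaged prototype."""
    n = len(EXEMPLARS)
    proto = [sum(e[i] for e in EXEMPLARS) / n for i in range(len(item))]
    return sum(abs(x - p) for x, p in zip(item, proto))

typical = (1, 1)  # cash register AND formica tables
odd = (1, 0)      # cash register but no formica tables

print(exemplar_evidence(typical), exemplar_evidence(odd))
print(prototype_distance(typical), prototype_distance(odd))
```

With these made-up values the averaged prototype is (0.5, 0.5), so both test items are equally far from it, while the exemplar store clearly favors the item whose attributes co-occur the way the remembered instances do.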

 Feature theory - a pattern is a configuration of elements (features) that belong together; pattern recognition begins with the extraction of features from the stimulus; analysis-by-synthesis (Halle & Stevens - feature recognition for speech) - several sources of information combine in our analysis of incoming speech sounds; listeners construct internal copies of the sounds that they think they are hearing and then compare the copies to the input (a narrowing-down procedure)
 LTM is a set of rules used to construct an internal pattern to compare to the input.

Hubel & Wiesel (1965, 1979) - single cell recordings in the visual cortex of cats - individual cortical cells responded to specific features presented to specific retinal regions

 Neisser - scan lists; feature similarity between target and list was varied
 Gibson (1969) - letters are identified on the basis of feature analysis (12 features)
 Gibson, Shapiro, & Yonas (1968) length of time to say that two letters were the same or different varied with feature similarity
 Lupker (1979) rapidly presented letters were confused on the basis of their features and their overall shape
 Pritchard (1961) nystagmus and stabilized retinal images - the stabilized image disappears feature-by-feature
 Healy (1981) mark spelling errors in passage; errors had been made using letter substitutions varying the similarity to the proper letter (e/o vs. e/l)
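The studies above share one prediction: the more features two letters have in common, the more easily they are confused. A minimal sketch of that prediction follows. The feature lists are hypothetical and chosen only for illustration (they are not Gibson's feature set), and the overlap measure (shared features divided by all features) is an assumed choice.

```python
# Hypothetical feature lists for four lowercase letters.
FEATURES = {
    "e": {"curve", "closed_loop", "horizontal_line"},
    "o": {"curve", "closed_loop"},
    "c": {"curve"},
    "l": {"vertical_line"},
}

def overlap(a, b):
    """Shared features divided by all features of the two letters (0 to 1)."""
    fa, fb = FEATURES[a], FEATURES[b]
    return len(fa & fb) / len(fa | fb)

for pair in [("e", "o"), ("e", "c"), ("e", "l")]:
    print(pair, round(overlap(*pair), 2))
```

On these assumed features, an e/o substitution changes little of the feature description while an e/l substitution changes all of it, so a feature account predicts that the e/o error is the harder one to notice.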

 

Recognition by Components (or Structural Model; based on features) - the relationship among features is important; some features are more important to the overall pattern than others

 Biederman (1985) showed objects with parts (features) deleted, the deletions occurring either at the vertices or in the middle segments (presented for 40 msec); with the middle segments deleted - 70% accuracy; with the vertices deleted - less than 50% accuracy
 Rhodes, Brennan, & Carey (1987) faster at identifying caricature drawings than accurate drawings

(Biederman, 1987/1991) Network theory of pattern recognition.

Geons - the building blocks of all objects (simple 3-D shapes such as cylinders, cones, and wedges) are combined in many ways (on top of, to the side, etc.).
Hierarchy of detectors:
a) feature detectors - lowest; respond to curves, edges, etc.
b) geon detectors - activated by feature detectors
c) higher level detectors - recognize combinations of features and geons

Biederman & Cooper (1991) found that pictures sharing the same geons prime each other
Biederman (1995) described how geons can vary in their orientation to other geons, and in the ratio between their length and width

viewer-centered approach - we store a small number of views (orientations) of 3-D objects so that we can mentally rotate an object to match a view (Dickinson, 1999; Tarr & Vuong, 2002). Because this rotation takes time and sometimes results in errors, it can be tested.

 

Context and Pattern Recognition
How does pattern recognition happen so quickly? (serial or parallel). Is PR stimulus driven or conceptually driven (bottom-up; top-down processing)?
 Neisser (1964) visual search task; lines of letters - start at the top and find the target letter(s); press a RT button; regardless of how many target letters, Ss responded about as quickly as they did with one letter

 Context effects (Navon, 1977; Palmer, 1975) - if an object is presented in an appropriate context, it is recognized faster

 Weisstein & Harris (1974) - detect a line embedded in a coherent or in a less coherent form - object superiority effect (What would feature theory predict?)
 Warren & Warren (1970) - a cough was placed in an auditorily presented sentence; depending on the sentential context, the word was perceived differently

 

Perceptual Cycle Model of PR (conceptually driven and data driven processing) - Neisser - the perceptual cycle combines sensory information (data) and hypotheses (LTM knowledge). Activity cycles back and forth between raw data and anticipatory schemas (states of readiness for certain kinds of information to occur). Context limits the number of possibilities for a pattern; context serves as a guide to recognition. We do not perceive just the whole or just the parts.

change blindness - the inability to detect changes in a scene or in an object (Simons & Levin, 1997a, 1997b, 1998); the stranger and the door study

inattentional blindness - when focused on something, we often fail to notice something new; truckers failing to see a motorcycle; when is there a gorilla playing basketball? (Simons & Chabris, 1999)

Pattern Recognition and Word Recognition

 
 Perceptual cycle in reading (how rapidly are words vs. letters identified?)
 Huey (1908) letters and familiar words were identified in the same amount of time; words are often identifiable when individual letters are unidentifiable
 Reicher (1969) Word Superiority Effect; demonstrated the perceptual advantage of words over nonwords; we identify a letter faster when it is embedded in a word
 Solman, May & Schwartz (1981) word superiority may have two sources - sensory memory, where some elements are retained longer, or word knowledge (LTM) being used to guess; used high constraint (WISH) or low constraint (RISE) words in degraded form, or nonwords (IERS); performance was the same for high and low constraint words
 Pollatsek & Rayner (1989) parallel letter model - letters are activated in parallel and the activations are sent to parallel word detectors. Assumptions: visual perception involves parallel processing of all letters at time 1; recognition occurs simultaneously at 3 different levels of abstraction (feature/letter/word) and these levels interact to determine what we perceive

Word in a Sentence Effect (Rueckl & Oden, 1986) - "both the features of the stimulus and the nature of the context influence word recognition" (Matlin, 2004, p. 47). Letter and letter-like stimuli were used (a perfectly formed r, a perfectly formed n, and 3 symbols that were intermediate between r and n). The stimulus letters were embedded in the word bea_s (testing bottom-up processing). The sentence in which the word appeared was also varied: "The _____ raised to supplement his income," with the blank filled by lion tamer or zookeeper, botanist or dairy farmer. People were more likely to choose the bears response when the sentence contained lion tamer or zookeeper, and beans when it contained botanist or dairy farmer (top-down).
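A rough sketch of how bottom-up and top-down evidence might jointly determine the response is given below. The combination rule (multiply the two sources of support, then normalize), the five `R_LIKENESS` values, and the context weights are all assumptions made for illustration; they are not the data or the fitted model from the study.

```python
# Hypothetical bottom-up support that the ambiguous letter is an "r",
# for five stimuli running from a clear n to a clear r.
R_LIKENESS = [0.1, 0.3, 0.5, 0.7, 0.9]

# Hypothetical top-down support for "bears" given the sentence subject.
CONTEXT_FAVORS_BEARS = {"zookeeper": 0.8, "botanist": 0.2}

def p_bears(r_likeness, context):
    """Assumed rule: multiply stimulus and context support, then normalize."""
    c = CONTEXT_FAVORS_BEARS[context]
    bears = r_likeness * c
    beans = (1 - r_likeness) * (1 - c)
    return bears / (bears + beans)

for context in CONTEXT_FAVORS_BEARS:
    print(context, [round(p_bears(r, context), 2) for r in R_LIKENESS])
```

In this sketch the probability of a bears response rises with the r-likeness of the letter within each context (bottom-up) and is higher for the zookeeper sentence than for the botanist sentence at every stimulus level (top-down).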

Feature Net

a network of detectors, organized in layers, with each subsequent layer concerned with more complex, larger scaled patterns (features on up). Some detectors are more easily activated than others. At any point in time, each detector has a particular activation level, and when it receives some input this increases its activation level - eventually the activation is high enough to cause the detector to fire, or send its signal (response threshold). Baseline activation level - each detector’s activation level prior to any input (recently fired detectors and frequently fired detectors will have higher baseline activation levels).
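The activation, threshold, and baseline ideas can be sketched in a few lines. The `Detector` class and every number below are hypothetical; the sketch only shows how a higher baseline lets a weaker input reach the response threshold.

```python
class Detector:
    """A single detector with a baseline activation and a response threshold."""

    def __init__(self, name, baseline=0.0, threshold=1.0):
        self.name = name
        self.baseline = baseline      # activation level before any input
        self.threshold = threshold    # level at which the detector fires
        self.activation = baseline

    def receive(self, amount):
        """Add input to the current activation; return True if the detector fires."""
        self.activation += amount
        return self.activation >= self.threshold

# Assumed baselines: a frequently used detector starts closer to threshold.
frequent = Detector("TH", baseline=0.6)
rare = Detector("HT", baseline=0.1)

weak_input = 0.5
print(frequent.name, frequent.receive(weak_input))  # reaches threshold
print(rare.name, rare.receive(weak_input))          # stays below threshold
```

The same weak input fires the well-primed detector but not the rarely used one, which is the sense in which frequency and recency make some patterns easier to recognize.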

Feature net and well-formedness, error trapping, ambiguous inputs, and recognition errors:
Not just any context helps: ‘GLAKE’ versus ‘JPWKS’ - a well-formed context according to natural language rules; the more regular the context, the greater the facilitation; we tend to misread less common patterns more often than common patterns (thorough vs. through vs. though). TPUM is misread as TRUM or DRUM, but DRUM is unlikely to be misread as TRUM; we bring words into line with normal spelling rules and usually do not realize an error has been made (Potter et al., 1993)

The efficiency of recognizing regular letter strings - bigram detectors - detectors for 2 letter combinations - well-formed words involve familiar letter combinations; bigram detectors are affected by frequency and recency.
Error trapping - procedures that detect and correct errors before they cause greater confusion
Ambiguous inputs - a choice is made at the bigram level: C/-\T vs. T/-\E - the same ambiguous character is read as A in C/-\T and as H in T/-\E; AT is a common bigram
Recognition errors - over-regularization - if the input was a frequent word, the network facilitates perception; if the input is an infrequent word, the network is still biased toward the frequent and often will err - CQRN is perceived as CORN; LOCI is seen as LOCK; LORN is viewed as BORN or CORN or TORN. The input is not misidentified; it is misperceived (through and thorough)
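The CQRN example can be sketched as a competition between bigram detectors whose baselines reflect frequency. The baseline values, the evidence values, and the additive scoring rule in `perceived_bigram` are assumptions for illustration only.

```python
# Hypothetical baselines: CO is a frequent bigram, CQ is essentially never seen.
BIGRAM_BASELINE = {"CO": 0.5, "CQ": 0.0}

def perceived_bigram(first, second_evidence):
    """Pick the bigram whose letter evidence plus baseline is highest."""
    scores = {
        first + letter: evidence + BIGRAM_BASELINE.get(first + letter, 0.0)
        for letter, evidence in second_evidence.items()
    }
    return max(scores, key=scores.get)

# Brief or degraded view of "CQ": the second letter is only weakly resolved.
weak_view = {"Q": 0.55, "O": 0.45}
# Clear view of "CQ": the letter evidence is strong.
clear_view = {"Q": 0.95, "O": 0.05}

print(perceived_bigram("C", weak_view))   # bias toward the frequent bigram wins
print(perceived_bigram("C", clear_view))  # strong evidence overrides the bias
```

With weak evidence the frequent bigram wins even though the input favored Q slightly, which corresponds to CQRN being seen as CORN; with clear evidence the unusual bigram is still read correctly. The same bias that causes this error is what speeds recognition of regular strings.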
 
 

Parallel Distributed Processing

The system is better prepared for, or expects, certain input - a passive expectation, built into the activation levels, not explicitly stored knowledge - the knowledge is apparent in how the various elements of the net function together. The knowledge is distributed across the network (how various detectors work together). Activation occurs locally, influenced by the feeding detectors. When they work in conjunction with each other (in parallel), the result is a process that knows the rules.

Skilled readers read less than poorer readers - they fill in the gaps, infer more; make more effective use of what they see on the page

Proofreading - hard to do on something that you wrote yourself - too well primed - make too many over-regularization errors

McClelland & Rumelhart Model (1981)
feature-based network of connections; detectors excite and inhibit
 

All sensory information is processed in parallel (you process the whole word or the whole face); parts of the object are processed along with the whole (you can detect the eyes or a single letter or feature that activates the whole unit or word). The processing of an individual's face influences your processing of her/his eyes (the whole is the context).
 The brain is so slow that it is necessary to process things in parallel or nothing would get accomplished.
 Knowledge is in the connections, not in the units connected. Connections between units can be excitatory or inhibitory. Recognition is the sum of activity in the network.
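A single step of excitatory and inhibitory input to word units can be sketched as follows. The three-word lexicon, the `EXCITE` and `INHIBIT` weights, and the one-pass summation are hypothetical simplifications; the full model also includes feature units, feedback from words to letters, and repeated cycles, none of which are shown here.

```python
# Hypothetical lexicon and connection weights.
WORDS = ["WORK", "WORD", "WEAK"]
EXCITE = 0.3   # letter unit -> word unit that contains it at that position
INHIBIT = 0.2  # letter unit -> word unit that does not

def word_activations(letters):
    """Sum excitatory and inhibitory input to each word unit.

    `letters` holds one detected letter per position, or None if the
    letter at that position was not identified.
    """
    activations = {}
    for word in WORDS:
        net = 0.0
        for position, letter in enumerate(letters):
            if letter is None:
                continue  # no letter unit is active at this position
            net += EXCITE if word[position] == letter else -INHIBIT
        activations[word] = net
    return activations

full = word_activations(["W", "O", "R", "K"])
partial = word_activations(["W", "O", "R", None])

print(max(full, key=full.get), full)
print(partial)
```

Nothing in the sketch stores the fact that WORK is a word; that knowledge exists only as the pattern of excitatory and inhibitory links, and the recognized word is whichever unit ends up with the most summed activity. With the last letter missing, WORK and WORD remain tied, which is where context or feedback would have to break the tie.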
