Semantic Memory - organized knowledge

 

Procedures used to study:

Sentence verification task - measures latency to respond to a sentence ("a canary is a bird"). The idea is that latency reflects organization. This technique has revealed four observations about memory:

1. the category size effect - people respond faster when the item is a member of a small category ("a poodle is a dog" versus "a poodle is an animal")

2. the typicality effect (relatedness) - people are faster to respond to usual or typical members (poodle vs. bloodhound). "Fuzzy sets": not every member of a category is a good representative of that category (whales). Katz (1981): "a barrel is round" took .3s longer to verify than "a ball is round"

3. the context effect (priming effect) - people respond faster to an item preceded by a similar item. McNamara (1984): cities preceded by nearby cities were recognized faster

4. the true-false effect or the fast-true effect - we answer true items faster than false items (.17s)

Models of Semantic Memory

NETWORK MODELS (TLC - Quillian/ Spreading Activation - Collins & Loftus)

Semantic memory is a netlike organization of concepts in memory with many interconnections. Each concept is represented by a NODE or a location in the network and there are LINKS or associations which connect NODES. There are SUPERORDINATE LINKS and MODIFIER LINKS

Superordinate links show that the concept is a member of a larger class. LTM Structure is based on semantic relatedness rather than hierarchies. Semantic relatedness is reflected in the strength of links connecting concepts.

Modifier links show the properties of the concept

How does it work? SPREADING ACTIVATION

SPREADING ACTIVATION - naming activates a node, the activation expands or spreads to other related nodes. Activation spreads first to all of the nodes linked to the original node and then spreads to more remote nodes. When a node has been activated it is tagged along with information about the nature of the activation. Activation will occur from 2 sources and we get an INTERSECTION (Collins & Loftus, 1975). If the activation is strong enough, it will be attended to. Frequently used links have greater strength and therefore activation travels faster between 2 nodes (typicality). The greater the distance between items, the weaker the spread. The greater the number of irrelevant paths (and interfering information), the weaker the spread.
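The spread described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy network, the link strengths, the multiplicative decay, and the threshold are all hypothetical, and the sketch does not model the weakening caused by irrelevant paths.

```python
from collections import deque

# Hypothetical toy network: node -> {neighbor: link strength in (0, 1]}.
# The strengths are illustrative and not taken from any study.
NETWORK = {
    "canary": {"bird": 0.9, "yellow": 0.7},
    "bird": {"canary": 0.9, "animal": 0.8, "wings": 0.8},
    "animal": {"bird": 0.8},
    "yellow": {"canary": 0.7},
    "wings": {"bird": 0.8},
}


def spread(source, network, threshold=0.1):
    """Return {node: activation} for every node reached from source.

    Activation starts at 1.0 and is multiplied by the link strength at
    each step (an assumed decay rule), so it weakens with distance and
    along weak links. Activation below the threshold is ignored.
    """
    activation = {source: 1.0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor, strength in network[node].items():
            incoming = activation[node] * strength
            # Keep only the strongest activation that reaches a node.
            if incoming >= threshold and incoming > activation.get(neighbor, 0.0):
                activation[neighbor] = incoming
                queue.append(neighbor)
    return activation


def intersection(a, b, network):
    """Nodes activated from both sources, with the summed activation."""
    from_a = spread(a, network)
    from_b = spread(b, network)
    return {n: from_a[n] + from_b[n] for n in from_a if n in from_b}
```

In this sketch, activating "canary" leaves "bird" (one link away) more active than "animal" (two links away), and intersection("canary", "animal", NETWORK) contains "bird", the node where the two spreads meet.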

 

Semantic Distance Effect - verification RT increases with the distance between the S (subject) and P (predicate) terms ("A canary is a bird" vs. "An ostrich is an animal"). But how about "An ostrich is an animal" vs. "An ostrich is a bird"? The typicality effect, or relatedness, explains the faster RT to the first sentence.

 

In some cases, relatedness does not have the usual effect of inhibiting RT for false responses (All fruits are vegetables vs. All fruits are flowers).

Evidence: Meyer & Schvaneveldt (1971) - effects of context on sentence verification: processing one word affects (facilitates or inhibits) the processing of another word - positive and negative priming (nurse - doctor). Findings: meaning affected speed, but physical similarity (nurse - purse) did not.

Neely (1977): priming occurred when the prime was the name of a category and the letter string was an instance of that category (automatic spreading activation).

Advantages of this model

1) accounts for semantic relatedness effects; 

2) shows that the meaning of a word is greatly affected by associations between the word and its associated concepts; 

3) assumes that there is more than one way to facilitate verification

Problem: Ratcliff & McKoon (1981) found evidence that the amount of activation arriving at a node is a decreasing function of the number of links the activation has traversed, but did not find evidence that activation takes a significant amount of time to spread between two nodes. Activation was present at the same moment at all activated nodes, so the time required for activation was not a function of distance from the initially activated node (parallel, simultaneous activation).

 

SEMANTIC FEATURE MODEL (Smith, Shoben, & Rips)

A concept is defined as a set of features (like properties or modifiers in the network models). Birds have wings, fly, lay eggs; these are all features of bird. The semantic features of a concept combine to provide its meaning.

A two stage decision process is used to make judgments about concepts:

1) compare all features of the subject and the predicate of the sentence (A canary is a bird)

low similarity > quick false

high similarity > quick true

intermediate similarity > go to stage 2

2) compare only the defining features of the subject and the predicate

2 types of features:

defining features - necessary to the meaning of the item (robin has a red breast)

characteristic features - descriptive but not essential
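The two-stage decision can be sketched as follows. Everything specific here is an assumption made for illustration: the feature lists, the choice of which features count as defining, the use of set overlap (Jaccard similarity) as the stage 1 measure, and the LOW and HIGH cutoffs. None of these values come from Smith, Shoben, & Rips.

```python
# Hypothetical feature sets; which features are "defining" is an
# assumption made only for this illustration.
CONCEPTS = {
    "robin": {"defining": {"living", "feathers", "wings", "red breast"},
              "characteristic": {"flies", "sings", "small"}},
    "penguin": {"defining": {"living", "feathers", "wings"},
                "characteristic": {"swims", "large"}},
    "bat": {"defining": {"living", "fur", "wings"},
            "characteristic": {"flies", "small"}},
    "car": {"defining": {"wheels", "engine"},
            "characteristic": {"fast"}},
    "bird": {"defining": {"living", "feathers", "wings"},
             "characteristic": {"flies", "sings", "small"}},
}

# Assumed cutoffs for the quick false / quick true decisions in stage 1.
LOW, HIGH = 0.2, 0.7


def all_features(name):
    """Defining and characteristic features pooled together."""
    concept = CONCEPTS[name]
    return concept["defining"] | concept["characteristic"]


def similarity(subject, predicate):
    """Stage 1 measure: shared features divided by total distinct features."""
    s, p = all_features(subject), all_features(predicate)
    return len(s & p) / len(s | p)


def verify(subject, predicate):
    """Return (answer, stage) for the sentence 'A <subject> is a <predicate>'."""
    sim = similarity(subject, predicate)
    if sim <= LOW:
        return (False, 1)   # low similarity: quick false
    if sim >= HIGH:
        return (True, 1)    # high similarity: quick true
    # Stage 2: the subject must have every defining feature of the predicate.
    is_member = CONCEPTS[predicate]["defining"] <= CONCEPTS[subject]["defining"]
    return (is_member, 2)
```

With these toy values, "A robin is a bird" and "A car is a bird" are settled in stage 1, while "A penguin is a bird" and "A bat is a bird" fall in the intermediate range and need the slower stage 2 comparison, which is how the model accounts for typicality.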

Rips (1973) asked Ss to indicate how closely related each of several instances was to its category (chicken, sparrow, robin, penguin) and to other instances of the category. They then translated the ratings into distances in semantic space (multidimensional scaling). 

Accounts for fuzzy set, relatedness, semantic distance

Loftus (1973) - importance of noun order:

A) asked Ss to list the categories that instances belonged to; presented categories and asked for instances;
B) measured reaction time in a verification task (A wren is a bird vs. A bird is a wren). According to feature theory, noun order should have no effect, but it did.
Feature theory is also limited in its applicability, e.g. "A man has a car" - how can this be verified in feature theory?

Model explains or accounts for:

1) typicality - high similarity between S & P

Does not account for:

1) category size effect - predicts the opposite

2) context effects - doesn't say how items are related

PROPOSITIONAL THEORIES

A proposition is the smallest unit of knowledge that can stand as a separate assertion. It can be judged as true or false; it is abstract (an idea rather than a set of words or images); it is a combination of concepts; and it is rule governed.

ACT* - J.R. Anderson (Adaptive Control of Thought)

This model attempts to account for all cognition, including memory, language, learning, reasoning, etc. It assumes that the mind is unitary; all higher cognitive processes are different products of the same underlying system. Emphasizes control which provides direction to thought and controls the transitions between thoughts. Distinguishes between declarative and procedural knowledge.
Declarative memory - "knowing that" - contains the factual information or knowledge stored in semantic networks (Propositional networks)
Proposition - a combination of concepts (one or more arguments and a relational term or a subject/predicate link); the smallest unit of knowledge that can stand alone as a separate assertion; corresponds to the meaning of an event; abstract, an idea rather than a set of words or an image; rule governed; has truth value
Consider the sentence: Nixon gave a beautiful Cadillac to Brezhnev, who is the leader of the USSR.
The propositions are: N gave Caddy to B; Caddy was beautiful; B is the leader of the USSR
If any of these simple assertions is false, then the entire sentence is false. Anderson refers to these sentences as PRIMITIVE ASSERTIONS; information is represented in memory in a way that expresses the meaning, not the wording, of these assertions.
The original sentence could be rewritten in several ways and yet the primitive assertions would be unchanged
Kintsch (1974) represents each proposition as a list; the list contains a relation and its arguments: 
relations = verbs, adjectives, or other relational sentence terms (gave, beautiful, leader of )
arguments = nouns, the relations connect the nouns (Caddy, Nix, Brez, USSR)
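A list-style representation of the Nixon sentence might look like the sketch below. The tuple layout (relation first, then its arguments) and the helper names are assumptions for illustration, not Kintsch's own notation.

```python
# Each proposition is a tuple: (relation, argument, argument, ...).
# The layout and the helper names are hypothetical.
PROPOSITIONS = [
    ("GAVE", "Nixon", "Cadillac", "Brezhnev"),
    ("BEAUTIFUL", "Cadillac"),
    ("LEADER-OF", "Brezhnev", "USSR"),
]


def relations(propositions):
    """The relational term of each proposition, in order."""
    return [prop[0] for prop in propositions]


def arguments(propositions):
    """Every distinct argument (noun) that the relations connect."""
    return {arg for prop in propositions for arg in prop[1:]}


def sentence_is_true(propositions, known_facts):
    """A sentence is true only if every one of its propositions is true."""
    return all(prop in known_facts for prop in propositions)
```

Rewording the original sentence would leave PROPOSITIONS unchanged, and removing any one tuple from the known facts makes sentence_is_true return False for the whole sentence.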
Kintsch & Keenan (1973) Ss were given sentences to read that varied in the numbers of propositions but were approximately the same length:
"Romulus, the legendary leader of Rome, took the women of the Sabine by force"
"Cleopatra's downfall lay in her foolish trust in the fickle political figures of the Roman world."
Found: the greater the number of propositions, the longer the reading time.
Implication: propositions, not single words, are the units of comprehension
 
Organization of facts in propositional memory and their retrieval times:
1. if a fact about a concept is frequently encountered, it will be stored with that concept even if it could be inferred from a more distant concept;
2. the more frequently a fact about a concept is encountered, the more strongly that fact will be associated with that concept and the quicker it will be verified (Birds fly vs. birds walk);
3. verifying facts that are not directly stored with a concept but must be inferred takes a relatively long time.
Procedural memory - knowledge of how to do things, makes use of production systems (e.g. IF A is the mother of B and B is the mother of C THEN A is the grandmother of C - a production rule)
A production rule consists of a set of conditions and an action. The conditions are preceded by IF and the action is preceded by THEN
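The grandmother rule can be written as a small condition-action function. This is a minimal sketch, not ACT*'s actual production system: the fact format, the names Ann, Beth, and Cara, and the simple run loop are all hypothetical.

```python
def grandmother_rule(facts):
    """IF A is the mother of B and B is the mother of C
    THEN A is the grandmother of C.

    Facts are (relation, x, y) tuples. Returns only the new facts.
    """
    derived = set()
    for rel1, a, b in facts:
        for rel2, b2, c in facts:
            # Condition side: two "mother" facts chained through B.
            if rel1 == "mother" and rel2 == "mother" and b == b2:
                # Action side: deposit a new declarative fact.
                derived.add(("grandmother", a, c))
    return derived - set(facts)


def run(facts, rules):
    """Apply every rule repeatedly until no rule adds a new fact."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            new_facts = rule(facts)
            if new_facts:
                facts |= new_facts
                changed = True
    return facts


# Hypothetical working-memory contents.
FAMILY = {("mother", "Ann", "Beth"), ("mother", "Beth", "Cara")}
```

Calling run(FAMILY, [grandmother_rule]) adds ("grandmother", "Ann", "Cara") to the two original facts.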
Working memory - contains the information that the system can currently access (information from LT declarative memory and temporary structures deposited by encoding and the action of productions). It is declarative knowledge, permanent or temporary, that's in an active state. This is where the action takes place.
Five assumptions of ACT*:
1. strength - each link in the network has a strength; the strength of a new link is low but is strengthened each time the link is used
2. activation - at any instant a small portion of the nodes in LTM is in an active state - this is working memory; the rate of spread along a path is a function of the strength of that path relative to the sum of the strengths of all paths coming from that same concept. The more information associated with a concept, the longer it takes to retrieve a particular association (e.g. if you know 3 things about A and only one thing about B, then it will take longer to retrieve a piece of information about A) - FAN EFFECT
3. spread of activation - activation spreads out from the active nodes to passive nodes to which it is linked
4. dampening - periodically activation is dampened throughout the network; occurs as a function of time
5. active list - a maximum of 10 nodes can be kept on the active list (working memory)
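The strength and activation assumptions can be illustrated with a short calculation. This is a sketch under stated assumptions: the concepts, the equal starting strengths, and the size of the strengthening increment are hypothetical, and the sketch computes only each path's share of activation, not a retrieval time.

```python
def activation_shares(strengths):
    """Share of a concept's activation sent along each path.

    Each share is the path's strength divided by the summed strength
    of all paths leaving the same concept.
    """
    total = sum(strengths.values())
    return {fact: strength / total for fact, strength in strengths.items()}


def strengthen(strengths, fact, increment=1.0):
    """Return a copy in which one link has been used, so it is stronger."""
    updated = dict(strengths)
    updated[fact] += increment
    return updated


# Hypothetical concepts: three facts known about A, one fact known about B.
CONCEPT_A = {"fact1": 1.0, "fact2": 1.0, "fact3": 1.0}
CONCEPT_B = {"fact1": 1.0}
```

Each fact about A receives one third of A's activation, while the single fact about B receives all of B's, which is the fan effect. After strengthen(CONCEPT_A, "fact1"), the share for that fact rises to one half, reflecting the strength assumption.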

 

Connectionist Models (Parallel Distributed Processing)

(McClelland, Rumelhart, & Hinton, 1986)

 

An attempt to develop a model of higher level processing based on our understanding of neural processing. PDP models assume "that information processing takes place through the interaction of a large number of simple processing elements called units, each sending excitatory and inhibitory signals to other units"
Marr - neural computation - How might the brain achieve higher processing? Start from general knowledge of how neurons work and ask - how could higher level functioning be achieved by connecting together basic elements like neurons?
PDP models set up a network representing knowledge in which each element, or instance unit, is a neuron-like unit linked to others by excitatory connections. Retrieval is the activation of the unit corresponding to the item, which activates all the properties of the item, thus creating a pattern of activation; the greatest activation leads to retrieval.
Inferences are based on similarity
Cognitive processes involve interactions among these units (excitation or inhibition). Many interactions occur simultaneously or in parallel. Knowledge is not stored in specific nodes. Knowledge consists of connections among simple units distributed throughout the network. Knowledge is encoded in connection strengths.
Connectionist schemes can represent information without recourse to symbolic entities like propositions; they represent information sub-symbolically in distributed representations and have the potential to model complex behaviors. They hold out the possibility of theories of cognition that map directly onto the neurological substrate. The concept Rose in a distributed model does not have symbols that represent the rose specifically; rather, the model stores the connection strengths between units that allow either the sight or the scent of the rose to be re-created (accounting for imaging and propositional experiences).
The sight and scent of the rose can be viewed as being coded in terms of simple signals in certain input cells (vision units and olfaction units). The network is capable of associating the pattern of activation which arrives at the vision units with that arriving at the olfaction units. The distributed representation of the sight of the rose is thus represented by the matrix of activation over the units in the network, without recourse to any explicit symbol for representing the rose.
Distributed representations are content-addressable, meaning that any part of a past occurrence or scene can lead to retrieval of the whole memory. Distributed representations also allow automatic generalization, meaning that patterns that are similar will produce similar responses.
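The rose example can be sketched as a tiny pattern associator. This is an illustration under stated assumptions, not the model from McClelland, Rumelhart, & Hinton: the four-unit +1/-1 patterns, the second item (a steak), and the simple Hebbian learning rule are all hypothetical choices.

```python
# Hypothetical activation patterns over 4 vision units and 4 olfaction
# units. Each unit is on (+1) or off (-1); 0 in a cue means "missing".
ROSE_SIGHT, ROSE_SCENT = [1, -1, 1, -1], [1, 1, -1, -1]
STEAK_SIGHT, STEAK_SCENT = [1, 1, -1, -1], [-1, 1, 1, -1]


def train(pairs, n_in, n_out):
    """Hebbian learning: strengthen the connection between an input unit
    and an output unit whenever they take the same value together."""
    weights = [[0] * n_in for _ in range(n_out)]
    for sight, scent in pairs:
        for i in range(n_out):
            for j in range(n_in):
                weights[i][j] += scent[i] * sight[j]
    return weights


def recall(weights, cue):
    """Each output unit sums its weighted input and turns on if the sum
    is positive; otherwise it turns off."""
    return [1 if sum(w * c for w, c in zip(row, cue)) > 0 else -1
            for row in weights]


# Both associations are stored in the same set of connection strengths.
WEIGHTS = train([(ROSE_SIGHT, ROSE_SCENT), (STEAK_SIGHT, STEAK_SCENT)], 4, 4)
```

No unit stands for "rose": both memories live in the same weights. Presenting the full sight of the rose re-creates its scent pattern, and so does the partial cue [1, -1, 1, 0], which illustrates content-addressable retrieval from part of the original pattern.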
