Language Evolution

Language change and evolution

Dr. C. George Boeree

Languages change, usually very slowly, sometimes very rapidly. There are many reasons a language might change. One obvious reason is interaction with other languages. If one tribe of people trades with another, they will pick up specific words and phrases for trade objects, for example. If a small but powerful tribe subdues a larger one, we find that the language of the elite often shows the influence of constant interaction with the majority, while the majority language imports vocabulary and speaking styles from the elite language. Often one or the other simply disappears, leaving behind a profoundly altered "victor." English is, in fact, an example of this: The Norman French of the conquerors has long disappeared, but not before changing Anglo-Saxon into, well, a highly Frenchified English.

If a people are isolated on islands or in mountain valleys, language can change very slowly indeed. But it still changes. For example, in the highlands of Papua New Guinea there are many dozens of languages, each quite different from its neighbors. They are apparently the result of long-term isolation rather than mutual influence. The same has happened in the Caucasus Mountains between Russia and Turkey and Iran, among the American Indians of the Pacific coast, and so on.

The slower mechanisms of change seem to include the "battle" between simplicity and expressiveness. We want our languages to communicate as much information as possible, and yet do so economically. We want our languages rich yet concise. How many prepositions or cases do we need? How many are too many? How many verb forms do we need, and how many strain the brain? How many suffixes, prefixes, and irregularities can children take before they begin to simplify? What combinations of sounds are easily pronounced and easily understood? And so on.

One surprising aspect of language change is the influence of fashion and even of individual idiosyncrasies. Although the story is apocryphal, some say that the th pronunciation of Castilian Spanish was due to courtiers imitating the lisp of a young king! In my own family, we refer to Christmas as Wikis because one of my children couldn't say Christmas. Imagine if we were a part of a tightly knit tribal village: If others thought it was as cute as we did, the word Christmas could morph into Wikis in one generation! That has probably happened millions of times in human history.

Let's look at a real example of one very influential people:

Around 5000 bc, between the Danube river valley and the steppes of what is now Ukraine, there lived small tribes of primitive farmers who all spoke the same language. They cultivated rye and oats, and kept pigs, geese, and cows. They would soon become the first people on earth to tame the local wild horses -- an accomplishment that would make them a significant part of history for thousands of years to come. And their proximity to the culturally more advanced people of Asia Minor (what is now Turkey) would allow them to learn the metal working invented there, beginning with copper.

Beginning around 3000 bc, these people would spread into Europe and the Russian steppes. Around 1500 bc, they would continue into Persia and India, even as far as western China. Later still (in the last 500 years), they would spread to the Americas, Australia, the Pacific islands, and parts of Africa. They would take their language with them, although it would gradually change into hundreds of mutually unintelligible languages, including English, German, French, Spanish, Russian, Persian, Hindi and many more.

By examining the oldest examples of modern and classical languages such as Greek, Latin, and Sanskrit, linguists have been able to arrive at an educated guess as to what the language of these ancient people was like. They call the language Proto-Indo-European. The work that went into reconstructing Proto-Indo-European has led to efforts to reconstruct other prehistoric language ancestors as well.

To show you how these linguists did this, let's start with a simple example: Italian, Spanish, Portuguese, French, and Romanian all come from Latin, of which we still have many records. Words with -ct- in the middle in Latin changed in a systematic way, like this:

Latin	Italian	Spanish	Portuguese	French	English
dicto	detto	dicho	dito	dit	said
lacte	latte	leche	leite	lait	milk
lecto	letto	lecho	leito	lit	bed
nocte	notte	noche	noite	nuit	night

So one "rule" could be that a "difficult" combination of letters like -ct- changes in certain ways to end up "simpler." In Italian it became -tt-, and in Portuguese and French it became -it-; in Spanish, it became ch. Another example: Words that began with pl-, cl-, or fl- in Latin changed in a systematic way as well. In this case the initial consonant combinations "simplified" in different ways in Italian, Spanish, and Portuguese, but remained the same in French. In Italian, the l became an i; in Spanish, the combinations became ll (pronounced like y); and in Portuguese, they became ch (pronounced like sh):

Latin	Italian	Spanish	Portuguese	French	English
pleno	pieno	lleno	cheio	plein	full
clave	chiave	llave	chave	clef	key
flamma	fiamma	llama	chama	flamme	flame
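
The two "rules" above can be checked mechanically against the tables. The following is only a minimal Python sketch, not the linguists' actual procedure: the names `CT_REFLEX`, `INITIAL_REFLEX`, `ct_rule_holds`, and `initial_rule_holds` are made up for illustration, and the test is a simple spelling match (it looks for the expected letters anywhere in the word, or at its start) rather than a real analysis of sounds.

```python
# Cognate sets from the two tables above.
# Column order: Latin, Italian, Spanish, Portuguese, French.
CT_ROWS = [
    ("dicto", "detto", "dicho", "dito", "dit"),
    ("lacte", "latte", "leche", "leite", "lait"),
    ("lecto", "letto", "lecho", "leito", "lit"),
    ("nocte", "notte", "noche", "noite", "nuit"),
]
INITIAL_ROWS = [
    ("pleno", "pieno", "lleno", "cheio", "plein"),
    ("clave", "chiave", "llave", "chave", "clef"),
    ("flamma", "fiamma", "llama", "chama", "flamme"),
]
LANGS = ("Italian", "Spanish", "Portuguese", "French")

# Spelling each language shows where Latin had -ct- (hypothetical lookup table).
CT_REFLEX = {"Italian": "tt", "Spanish": "ch", "Portuguese": "it", "French": "it"}

# Spelling each language shows where Latin had initial pl-, cl-, or fl-.
# Italian "chi" is c + i; the h only keeps the c hard in Italian spelling.
INITIAL_REFLEX = {
    "pl": {"Italian": "pi", "Spanish": "ll", "Portuguese": "ch", "French": "pl"},
    "cl": {"Italian": "chi", "Spanish": "ll", "Portuguese": "ch", "French": "cl"},
    "fl": {"Italian": "fi", "Spanish": "ll", "Portuguese": "ch", "French": "fl"},
}


def ct_rule_holds(row):
    """True if the Latin word has -ct- and each descendant contains the expected letters."""
    latin, *descendants = row
    return "ct" in latin and all(
        CT_REFLEX[lang] in word for lang, word in zip(LANGS, descendants)
    )


def initial_rule_holds(row):
    """True if the Latin word starts with pl-, cl-, or fl- and each descendant starts as expected."""
    latin, *descendants = row
    reflexes = INITIAL_REFLEX.get(latin[:2])
    if reflexes is None:
        return False
    return all(
        word.startswith(reflexes[lang]) for lang, word in zip(LANGS, descendants)
    )


for row in CT_ROWS:
    print(row[0], "-ct- rule holds:", ct_rule_holds(row))
for row in INITIAL_ROWS:
    print(row[0], "initial-cluster rule holds:", initial_rule_holds(row))
```

Every row in both tables passes its own check, which is what "systematic" means here: the same Latin letters turn up as the same letters in each descendant language, word after word.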

The relationships among the Germanic languages are often obvious, and linguists have reconstructed what they call Proto-Germanic:

English	Dutch	German	Danish	Norwegian	Swedish	Icelandic
book	boek	buch	bog	bok	bok	bók
come	komen	kommen	komme	komme	komma	koma
drink	drinken	trinken	drikke	drikke	dricka	drekka

And among the Slavic languages, the relationships are more obvious still, and they have reconstructed a Proto-Slavic:

English	Russian	Belarusian	Ukrainian	Polish	Czech	Slovak	Slovenian	Serbo-Croatian	Macedonian	Bulgarian
mother	mat'	maci	maty	matka	matka	matka	mati	mati	majka	maika
faith	vera	vera	vira	wiara	vira	viera	vera	vjera	vera	vjara
dream	son	son	son	sen	sen	sen	san	san	son	san

Over time, the linguists learned the patterns of change, and have used them to reconstruct languages whose original versions we no longer have any record of -- such as Proto-Indo-European! They are able to use some of the oldest versions of the different branches of the Indo-European languages as a foundation:

English	Sanskrit	Greek	Latin	Old Irish	Gothic	Lithuanian	Old Church Slavic
four	chatur	tettares	quattuor	cethair	fidwor	keturi	chetyre
five	pancha	pente	quinque	coic	fimf	penki	peti
mother	maatra	mater	mater	mathir	modhir	mote	mati
brother	bhrataa	phrater	frater	brathair	brothar	brolis	bratu

These examples are nowhere near as obviously related -- but they are, in fact, related. The words for brother are clearer than the others: You can see that the first sound varies between b, bh (a breathy b), ph (a breathy p), and f. The first vowel varies between a and o. The middle consonant varies between t and th. In all but the last two languages, the words end in some variation of ar or er. Notice that the examples include Sanskrit (ancestor of the languages of northern India), Greek, Old Irish, and Lithuanian! Gothic is the oldest recorded version of the Germanic languages, and Old Church Slavic the oldest of the Slavic languages. There are, in fact, even more relatives, including Albanian, Armenian, the languages of Iran, and many languages which haven't survived.
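
The variation in the first sound of the words for brother can be tabulated with a few lines of Python. This is only an illustrative sketch: the helper name `onset_before_r` is made up, the spellings are the rough transliterations used in this article (with Latin given as frater), and taking "everything before the first r" stands in for a proper analysis of the sounds.

```python
# Words for "brother" in the older Indo-European languages discussed above.
BROTHER = {
    "Sanskrit": "bhrataa",
    "Greek": "phrater",
    "Latin": "frater",
    "Old Irish": "brathair",
    "Gothic": "brothar",
    "Lithuanian": "brolis",
    "Old Church Slavic": "bratu",
}


def onset_before_r(word):
    """Return the letters before the first r, i.e. the word's opening consonant."""
    return word[: word.index("r")]


# Group the languages by the opening consonant of their word.
by_onset = {}
for language, word in BROTHER.items():
    by_onset.setdefault(onset_before_r(word), []).append(language)

for onset, languages in sorted(by_onset.items()):
    print(onset, "->", ", ".join(languages))
```

The groups that come out are b, bh, f, and ph: the same four variants listed above. Correspondences of this kind, repeated across many words, are the raw material for reconstruction.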

By examining the patterns in many languages and many words, linguists have reconstructed the Proto-Indo-European forms of these and many other words:

English	Proto-Indo-European
four	kwetwer
five	penkwe
mother	mater
brother	bhrater

For a few more examples, here are the reconstructed Proto-Indo-European numbers from one to ten:

oino, dwo, trei, kwetwer, penkwe, sweks, sept, oktou, newn, dekm.

Look vaguely familiar?

Linguists have reconstructed other "Proto" languages for other language families. Some, such as the Polynesian languages, are relatively easy, because those languages only diverged about 1000 years ago. Others are nearly impossible, either because of a lack of older written material, or because it isn't even certain that the languages are truly related!

Many linguists believe that it is hard to go back much further than 5000 years, even with a good set of vocabularies to work with. In fact, many suggest that over 10,000 years, the changes that occur are so thorough that no clear connection can be established between two languages that separated that long ago. But saying something is impossible has never stopped us before! Some other linguists have indeed taken the leap and used certain specialized statistical tools to project back to a language that (supposedly) is the ancestor not only of Proto-Indo-European, but of the language groups Afro-Asiatic (e.g. Arabic, Hebrew, and ancient Egyptian), Uralic (e.g. Finnish and Hungarian), Altaic (e.g. Turkish and Mongolian), Dravidian (the languages of southern India), Korean, Japanese, the languages of eastern Siberia, and Eskimo-Aleut!

They call the reconstruction Nostratic (meaning "ours"), and suggest that it may have existed some 15,000 to 20,000 years ago. Some examples of words that may have been a part of Nostratic include küjna (dog), p'at (foot), haku (water), and küni (woman). Perhaps you recognize them from words like canine (and hound!), pedicure (and foot!), aqua (and water!), and gynecologist (and queen!).

To do this, some linguists have used a different set of techniques. Instead of looking at a vast collection of words, they look at a smaller collection of words that have shown a certain stability in languages such as the Indo-European languages. They then look at statistical patterns over a large number of languages. It is techniques like this that have allowed linguists to suggest, for example, that most North and South American Indian languages are part of a language group they call Amerindian -- a grouping the older, more meticulous methods could not establish, and one that many linguists still do not trust.

But wait! Why stop there? Take a look at the reconstructed words from Proto-Amerindian for dog, foot, water, and woman: akuan, pet, haku, and kuna! These look like the Nostratic küjna, p'at, haku, and küni, don't they? Some linguists go way out on a limb and suggest that we can actually reconstruct at least a little bit of what they call Proto-World, presumed to have existed perhaps 100,000 years ago! Kujan is the suggested Proto-World word for dog; pat is foot; haku is water; kuni is woman. It sounds unbelievable, and that's exactly what the majority of linguists think it is: unbelievable. There are way too many opportunities for false similarities to creep in and distort the results!

Nevertheless, it is likely that, "once upon a time," there was indeed only one language, one with a limited vocabulary and simple rules for combining words into sentences. As the need arose, the vocabulary could expand by combining old words or inventing new ones, and the rules could become more and more detailed. At some point, long ago, the vocabulary and the grammar apparently leveled off: All languages today, no matter how "primitive" the people, appear to be equal in their abilities to express the nuances and complexities of human life.