Vocabulary

If you are reading this, chances are you are a habitual reader, meaning you read on average an hour or two a day. As such, I can say with some authority that most of the words you know, you learned through the act of reading. Research has shown that past the 4th grade, the number of words a person knows depends primarily on how much time they spend reading (Hayes & Ahrens, 1988; Nagy & Anderson, 1984; Nagy & Herman, 1987; Stanovich, 1986). In fact, by the time they reach adulthood, people who make a habit of reading have a vocabulary that is about four times the size of that of people who rarely or never read. This disparity starts early and grows throughout life (see M is for Matthew Effect for more on the widening disparity).
According to Beck and McKeown (1991), 5- to 6-year-olds have a working vocabulary of 2,500 to 5,000 words. Whether a child is near the bottom or the top of that range depends upon their literacy skills coming into the first grade (Graves, 1986; White, Graves & Slater, 1990). In other words, by the first grade, the vocabulary of the disadvantaged student is half that of the advantaged student, and over time, that gap widens.

The average student learns about 3,000 words per year in the early school years, or roughly 8 words per day (Baumann & Kameenui, 1991; Beck & McKeown, 1991; Graves, 1986), but vocabulary growth is considerably slower for disadvantaged students than it is for advantaged students (White, Graves & Slater, 1990).

How important is vocabulary size? Imagine how much harder your life would be if you didn’t understand 75% of the words you currently know. How hard would it be to read a passage of text if you didn’t know many of the words in the passage? Imagine if reading the front page of the newspaper was like reading this passage of text:

“While hortenting efrades the populace of the vaderbee class, most experts concur that a scrivant rarely endeavors to decry the ambitions and shifferings of the moulant class. Deciding whether to oxant the blatantly maligned Secting party, most moulants will tolerate the subjugation of staits, savats, or tempets only so long as the scrivant pays tribute to the derivan, either through preem or exaltation.”

Would you read the newspaper if it was all like that? Would you read anything you didn’t have to? Most non-readers have difficulty decoding the individual words, but in addition, even if they can decode them, most non-readers do not understand many of the words in formal text.

Vocabulary development is a lifelong endeavor, but because of the Matthew Effect, over time, some people develop far richer vocabularies than other people. There have been various attempts to measure how many words adults know, and the estimates vary widely.

Part of the reason is that it is not clear what it means to “know” a word. Speaking personally, there are some words I am much more familiar with than others.

Consider these words: WHITE, DOG, and HOME

And compare them to these words: CALLIOPE, FOP, and BRACHIAL

I don’t know about you, but while I am certain that I “know” the first group of words, I would only say that I recognize and have some limited knowledge of the second group of words. Dale and O’Rourke (1986) described four levels of word knowledge, which they characterized with four statements:

1. I never saw the word before
2. I’ve heard of it, but I don’t know what it means
3. I recognize it in context, and I can tell you what it is related to
4. I know the word well

It is hard to say how many words I know well, much less how many words I’m somewhat or vaguely familiar with.

Also, estimating the number of words a person knows depends on what counts as a word. If DRIVE, DRIVER, DRIVES, DRIVEN, and DRIVING all count as separate words, then the estimate would be considerably larger. Carroll, Davies, and Richman (1971) created a database of English words that appear in print by counting the number of occurrences of every string of letters that was separated by a space on each side (they sampled some 5,000,000 words from a variety of published texts). They came up with 86,741 unique “words,” but because a computer did the counting, every unique letter string was counted as a separate word: DRIVE, DRIVER, DRIVES, DRIVEN, and DRIVING were all counted separately. For the same reason, misspelled words were counted (this was 1971, before spell-checking), as were things that we would not recognize as words (e.g., “G787” and “FI–”). Toss out the misspelled and nonsense words, and you are closer to 50,000 unique “words.”
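To make the counting problem concrete, here is a minimal sketch in Python of what "every string of letters separated by a space" amounts to. This is not the procedure Carroll, Davies, and Richman actually used; the sample text and the function name are invented for illustration.

```python
from collections import Counter

# Hypothetical sample text, invented for illustration (not from the 1971 corpus).
SAMPLE = "drive driver drives driven driving drive G787 teh the the"

def count_letter_strings(text: str) -> Counter:
    """Count every whitespace-separated string, ignoring case."""
    return Counter(text.upper().split())

counts = count_letter_strings(SAMPLE)
total_tokens = sum(counts.values())   # every string counted, repeats included
unique_strings = len(counts)          # each distinct string counted once

# Dropping strings that contain non-letters removes "G787", but a misspelling
# like "TEH" survives unless it is checked against a dictionary.
alphabetic_only = {s: n for s, n in counts.items() if s.isalpha()}

print(total_tokens, unique_strings, len(alphabetic_only))
```

With this sample, the 10 strings collapse to 8 unique "words," and filtering out the non-alphabetic string leaves 7. DRIVE, DRIVER, DRIVES, DRIVEN, and DRIVING still count as five separate entries, which is exactly the inflation described above.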

It is reasonable to say, then, that a literate adult knows somewhere around 50,000 words, if you count DRIVE, DRIVER, and DRIVES as separate words. But as I said, literacy and volume of reading are highly correlated with vocabulary size (e.g., Nagy and Anderson, 1984), so an adult who does not read habitually would have a much smaller vocabulary than an adult who reads voluminously. Nagy and Anderson (1984) estimated that an average high school senior knows 45,000 words, but other researchers have estimated that the number is much closer to 17,000 words (D’Anna, Zechmeister, & Hall, 1991) or 5,000 words (Hirsh & Nation, 1992). Surely these dramatically different estimates depend upon the three questions described above, namely, what does it mean to “know” a word, what counts as a “word,” and who counts as “average”?

What is not at all in doubt, however, is this. People who habitually read from a wide variety of texts have much, much richer vocabularies than people who do not read much. And people who have richer vocabularies find it easier to read challenging texts than people who do not have rich vocabularies. That is the conundrum — you need a rich vocabulary to read widely, and the best way to develop a rich vocabulary is to read widely. Thus, vocabulary size is both a cause of and a consequence of reading success.

So where do you start? Ideally, you start at a young age. Children who are given lifelong support for literacy skills tend to build on their successes and flourish. Children who are still struggling in the second or third grade typically continue to struggle throughout their lives unless dramatic intervention is taken.

Ideally, some day, all students will learn to read successfully at a young age, but for now, there are many older struggling readers who do not have very large vocabularies. If you have a student who is not reading well, and who thus does not read habitually, how can you enhance their vocabulary (thus making it easier for them to read habitually)? Is it possible to teach vocabulary directly?

Research studies suggest that it is possible to teach vocabulary out of context, but it is not very efficient. It is possible to teach children between 300 and 500 words a year (8 to 10 words a week) through explicit context-free instruction, but that pales when compared to the 3,000 words a year that literate children learn throughout their school years (Nagy & Anderson, 1984). And Stahl & Fairbanks (1986) found that directly teaching children dictionary definitions for words did not enhance their comprehension of a passage of text containing those vocabulary words. The definition of the word only provides a superficial understanding of the word, and the level of word knowledge necessary to enhance comprehension is deeper than mere definitions.
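The gap between those two rates is easier to see with the arithmetic written out. This is only a back-of-the-envelope sketch in Python using the figures quoted above; the variable names are mine.

```python
# Figures quoted in the text; the variable names are illustrative only.
WORDS_PER_YEAR_FROM_READING = 3000
DIRECT_INSTRUCTION_RANGE = (300, 500)  # words per year taught out of context

# Spread over a calendar year, 3,000 words is a little over 8 words a day.
words_per_day = WORDS_PER_YEAR_FROM_READING / 365

# Direct instruction as a share of what literate children pick up anyway.
shares = [taught / WORDS_PER_YEAR_FROM_READING for taught in DIRECT_INSTRUCTION_RANGE]

print(f"{words_per_day:.1f} words per day")
print(", ".join(f"{share:.0%}" for share in shares))
```

In other words, even at the high end, context-free instruction accounts for about a sixth of the words a literate child learns in a year.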

Unfortunately, research has shown that the best approach to teaching vocabulary is to teach children some strategies for learning the meaning of words in context, and then encourage them to read voluminously and from a wide variety of texts and genres (Kuhn & Stahl, 1998). Teachers can also help students to develop a deeper understanding of words through some direct instruction that involves talking about the definitional and contextual meanings of words, focusing on synonyms and antonyms, providing examples and non-examples, and discussing the subtle nuances that distinguish near-synonyms (e.g., the difference between kill and murder has to do with intent and crime: you can’t murder a pig or a deer, and you can’t accidentally murder a person).

Children also need to encounter words frequently in a variety of contexts in order to internalize them. McKeown, Beck, Omanson, and Pople (1985) found that children did not really know and understand words they had only encountered four times, but they did know and understand words they encountered twelve times. Teachers can be strategic about introducing new vocabulary to students repeatedly, and providing a rich discussion and analysis of the words to enhance understanding.

And finally, I would personally argue that it is important that students play a very active role in vocabulary development. There are many words that I have passively read in context repeatedly, but I became much more familiar with those words when I actively used them in my writing. Every day, I receive an e-mail from the “word a day” service that features a very rare and unusual English word. Every day, I dutifully open the e-mail, and read the word and its definition. But the only words that stick with me are the ones that I actually use in my writing. A few days ago, I happened to get the word “idiopathy” at a time when I happened to be writing an essay about different forms of dyslexia. The word “idiopathy” was perfect for describing certain forms of acquired dyslexia that have no known origin or source. Now I have internalized that word, but there are probably 300 words that have been e-mailed to me by this service in the past year that I have never used again, nor would I understand if I encountered them in context. Typically, to make a word part of my personal lexicon, I need to actively use the word in my writing and speech, and I need to use it on more than one occasion. I would argue that vocabulary instruction should be designed with that in mind.

For more on the subject of vocabulary instruction, I strongly encourage you to read Steven Stahl’s excellent book, “Vocabulary Development.” This is a very short but highly informative book that describes research findings and has suggestions for classroom instruction.

ADDENDUM: September 25, 2005
Vocabulary Development

A few weeks ago, somebody wrote me and asked me how many words a typical adult knows. Alas, I am not a very organized person, and apparently I misplaced that person’s e-mail, but the question is a good opening to discuss vocabulary knowledge.

Before we can say how many words a typical adult knows, we have to define what a word is, and what it means to “know” a word. Oh yeah… and we have to define “typical.”

Defining what a word is can be a bit of a challenge. There are words like “dance,” “dancing,” “dancer,” “dances,” and “danced” that can be counted as five different words, but that seems vaguely inappropriate. If I gave you a new word that you have never heard before, say “prieve,” and told you that it was a verb, you would probably be able to generate other forms of that verb without any difficulty: “prieved,” “prieving,” “prieves,” etc. And you could probably guess that somebody who “prieves” is a “priever.” One could argue that at the heart of it, you only learned one new word, and you already knew the rules for using that word in different forms. If you follow that line of thought, though, you have to deal with irregular verbs: when you learn the word “go” you don’t automatically learn the word “went.” There are also words with common roots that give linguists pause: is “know” the same as “knowledge” or “acknowledge”? What about compound words? If you already know “side” and “walk,” should “sidewalk” get counted separately? What about proper nouns? Should “Sebastian” be counted as a word that I know? All of this just gives me a headache.

Rather than talk about specific words, linguists often talk about “word families,” but there is not 100% consensus as to what a word family is. It is very clear that “dance” and “dancing” belong to the same word family, but it is less clear if “know” and “acknowledge” belong to the same word family. Still, those are fairly unusual cases, and most linguists agree that there are somewhat more than 50,000 but probably fewer than 60,000 word families in the English language.
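For the curious, here is a rough sketch in Python of how one might group words into families by stripping common endings. It is a deliberately naive rule that I made up for illustration, not the method linguists use, and the suffix list and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical suffix list for illustration; real morphology is far messier.
SUFFIXES = ("ing", "ed", "er", "es", "s", "e")

def naive_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def group_into_families(words):
    """Group words that share the same naive stem."""
    families = defaultdict(list)
    for word in words:
        families[naive_stem(word)].append(word)
    return dict(families)

words = ["dance", "dancing", "dancer", "dances", "danced",
         "go", "went", "know", "acknowledge"]
families = group_into_families(words)

for stem, members in families.items():
    print(stem, members)
```

The five forms of “dance” land in one family, as you would hope. But “go” and “went” end up in separate families, and the rule has no opinion on whether “know” and “acknowledge” belong together, which is where the judgment calls (and the lack of consensus) come in.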

Most of those word families are completely unfamiliar to most people. There are thousands of words like “dramaturg” and “odeum” and “iracund” that almost no adult is familiar with. They are indeed real English words, and you will probably find them in your unabridged dictionary, but chances are you have never encountered them in your life before today. There are also thousands of words that almost every speaker of English is very, very familiar with — words like “green” and “today” and “dinner.” There is no question that you know those words — everybody who speaks English knows those words. However, there are also a few thousand words that you are only vaguely familiar with. These words are different for different people, so I will just guess and hope I don’t get victimized by my example. You may have encountered words like “adroit” and “egregious” and “lucent,” and you might even have a vague notion about what they mean, but I’m betting you are not as confident about your knowledge of “lucent” as you are about your knowledge of “shiny.”

In English (as in any language), there are some words that are extremely common, and everybody knows them — “green.” There are other words that are extremely rare, and almost nobody knows them — “guttle.” But then there are these middle words — “egregious.” They are fairly rare, and somewhat nuanced, but some people know them very well, and other people don’t know them well at all. Every individual person has a private collection of rare words that they know well — I, for example, love the word “defenestrate,” and try to use it in conversation whenever I can (as I just did). Your mechanic is probably quite familiar with words like “camber” and “bushing.” Your plumber uses words like “ferrous” and “petcock,” and he or she knows at least two definitions for the word “dope.”

This is why it is so hard to define how many words a “typical” adult knows. There are about 5,000 to 7,000 common word families that almost everybody knows. And there are probably 20,000 word families that almost nobody knows. But there are between 10,000 and 20,000 word families that some people know and other people don’t. How many of these semi-rare words a particular person knows depends on several things. How much does that person read every day? What level of education did that person achieve? What does that person do for a living? What kind of family background does that person have?

Somebody who did not get much of an education and does not make a habit of reading may only be really familiar with 5,000 to 10,000 word families. Somebody who has a college education and reads a fair amount may have a working vocabulary of closer to 20,000 word families. Somebody who reads voraciously and has more of an academic career may be familiar with 25,000 or 30,000 word families.

True story: I was listening to reading and vocabulary expert Anne Cunningham give a talk a few years ago. She was telling the audience that everybody should read more because the vocabulary used in literature is far, far richer than the vocabulary used in conversation or dialog. The vocabulary used on television or in conversation tends to be very limited. I was dutifully taking notes for the first half of her talk, but I realized about halfway through that her presentation was peppered with a very rich vocabulary. (This was a bit ironic given the point she was trying to make.) I found myself writing down the rare words she was using. I did not catch all of them, but in the last 15 minutes of her talk, she quite comfortably used these words: provoke, maneuver, equate, invariably, exposure, dominance, participation, multiple, subgroups, relatively, differentiated, significant, separately, increased, hypothesis, explore, contribution, control, observe, effect, examine, variable, interest, intervene, consequence, aspects, potent, mismatch, correlation, discrepant, contemporaneous, acquisition, analysis, implemented, comprehension, summary, variety, cumulative, phenomenon, divergence, hypothesized, efficacious, cognitive, caveat, displace, prerequisite, encouraging, despair, malleable, partially, ilk, and travesty.

I bet Scrabble night at the Cunningham house is a hoot.

Is Anne Cunningham typical? Clearly not. But it is hard to say exactly what is typical. Most people in this country don’t read very much, so they probably have vocabularies closer to the 10,000 end of the scale, maybe even closer to 5,000. I have seen estimates that a “typical” college graduate is probably familiar with 20,000 word families, but again, even among that population, there is probably a great deal of variability. People who read 3 to 4 hours a day are probably familiar with more than 25,000 word families, but very, very few people actually read 3 to 4 hours a day.

Somebody who dropped out of high school and does not read may only know 5,000 or 6,000 word families.

Somebody who finished high school and is able to read, but doesn’t really make a habit of it may know closer to 10,000 word families.

Somebody who went to college and can read well, and makes a habit of reading popular books and magazines may know 15,000 word families.

A college graduate with a more “white collar” job may have a vocabulary of 20,000 word families — almost 4 times as large as the unfortunate soul who dropped out of high school.

And of course, somebody with an advanced degree and an academic job could be familiar with 25,000 word families or more.

I will leave it to you to decide what is typical.

References:
Baumann, J.F. and Kameenui, E.J. (1991). Research on vocabulary instruction: Ode to Voltaire. In J. Flood, J.J. Lapp, and J.R. Squire (Eds.), Handbook of research on teaching the English language arts (pp. 604-632). New York: MacMillan.

Beck, I.L. and McKeown, M.G. (1991). Social studies texts are hard to understand: Mediating some of the difficulties. Language Arts, 68, 482-490.

Carroll, J.B., Davies, P., and Richman, B. (1971). The American Heritage Word Frequency Book. Boston: Houghton Mifflin.

Dale, E. & O’Rourke, J. (1986). Vocabulary building. Columbus, OH: Zaner-Bloser.

D’Anna, C.A., Zechmeister, E.B., & Hall, J.W. (1991). Toward a meaningful definition of vocabulary size. Journal of Reading Behavior, 23, 109-122.

Graves, M.F. (1986). Vocabulary learning and instruction. In E.Z. Rothkopf (Ed.), Review of Research in Education, 13, 49-89.

Hayes, D.P. and Ahrens, M.G. (1988). Vocabulary simplification for children: A special case of “motherese”? Journal of Child Language, 15(2), 395-410.

Hirsh, D. & Nation, P. (1992). What vocabulary size is needed to read unsimplified texts for pleasure? Reading in a Foreign Language, 8, 689-696.

Kuhn, M.R. & Stahl, S.A. (1998). Teaching children to learn word meanings from context: A synthesis and some questions. Journal of Literacy Research.

McKeown, Beck, Omanson, and Pople (1985)

Nagy, W.E. and Anderson, R.C. (1984). How many words are there in printed English? Reading Research Quarterly, 19, 304-330.

Nagy, W.E. and Herman, P.A. (1987). Breadth and depth of vocabulary knowledge: Implications for acquisition and instruction. In M. McKeown and M. Curtis (Eds.), The Nature of Vocabulary Acquisition, (pp. 19-35). Hillsdale, NJ: Erlbaum Associates.

Nagy, W.E., Herman, P., and Anderson, R. (1985). Learning words from context. Reading Research Quarterly, 20, 233-253.

Stahl, S.A. & Fairbanks, M.M. (1986). The effects of vocabulary instruction: A model-based meta-analysis. Review of Educational Research, 56(1), 72-110.

Stanovich, K.E. (1986). Matthew effects in reading: Some consequences of individual differences in the acquisition of literacy. Reading Research Quarterly, 21, 360-407.

White, T.G., Graves, M.F., and Slater, W.H. (1990). Growth of reading vocabulary in diverse elementary schools: Decoding and word meaning. Journal of Educational Psychology, 82(2), 281-290.