
Resources and Methods

Note: Some of the sites to which links are given here require subscription. Others may require registration.

OED as a ‘keywords’ resource

The OED is the Oxford English Dictionary.

The dictionary was first published, in parts, as A New English Dictionary on Historical Principles, eds. J.A.H.Murray, H.Bradley, W.A.Craigie & C.T.Onions (OUP, Oxford, 1884–1928); re-issued (1933); 2nd edition, eds. John Simpson and Edmund Weiner (1989); 3rd edition in preparation. Histories of the dictionary include the memoir by K.M.Elisabeth Murray, Caught in the Web of Words: James A.H.Murray and the Oxford English Dictionary (Yale University Press, New Haven, 1977) and, more recently, Simon Winchester, The Meaning of Everything: the story of the Oxford English Dictionary (OUP, Oxford, 2003). For a highly detailed account of what each OED entry contains and how to read it, see Donna Lee Berg, A Guide to the Oxford English Dictionary (OUP, Oxford, 1993).

The Raymond Williams Society

Further material about Raymond Williams' life and publications can be found on the website of the society, which was established in 1989 and 'exists to support and develop intellectual and political projects in areas broadly connected with Williams's work'. Among other resources, the site gives information about the journal Key Words, published by Spokesman Books, whose first issue (which contains a detailed analysis of the concept of a keyword by Deborah Cameron) can be downloaded free.

List of relevant corpora

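Frequent reference is made on this site to data provided by Google Books. Such data typically take the form of Ngrams: counts of how often a given word, or sequence of n consecutive words, occurs in the scanned books published in each year.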

A detailed paper explaining the development of the Google corpus and its possible usefulness in researching a range of social science questions is:

Jean-Baptiste Michel, Yuan Kui Shen, Aviva Presser Aiden, Adrian Veres, Matthew K. Gray, William Brockman, The Google Books Team, Joseph P. Pickett, Dale Hoiberg, Dan Clancy, Peter Norvig, Jon Orwant, Steven Pinker, Martin A. Nowak, and Erez Lieberman Aiden, ‘Quantitative Analysis of Culture Using Millions of Digitized Books’, Science, 14 January 2011: 176–182. Published online 16 December 2010 [DOI:10.1126/science.1199644].

In that paper, the authors report how they constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of that corpus, they claim, enables researchers to investigate cultural trends of many different kinds quantitatively. Their paper seeks to show in brief how an approach based on Ngram analysis can provide insights about fields including lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology. They call the resulting field of enquiry “culturomics”, and claim that their approach allows the boundaries of rigorous quantitative inquiry to be extended to a wide array of new phenomena spanning the social sciences and humanities.
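As a rough illustration of what Ngram analysis involves, the following Python sketch counts how often a word or phrase occurs in each year's text, relative to all sequences of the same length. It is only a minimal sketch: the function names and the two-line corpus are invented for illustration, and it does not reproduce the tokenisation or normalisation used by the authors of the paper or by Google Books.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequency(texts_by_year, phrase):
    """Map each year to the share of that year's n-grams that match phrase."""
    target = tuple(phrase.lower().split())
    n = len(target)
    result = {}
    for year, text in sorted(texts_by_year.items()):
        # Naive tokenisation: lower-case and split on whitespace.
        counts = Counter(ngrams(text.lower().split(), n))
        total = sum(counts.values())
        result[year] = counts[target] / total if total else 0.0
    return result


# Invented miniature "corpus": one short text per year (illustration only).
toy_corpus = {
    1900: "the culture of the city and the culture of the country",
    1950: "mass culture and popular culture and high culture",
}

for year, share in relative_frequency(toy_corpus, "culture").items():
    print(year, round(share, 3))
```

Plotting such shares year by year gives the kind of trend line on which the quantitative claims summarised above depend.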

A particularly useful means of accessing linguistic data of this kind is the corpus based on Google Books compiled by Mark Davies at Brigham Young University (BYU). This corpus consists of 155 billion words of US English and 34 billion words of British English.

The BYU corpus is based on the n-grams provided by Google Books. When users search the corpus, they are in fact searching the Google n-grams rather than the Google Books texts themselves (i.e. the sentences, paragraphs, and pages). However, the frequency lists that result from any given search do contain links to Google Books, so that it is possible to see occurrences of the search term in use in relevant texts.

You can click on any word or phrase to see it, in context, in a work as scanned for Google Books. Or you can click on a number in the table provided to see the matching word as used only during the specific decade you are interested in. In some browsers, words or decades that you have already looked at will be highlighted in red.
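A table of frequencies by decade can be derived from yearly counts along the following lines. This is a hedged sketch only: the function name and all the figures are invented, and normalising to occurrences per million words is an assumption made for illustration, not a description of how the BYU interface computes its table.

```python
from collections import defaultdict


def per_million_by_decade(match_counts, total_counts):
    """Group yearly counts into decades and normalise per million words."""
    matches = defaultdict(int)
    totals = defaultdict(int)
    for year, total in total_counts.items():
        decade = year // 10 * 10          # e.g. 1968 -> 1960
        totals[decade] += total
        matches[decade] += match_counts.get(year, 0)
    return {
        decade: matches[decade] * 1_000_000 / totals[decade]
        for decade in sorted(totals)
        if totals[decade]
    }


# Invented figures: occurrences of a search term, and total words, per year.
yearly_matches = {1961: 30, 1968: 50, 1972: 120}
yearly_totals = {1961: 2_000_000, 1968: 2_000_000, 1972: 3_000_000}

for decade, rate in per_million_by_decade(yearly_matches, yearly_totals).items():
    print(f"{decade}s: {rate:.1f} per million words")
```

Normalising in this way matters because the amount of text differs greatly from decade to decade, so raw counts alone would be misleading.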

Concordancing freeware

An effective freeware concordance program for Windows, Macintosh OS X, and Linux is AntConc (homepage includes previous versions, tutorials, and help). AntConc was developed by Laurence Anthony, Professor and Director of CELESE, Center for English Language Education, Faculty of Science and Engineering, Waseda University, Japan.
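For readers unfamiliar with concordancing, the core operation is a keyword-in-context (KWIC) listing: every occurrence of a search word is shown with a few words of context on either side. The Python sketch below illustrates the idea only. It is not AntConc's implementation; the function name, the punctuation handling, and the sample sentence are all invented for illustration.

```python
def kwic(text, keyword, width=3):
    """Return (left context, match, right context) for each hit of keyword."""
    tokens = text.split()
    key = keyword.lower()
    rows = []
    for i, token in enumerate(tokens):
        # Compare case-insensitively, ignoring punctuation attached to the token.
        if token.lower().strip(".,;:!?\"'()") == key:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows


# Invented sample text (illustration only).
sample = ("A keyword is a word that matters. Each keyword has a history, "
          "and every keyword changes.")

for left, match, right in kwic(sample, "keyword", width=2):
    # Right-align the left context so the matches line up in a column.
    print(f"{left:>15}  {match}  {right}")
```

Aligning the matches in a single column, as here, is what makes recurring patterns of use around a keyword easy to spot by eye.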

Historical Thesaurus of the OED

The Historical Thesaurus of the OED is an additional tool which forms part of the online OED. It charts the semantic development of the vocabulary of English, and is claimed to be the first comprehensive historical thesaurus produced for any language. The Thesaurus contains almost every word in English from Old English to the present; it lists 800,000 words and meanings, in 235,000 entry categories, and offers in effect a complete sense inventory for English based on the second edition of the OED. You can use the Historical Thesaurus to find synonyms for individual words in the OED (then trace their development over time). Or you might chart the linguistic progress of a chosen object, concept, or expression, with links to the OED definition of each new word used for it.

Keywords word-lists

For a grid listing headwords in the first edition of Raymond Williams’ Keywords (1976), as well as additional headwords included in the second edition (1983), and in Tony Bennett et al.’s New Keywords (Blackwell, 2005), view list.

For a list of proposed keywords to be investigated, as drawn up by the Keywords Project team between 2007 and 2009, view list.

For a list of proposed keywords put forward by participants at a presentation on keywords at the Modern Language Association (MLA) conference in Los Angeles, January 2011, view list.