The WWW as a Resource for Lexicography

Abstract : Until the appearance of the Brown Corpus with its 1 million words in the 1960s and then, on a larger scale, the British National Corpus (the BNC) with its 100 million words, the lexicographer had to rely pretty much on his or her intuition (and amassed scraps of papers) to describe how words were used. Since the task of a lexicographer was to summarize the senses and usages of a word, that person was called upon to be very well read, with a good memory, and a great sensitivity to nuance. These qualities are still and always will be needed when one must condense the description of a great variety of phenomena into a fixed amount of space. But what if this last constraint, a fixed amount of space, disappears? One can then imagine fuller descriptions of how words are used. Taking this imaginative step, the FrameNet project has begun collecting new, fuller descriptions into a new type of lexicographical resource in which '[e]ach entry will in principle provide an exhaustive account of the semantic and syntactic combinatorial properties of one "lexical unit" (i.e., one word in one of its uses)' (Fillmore & Atkins 1998). This ambition to provide an exhaustive accounting of these properties implies access to a large number of examples of words in use. Though the Brown Corpus and the British National Corpus can provide a certain number of these, the World Wide Web (WWW) presents a vastly larger collection of examples of language use. The WWW is a new resource for lexicographers in their task of describing word patterns and their meanings. In this chapter, we look at the WWW as a corpus, and see how this will change how lexicographers model word meaning.
Document type :
Book sections
Contributor : Gregory Grefenstette
Submitted on : Friday, November 7, 2014 - 9:59:55 AM
Last modification on : Friday, February 4, 2022 - 3:12:21 AM
Long-term archiving on : Sunday, February 8, 2015 - 10:15:36 AM


  • HAL Id : hal-01081131, version 1



Gregory Grefenstette. The WWW as a Resource for Lexicography. Marie-Hélène Corréard. Lexicography and Natural Language Processing: A Festschrift in Honour of B.T.S. Atkins, Euralex, 2002, 2-9518583-0-2. ⟨hal-01081131⟩


