Corpora: English corpus

Jeff Allen (jeffa@odessa.mt.cs.cmu.edu)
Fri, 17 Jul 98 12:34:43 EDT

Marie Stevenson writes:
> I work at the faculty of Pedagogical Science at the University of
> Amsterdam in the Netherlands. I am developing a receptive vocabulary test
> for English. To help me do this, I am looking for a list of the 5000 most
> frequent words in the English language.
> Is there anybody on this planet who can help me with this quest?

Beyond the 10 most frequent words in the language, the ranking
depends on your domain. If you can supply us with texts, or at least
tell us the exact domain you are working in, we can get you a
frequency list in about an hour. We build frequency lists in our
research on several languages, so it would be easy to adapt our
algorithm to your texts or to texts collected in your domain of
research.
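
To give an idea of what such a count involves, here is a minimal
sketch in Python. It is only an illustration, not the algorithm used
in our research: the function name frequency_list, the tokenisation
rule (lowercased runs of letters, with internal apostrophes kept) and
the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter


def frequency_list(text, top_n=5000):
    """Return up to top_n (word, count) pairs, most frequent first.

    Hypothetical helper for illustration only.
    """
    # Assumed tokenisation: lowercase the text, then keep runs of
    # ASCII letters, allowing internal apostrophes (e.g. "don't").
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    # Counter.most_common sorts by descending count.
    return Counter(words).most_common(top_n)


# Made-up sample text; in practice this would be the domain corpus.
sample = "The cat sat on the mat. The dog sat too."
for word, count in frequency_list(sample, top_n=3):
    print(word, count)
```

Run over texts from the target domain instead of the sample sentence,
the same function would give a domain-specific ranking; a real
vocabulary test would probably also need lemmatisation, which this
sketch leaves out.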

Regards,

Jeff Allen

oooooooooooooooooooooooooooooooooooo
Jeff ALLEN - Research Linguist
DIPLOMAT Project
Language Technologies Institute &
Center for Machine Translation
CARNEGIE MELLON UNIVERSITY
Pittsburgh, Pennsylvania 15213-3890
Tel: (+1) 412-268-6593
Fax: (+1) 412-268-6298
E-mail: jeffa@cs.cmu.edu
oooooooooooooooooooooooooooooooooooo