I am forwarding Bob's e-mail:
From: Bob Krovetz <krovetz@research.nj.nec.com>
To: mar0074@ibm.net
You could address your message to the corpora list: corpora@lists.uib.no
It is a difficult question to answer though - how do you count? What do
you do about predictable morphological variants - does that count as a
new word? What about a word that has an embedded space (such as "white
house" or "operating system")? Do they count as one word or two? What
about proper nouns? If you don't want to count them, how are you going to
exclude them? I think you can see some of the problems in answering your
question.
Spanish clearly has a richer morphological system than English, and the
vocabulary would therefore have a greater proportion of morphological
variants. But I *think* English borrows words from other languages more
freely than does Spanish.
Bob
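To make Bob's counting problem concrete, here is a rough Python sketch showing how the "vocabulary size" of one short text changes depending on what you decide a word is. Everything in it is an assumption made up for illustration: the sample text, the list of multiword expressions, and the toy rule that strips a final "s" (a real morphological analyser would be far more involved, especially for Spanish). It does not attempt the proper-noun question at all.

```python
import re

# Hypothetical sample text, chosen only for illustration.
TEXT = ("The operating system runs. "
        "Operating systems run on the White House computers.")

# Hypothetical list of expressions to treat as one word despite the space.
MULTIWORDS = {("operating", "system"), ("white", "house")}


def tokens(text):
    """Split text into alphabetic tokens, dropping punctuation."""
    return re.findall(r"[A-Za-z]+", text)


def crude_stem(word):
    """Toy rule: drop a final 's' from words longer than three letters."""
    if len(word) > 3 and word.endswith("s"):
        return word[:-1]
    return word


def merge_multiwords(words):
    """Join adjacent pairs that appear in MULTIWORDS into a single item."""
    merged = []
    i = 0
    while i < len(words):
        if i + 1 < len(words) and (words[i], words[i + 1]) in MULTIWORDS:
            merged.append(words[i] + " " + words[i + 1])
            i += 2
        else:
            merged.append(words[i])
            i += 1
    return merged


def vocabulary_size(text, fold_variants=False, join_multiwords=False):
    """Count distinct words under the chosen counting policy."""
    words = [w.lower() for w in tokens(text)]
    if fold_variants:
        words = [crude_stem(w) for w in words]
    if join_multiwords:
        words = merge_multiwords(words)
    return len(set(words))


if __name__ == "__main__":
    print(vocabulary_size(TEXT))                                # 10
    print(vocabulary_size(TEXT, fold_variants=True))            # 8
    print(vocabulary_size(TEXT, join_multiwords=True))          # 9
    print(vocabulary_size(TEXT, fold_variants=True,
                          join_multiwords=True))                # 6
```

The same twelve-token text gives four different answers, from 6 to 10, which is the difficulty Bob describes when comparing the vocabularies of two languages.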