Re: Corpora: Latinate words in corpora

Peter Littlechild (peter.littlechild@swift.com)
Wed, 20 Oct 1999 12:30:16 +0200

You could just work down the published frequency lists for something like the
Lancaster-Oslo/Bergen (LOB) corpus and mark which words are of Latin origin. I
wouldn't advise doing that for the whole list, but it shouldn't be too much work
to get a top ten that way.
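
If it helps, here is a rough sketch of that procedure in Python. It assumes
you have already typed in the frequency list as (word, count) pairs and have
marked the Latinate words by hand after checking a dictionary. The function
name, the toy counts and the small word set are all made up for illustration;
none of the figures come from the LOB lists.

```python
def top_latinate(freq_list, latinate, n=10):
    """Walk down a frequency-ranked list and keep the first n words
    that appear in a hand-checked set of Latinate words."""
    # Sort by count, highest first, in case the list is not already ranked.
    ranked = sorted(freq_list, key=lambda pair: pair[1], reverse=True)
    hits = []
    for word, count in ranked:
        if word.lower() in latinate:
            hits.append((word, count))
            if len(hits) == n:
                break  # stop as soon as we have enough; no need to scan the rest
    return hits


# Toy data: the counts are invented placeholders, NOT real corpus figures.
toy_freq = [
    ("the", 1000), ("of", 600), ("very", 50),
    ("people", 40), ("house", 30), ("use", 20),
]
# Hypothetical hand-made set: words someone has looked up and marked as Latinate.
toy_latinate = {"very", "people", "use"}

result = top_latinate(toy_freq, toy_latinate, n=2)
print(result)
```

The early break is the point: you only have to check etymologies until the
tenth hit turns up, not for the whole list.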

Peter Littlechild
S.W.I.F.T. sc

Chris Allen wrote:

> I wondered if anyone on this list could help me with an enquiry.
>
> A student of mine is interested in obtaining frequency information for
> Latin words using a corpus. In particular, she would like to come up with a
> top 10 list of the most frequent Latinate words in English.
>
> Does anyone know of a corpus which is in some way 'tagged' according to
> etymological origin? The only thing I can remotely think of would be the
> dictionary database of a historically-orientated dictionary such as the OED
> which might be able to supply such etymological information.
>
> Thanks for your help,
>
> Chris Allen
> University of Halmstad
> Jarnvagsgatan 8b
> 302 49 Halmstad
> Sweden
> Tel: +46 35 1012 96 (home)
> +46 35 167372 (work)