Corpora: Looking for Afrikaans, Swedish, Romanian and Icelandic corpora

Larry Spitz
Wed, 27 Aug 1997 08:54:15 -0700

I am in need of corpora in Afrikaans, Swedish, Romanian and Icelandic. I do not need much data: approximately 100,000 characters of running text would be more than adequate. I could get by with as little as 20,000 characters, if that is all that is readily available.

I have the ECI CD-ROM, but of these languages it covers only Swedish, and that material is not really running text.

Can anyone point me to these corpora?



Document Recognition Technologies, Inc.
459 Hamilton Avenue, Suite 204, Palo Alto, CA 94301 USA
phone: +1-650-688-0842 fax: +1-650-688-0841