Corpora: Re: Polish corpus

Martin Wynne
Mon, 13 Dec 1999 17:59:51 +0000 (WET)

There is a 2 million word Polish corpus, comprising texts taken from the
Gazeta Wyborcza newspaper, and marked up with SGML according to the Parole
guidelines for corpus encoding (actually constructed by me when I was
working at Lodz University, and a member of the PELCRA corpus research
team there).

This corpus has now been deposited with the TRACTOR archive, which will be
launched in the very near future, and will live at (and is
now being managed by me wearing my new hat as manager of TRACTOR). There
are also resources available in many other languages, especially those of
Central and Eastern Europe. A small administrative fee of 50 Euros is
charged for membership of the User Community, giving the user access to
all of the resources and to the helpdesk. Members of the research
community who deposit resources don't have to pay the fee, nor do members
of the TELRI II project (see

If anyone is interested, please feel free to take a look at the website,
but bear in mind that it is not finished yet, and you may find it more
profitable to await the official announcement of its launch in the next
few weeks.

Best wishes,
Martin Wynne

Martin Wynne            Multilinguale Forschung
TRACTOR Coordinator     Abteilung LEXIK    Institut für Deutsche Sprache
Tel: +49 621 1581 427   R5, 6-13
Fax: +49 621 1581 415   D-68161 Mannheim
+49 621 1581 200