Odp: Corpora: Seeking Polish Corpus / Szukam Polski(ego?) Narodowy Korpus

Tadeusz Piotrowski (tadpiotr@ii.uni.wroc.pl)
Mon, 13 Dec 1999 14:01:17 -0000

I am afraid there is no Polish National Corpus. The largest corpus I know of,
which aspires to become one, is the one at Lodz University (get in touch
with Professor Barbara Lewandowska-Tomaszczyk, blt@krysia.uni.lodz.pl).
The Poznan people, who wrote the Collins dictionary, if they used a corpus,
which is not certain, probably had an opportunistic corpus, as it is
euphemistically called (i.e. whichever texts they could lay their hands on),
and other people usually have the same. There are some commercial corpora, not
yet very well developed, such as the one at PWN, though I doubt very much whether
they will let you use it for free, if at all. The safest way for you, I
suppose, is to download some texts from the Internet. The samples will not
be balanced at all, but for teaching that is of no consequence. For general
portals see www.onet.pl or www.wp.pl.
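
Once you have saved some texts, a word frequency list is an easy way to see
which everyday words are worth learning first. What follows is only a minimal
Python sketch, not a finished tool: the function name word_frequencies and the
sample sentence are my own illustrations, and it counts inflected word forms
as they appear, with no lemmatisation.

```python
import re
from collections import Counter

# Basic Latin letters plus the nine Polish letters with diacritics.
# Digits and punctuation are deliberately left out.
WORD_RE = re.compile(r"[a-ząćęłńóśźż]+")


def word_frequencies(text):
    """Count lowercased word forms in a string of Polish text."""
    return Counter(WORD_RE.findall(text.lower()))


# Illustrative sample only; in practice, read the text from a saved file.
sample = "Dzień dobry. To jest dobry dzień, a to jest źle."
freq = word_frequencies(sample)

# Print the most frequent word forms with their counts.
for word, count in freq.most_common(3):
    print(word, count)
```

For a real text you would read the file with the right encoding, e.g.
open(path, encoding="utf-8"), and pass its contents to the same function.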
Microconcord (was it ever a shareware product? as far as I know it was sold
by OUP and now it is given away for free by the authors) is OK if you are
not troubled by the inadequate sorting of Polish characters. A fairly cheap
program that handles Polish remarkably well is WordSmith, available from
OUP, or directly from Mike Scott. A free product which can be just fine for
you, because it sorts Polish well, but handles only shorter texts, is
Concordancer for Windows. It can be downloaded from the Internet, though I
do not have the URL just now.
The authors are:
martinek@top.cz
(Zdenek Martinek, Masarykova 4, 312 19 Plzen, Czech Republic).
siegrist@hrz1.hrz.th-darmstadt.de
(Prof. Dr. Leslie Siegrist, TH Darmstadt, FB2, Institut für Sprach- und
Literaturwissenschaft, Hochschulstr. 1, 64289 Darmstadt, Germany)
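
To show what I mean by inadequate sorting of Polish characters: a program that
sorts by raw character codes puts the letters with diacritics after z, whereas
Polish alphabetical order places each of them right after its base letter. The
Python sketch below is my own illustration, not how any of the programs above
work; the names POLISH_ORDER and polish_key are made up for the example.

```python
# Polish alphabetical order, with q, v and x added for loanwords.
POLISH_ORDER = "aąbcćdeęfghijklłmnńoópqrsśtuvwxyzźż"
RANK = {ch: i for i, ch in enumerate(POLISH_ORDER)}


def polish_key(word):
    """Sort key that ranks each character by its place in the Polish alphabet.

    Characters outside the alphabet sort after all letters, in code point order.
    """
    return [(RANK.get(ch, len(RANK)), ch) for ch in word.lower()]


words = ["żaba", "zebra", "łódź", "lato", "ćma", "cud", "ząb"]

naive = sorted(words)                   # raw code point order
polish = sorted(words, key=polish_key)  # Polish alphabetical order

print(naive)   # ['cud', 'lato', 'zebra', 'ząb', 'ćma', 'łódź', 'żaba']
print(polish)  # ['cud', 'ćma', 'lato', 'łódź', 'ząb', 'zebra', 'żaba']
```

In a concordance the same key can be applied to the keyword or to the word to
its right, so that the lines come out in the order a Polish reader expects.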
Hope that is of some use.
Best
Tadeusz Piotrowski
***************************************************************
mailing address
Department of English
Opole University Zielinskiego 47/11
Oleska 48 PL-53-533 Wroclaw
Opole
POLAND
phone/fax (+48)71-3382664

----- Original Message -----
From: F <fdavidson@imaris.demon.co.uk>
To: <corpora@hd.uib.no>
Sent: Monday, December 13, 1999 1:49 AM
Subject: Corpora: Seeking Polish Corpus / Szukam Polski(ego?) Narodowy
Korpus

> Dear all,
> I have recently subscribed to this list. I am not involved
> professionally in linguistics.
>
> I am looking for current, or fairly recent Corpora of the Polish
> language to assist in learning, by focusing my learning effort in
> everyday language rather than what lecturers think everyday language
> should be. A subset of the Polish National Corpus would do as I don't
> expect my Polish vocabulary to exceed a few thousand words in the near
> future. Also are there any (freeware or shareware) concordancers and
> taggers I can get on the net to analyse Polish text?
>
> The Collins Polish/English/Polish Dictionary I am using says it uses the
> 'Bank of English' Corpus as the basis for word inclusion. It doesn't say
> which Polish one it uses.
> I have a shareware concordancer (Wordsmith) but I don't know if it works
> yet with Polish.
>
> Any help would be appreciated.
>
> --
> Frank Davidson
>
>