OCP & lemmatisation?

Katja Lenz (Katja.Lenz@Uni-Koeln.DE)
Tue, 16 Jul 1996 17:19:50 +0200 (MST)

Dear `Corpora'-subscribers,

I'm a postgrad student - not too computer literate (so try and keep it
simple, please) but at least interested. I have a problem using Micro-OCP
and I would like to know whether there's maybe a newer version of it available
(I think mine is from 1988) or if you can help me with my problem (or
know someone who might...).

For my PhD thesis I'm currently analysing Scottish drama texts and trying
to find out about their uses of Scots dialect(s) (forms and functions).
I would like to be able to make meaningful statements about the density of
Scots vocabulary in the individual texts, but I can only do so (I think)
if I can lemmatise the texts, so the concordance will not count spelling
variations between Scots and English as individual words (same goes for
inflexional forms which I would like to group under one headword rather
than having them count as separate types).
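
To illustrate the kind of grouping I mean, here is a minimal sketch in
Python (this is not OCP command syntax; the headword table, the function
names and the sample sentence are all invented for illustration). It
maps each variant form to a headword and compares the number of types
before and after:

```python
from collections import Counter

# Hypothetical headword table: variant form -> headword.
# A real table would be built by hand from the texts themselves.
HEADWORDS = {
    "hame": "home",
    "hames": "home",
    "homes": "home",
    "ken": "know",
    "kens": "know",
    "kent": "know",
    "knows": "know",
}


def lemmatise(tokens, headwords=HEADWORDS):
    """Replace each token by its headword; unknown forms stay as they are."""
    return [headwords.get(t.lower(), t.lower()) for t in tokens]


def type_counts(tokens):
    """Count how often each distinct form (type) occurs."""
    return Counter(tokens)


# Invented sample sentence, split on whitespace only.
sample = "I ken he kens the road hame and she knows the road home".split()

raw = type_counts(t.lower() for t in sample)
lemmas = type_counts(lemmatise(sample))

print("types before lemmatising:", len(raw))
print("types after lemmatising: ", len(lemmas))
print("occurrences of 'know':   ", lemmas["know"])
```

In the sketch the variants of one word are counted under a single
headword instead of as separate types, which is the effect I am after
in the concordance.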

My version of OCP allows only a limited number of such headwords to be
defined in the commands file. The manual says: up to three hundred. For
the text I'm trying to analyse just now I have about four hundred - and
there are bound to be more in some of the other texts... What can I do? I
would like to stick with OCP (not least because I'm lazy and wouldn't
like to have to change software and text format again).

I have checked the WWW and found LEXA. But that's too expensive for me,
and I'm also not quite sure whether it can do both (lemmatisation and
concordancing). I certainly don't want to deal with two enemies at the
same time...

Can you help at all? Thank you for a reply,

Best wishes,

Katja Lenz