Re: Corpora: lemma vs lexeme

Christer Samuelsson (Christer.Samuelsson@xrce.xerox.com)
Wed, 10 Nov 1999 15:48:08 +0100 (MET)

> Date: Fri, 05 Nov 1999 08:25:52 -0500
> From: "Kenneth W. Church" <kwc@research.att.com>

> Tony is of course quite right. There was a lot of activity on part of speech
> tagging in the late 80's and several taggers from that period (including my
> own) are still being used quite a bit. A number of additional taggers have
> become available since that time. It is hard to say that the newer taggers
> are better than what came before. Evaluation is quite tricky. None of these
> taggers work as well as we would like, but the standard evaluation methods
> are hard pressed to say that one tagger is much better than another. Some
> taggers claim to be more accurate than others and some don't. The jury is
> still out on the accuracy question.

We performed a pretty solid experimental comparison between
an HMM-based tagger and the English Constraint Grammar Parser
of Helsinki (EngCG-2), see

http://www.ling.helsinki.fi/~avoutila/cg/index.html

on a common disambiguation task. The latter outperformed the
former by a wide margin. The experiments are written up in
detail in Samuelsson & Voutilainen, (E)ACL'97, pages 246-253.
This article can also be obtained from the above web site.

Christer Samuelsson
Atro Voutilainen