Re: Corpora: Summary of POS tagger evaluation

Philip Resnik
Tue, 9 Feb 1999 11:15:15 -0500 (EST)

> This raises an issue which is slightly more complex: if you exclude
> punctuation (presumably on the grounds that a comma is always tagged
> as `comma' and there is no ambiguity), why include other unambiguous
> tokens in the scoring? If `the' always gets assigned `DET', and no
> other tags for it are possible, then why count it and not the comma?
> ...
> A tagger's performance can only be measured sensibly if some indicator
> of the complexity of the tagging task is given. [shameless plug follows:]
> Dan Tufis & myself have proposed an evaluation metric based on the
> average number of tags per token in a paper at the LREC conference last
> year. Here each percentage reporting the tagging accuracy would be
> augmented by a factor indicating the difficulty of the task. 90% on a
> highly ambiguous text might then show a better performance than 96% on
> a simple text with few ambiguous tokens.
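
As a rough illustration of the idea quoted above, here is a minimal sketch that reports tagging accuracy next to a simple difficulty indicator, the average number of possible tags per token. It is not the metric from the LREC paper: how that paper combines the two numbers is not described here, so the sketch only reports them side by side. The function name `tagging_report`, the `lexicon` mapping, and the rule for tokens missing from the lexicon are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple


def tagging_report(
    tokens: List[str],
    gold: List[str],
    predicted: List[str],
    lexicon: Dict[str, Set[str]],
) -> Tuple[float, float, float]:
    """Return (accuracy, average tags per token, accuracy on ambiguous tokens)."""
    if not (len(tokens) == len(gold) == len(predicted)):
        raise ValueError("tokens, gold and predicted must have equal length")
    if not tokens:
        raise ValueError("empty input")

    correct = 0
    tag_options = 0
    ambiguous = 0
    ambiguous_correct = 0
    for tok, g, p in zip(tokens, gold, predicted):
        # Assumption: a token missing from the lexicon has exactly one
        # possible tag (its gold tag), i.e. it is treated as unambiguous.
        n_tags = len(lexicon.get(tok, {g}))
        tag_options += n_tags
        hit = g == p
        correct += hit
        if n_tags > 1:
            ambiguous += 1
            ambiguous_correct += hit

    accuracy = correct / len(tokens)
    avg_tags = tag_options / len(tokens)
    # With no ambiguous tokens the restricted accuracy is undefined.
    amb_acc = ambiguous_correct / ambiguous if ambiguous else float("nan")
    return accuracy, avg_tags, amb_acc


# Toy data, invented for illustration only.
tokens = ["the", "can", "rusts", ","]
lexicon = {
    "the": {"DET"},
    "can": {"NOUN", "VERB", "AUX"},
    "rusts": {"NOUN", "VERB"},
    ",": {"COMMA"},
}
gold = ["DET", "NOUN", "VERB", "COMMA"]
predicted = ["DET", "AUX", "VERB", "COMMA"]

accuracy, avg_tags, amb_acc = tagging_report(tokens, gold, predicted, lexicon)
print(accuracy, avg_tags, amb_acc)
```

On this toy input the overall accuracy is 0.75, but half of the tokens are unambiguous and contribute nothing to the difficulty; restricted to the two ambiguous tokens the accuracy is 0.5. Reporting the average number of tags per token (1.75 here) alongside the headline figure makes that difference visible.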

This seems like a sensible idea. Not to let one shameless plug go by
without a second in kind, :-) this is an issue that comes up even more
forcefully in evaluating word sense tagging, in comparison to which
the POS-tagging community seems virtually standardized. David Yarowsky
and I made a proposal addressing this and other evaluation issues that
was adopted, with modification and elaboration, in last fall's
SENSEVAL exercise, organized by Martha Palmer, Adam Kilgarriff, and
Joseph Rosenzweig (soon to be covered in a special issue of _Computers
and the Humanities_). Some of the points we made referred
specifically to word sense disambiguation, but others, in particular a
perplexity-like evaluation metric, would seem to apply equally well
here. The original paper (in the 1997 ANLP SIGLEX workshop) is

Philip Resnik and David Yarowsky, "A perspective on word sense
disambiguation methods and their evaluation", position paper
presented at the SIGLEX Workshop on Tagging Text with Lexical
Semantics: Why, What, and How?, held April 4-5, 1997 in Washington,
D.C., USA in conjunction with ANLP-97.

and the SENSEVAL page is

Philip Resnik, Assistant Professor
Department of Linguistics and Institute for Advanced Computer Studies

1401 Marie Mount Hall         UMIACS phone:      (301) 405-6760
University of Maryland        Linguistics phone: (301) 405-8903
College Park, MD 20742 USA    Fax:               (301) 405-7104    E-mail: