Parsers/taggers -- free and good?

Ray Liere (lierer@mail.CS.ORST.EDU)
Wed, 26 Jun 1996 11:43:29 -0700

I am interested in your opinions and *especially* your experiences
using any parser/tagger for English that has these characteristics:
- handles free text (by which I mean not in any way cleaned up -- such
as from newswires, technical reports, manuals, etc.)
- free
- source code is available, preferably in C or C++. I need source so that
I can port it to Linux (a version of UNIX that runs on PCs).
- reasonably accurate and easy to use

The ability to handle (ignore?) structure tags, such as title, body of
document, etc., is not a big deal, as I think that they would be easy
to strip out in a preprocessing step.
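
A minimal sketch of such a preprocessing step follows. It assumes the
structure tags are SGML-style markers such as <TITLE> ... </TITLE>; the
tag syntax, the pattern, and the function name are illustrative
assumptions, not taken from any particular corpus or tool.

```python
import re

# Assumption: structure tags are SGML-style, e.g. <TITLE>, </BODY>, <DOC id=3>.
# A "<" that is not followed by a letter (as in "a < b") is left untouched.
TAG_PATTERN = re.compile(r"</?[A-Za-z][^<>]*>")


def strip_structure_tags(text):
    """Remove SGML-style tags and keep the text between them."""
    without_tags = TAG_PATTERN.sub(" ", text)
    # Collapse the spaces left behind by removed tags; keep line breaks.
    lines = [re.sub(r"[ \t]+", " ", line).strip()
             for line in without_tags.splitlines()]
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "<TITLE>Parsers</TITLE>\n<BODY>Free text here.</BODY>"
    print(strip_structure_tags(sample))
```

The cleaned text could then be handed to whichever parser or tagger is
being evaluated. Real newswire markup may need a more careful pattern.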

I realize that there have been a few relevant postings to this mailing list
of late -- in April 1996 a list of parsers was posted in connection with an
investigation into using them on PCs under Windows, and Miles Osborne
posted a response suggesting the Brill parser. Unfortunately, the other
responses were emailed to the original poster only; no copy was sent
to corpora and no summary was posted.

It seems that the topic of "what is a good cheap parser" comes up
periodically, so I would like to volunteer to gather people's
experiences -- good and bad -- and then post them.

Email your thoughts to me if you prefer (to save bandwidth) -- I will
post a summary of responses that I receive via email. If you prefer to
have your comments summarized anonymously, please indicate this
in your email.

Thanks.

Ray Liere
Department of Computer Science
Oregon State University, Corvallis, Oregon, USA
lierer@mail.cs.orst.edu