Corpora: looking for trained WinBrill files

johan.hagman@jrc.it
10 Feb 1999 10:29:26 +0100

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Hi, out there!

I have just joined this list and am posting to it for the first time.

Some time ago I downloaded two Brill tagger packages
(from http://member.nifty.ne.jp/htakashi/dos/ and
ftp://ftp.cs.jhu.edu/pub/brill/Programs/RULE_BASED_TAGGER_V.1.14.tar.Z).

From the latter I just copied the lexicon and rule files
(since I didn't see any on Takashi's site). Now, the problem is the
fairly low-quality result I obtain when trying this kit even on simple
English phrases. For instance, I surprisingly often get 'were'
erroneously tagged as NNP. I suppose the problem is caused by the
contextual rules file that came with the package.

The -i flag is said to produce an intermediate file (which would help
trace how the rules have been applied and thereby make it easier to
modify them or their order), but it doesn't work.

My question to you is whether you know of any better version of these
files (publicly available or "lendable" in exchange for my crediting
whoever is due the credit) that gives better results.

Time is a little too scarce for training the tagger myself. Any files
that perform better than those in this training kit would be welcome!

The languages of interest are ENG, FRE, and GER.

Sorry for bothering you
with this "novice" question

Johan Hagman
johan.hagman@jrc.it
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -