Corpora: Available multi-tagging software.

Lluis Padro (padro@lsi.upc.es)
Thu, 12 Feb 1998 12:44:08 +0100

Over the last few years I have developed disambiguation
software based on the relaxation labelling algorithm,
and used it in my PhD thesis.

The latest version can disambiguate texts along
several dimensions (POS, sense, syntax, ...), either
separately or simultaneously, when provided with
appropriate language models.
The language models are based on Constraint Grammar (CG),
extended so that a numerical value can be associated with
each constraint and linguist-written language models can
be merged with statistical ones.
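
As a rough illustration of the idea (this is not the code of the
package, which is written in C), the sketch below runs a standard
relaxation labelling update over weighted constraints. The constraint
format (target tag, offset, context tag, compatibility), the function
names, and all numerical values are assumptions made up for this
example. Each tag's weight is multiplied by one plus its support from
the context and the weights are then renormalised; the hand-written
and statistical constraints are simply concatenated into one model.

```python
from typing import Dict, List, Tuple

# Hypothetical constraint format: (target_tag, offset, context_tag, compatibility).
# A positive compatibility supports target_tag when the word at the given
# offset carries context_tag; a negative value penalises it.
Constraint = Tuple[str, int, str, float]
Weights = Dict[str, float]


def support(words: List[Weights], i: int, tag: str,
            constraints: List[Constraint]) -> float:
    """Sum the weighted influence of every constraint that applies to (i, tag)."""
    total = 0.0
    for target, offset, context, compat in constraints:
        j = i + offset
        if target == tag and 0 <= j < len(words):
            total += compat * words[j].get(context, 0.0)
    # Clamp to [-1, 1] so that (1 + support) can never become negative.
    return max(-1.0, min(1.0, total))


def relax(words: List[Weights], constraints: List[Constraint],
          max_iter: int = 100, eps: float = 1e-4) -> List[Weights]:
    """Iterate the update until the weights stop changing (or max_iter is hit)."""
    words = [dict(w) for w in words]  # work on a copy
    for _ in range(max_iter):
        updated = []
        for i, weights in enumerate(words):
            raw = {t: p * (1.0 + support(words, i, t, constraints))
                   for t, p in weights.items()}
            norm = sum(raw.values())
            # If every tag was fully penalised, keep the previous weights.
            updated.append({t: v / norm for t, v in raw.items()}
                           if norm > 0 else dict(weights))
        change = max((abs(updated[i][t] - words[i][t])
                      for i in range(len(words)) for t in words[i]),
                     default=0.0)
        words = updated
        if change < eps:
            break
    return words


# Toy input "the can": the second word starts out preferring the modal reading.
sentence = [{"DT": 1.0}, {"MD": 0.6, "NN": 0.3, "VB": 0.1}]

# Invented values: one "statistical" and two "linguist-written" constraints.
statistical = [("NN", -1, "DT", 0.8)]
hand_written = [("MD", -1, "DT", -0.8), ("VB", -1, "DT", -0.8)]
constraints = statistical + hand_written  # merging the two models

result = relax(sentence, constraints)
print(max(result[1], key=result[1].get))  # prints NN
```

The same loop would cover several dimensions at once if the labels
were, for instance, (POS, sense) pairs instead of plain POS tags; that
extension is only suggested here, not shown.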

The source is written in C, lex, and yacc. It can
use WordNet (WN), if available on your system, to
perform sense disambiguation. It runs on UNIX/Linux,
and should work on DOS-based systems with only minor
problems.

The software is packaged with usage examples and
a brief manual. Any questions, observations, or
suggestions (or bugs you might find) are welcome.

You can get it all from my web page,
http://www.lsi.upc.es/~padro, by clicking on the
"research" icon. My thesis and other related
publications are also available there. The papers
and the thesis can also be obtained from
http://xxx.lanl.gov/abs/cmp-lg

Lluis Padro
