Re: Corpora: training Brill's tagger with French

From: Keith J. Miller (keith@mitre.org)
Date: Wed Mar 14 2001 - 16:09:00 MET

    To add support to Jean Véronis' posting, I recently had dealings with both
    the INaLF folks and with the people at Synapse.

    INaLF was more than happy to share the version of Brill's tagger trained for
    French on the signing of a simple agreement. And the Cordial POS tagger is
    all that Jean says and more. It's almost a pity to simply call it a POS
    tagger, because it also gives information about grammatical relationships,
    functions, etc. You can almost extract a parse from its output -- or most
    likely the parts of a parse that you're interested in using in further
    processing. Also, Synapse was receptive to reports of minor difficulties
    with the software, and requests for enhancements, some of which were
    implemented right away.

    Best of luck with your work.

                        ----- Keith J. Miller
                        keith@mitre.org
                        kjmiller@georgetown.edu

    ----- Original Message -----
    From: Jean Veronis <Jean.Veronis@newsup.univ-mrs.fr>
    To: Andre Linden <Andre.Linden@dimail.epfl.ch>; <CORPORA@HD.UIB.NO>
    Sent: Wednesday, March 14, 2001 3:59 AM
    Subject: Re: Corpora: training Brill's tagger with French

    At 09:47 13/03/2001 +0100, Andre Linden wrote:
    >Dear members,
    >
    > We are currently working with Brill's tagger on French texts. We
    >are facing the problem of training the tagger with accented texts and
    >would like to know if anyone already has encountered this problem. We
    >would very much appreciate any feedback on your own experience in this
    >regard.

    An adaptation to French has already been made and can be downloaded:

    http://jupiter.inalf.cnrs.fr/WinBrill/winbrill.bienvenue.html

    There is another tagger that more and more teams use in France, since it
    performs well (probably the best tagger at the moment), and does not
    require any training, hacking, etc. It is commercially distributed, but
    very cheap for research (I think less than USD 100). It is called Cordial
    Analyseur, and is developed by Synapse Development.

    Contact:

    http://www.synapse-fr.com/
    Mr. Dominique LAURENT <dlaurent@synapse-fr.com>

    Jean Véronis
    http://www.up.univ-mrs.fr/~veronis/


