[Corpora-List] Re: celex plus - evaluating morphological analyser

From: Eric Atwell (eric@comp.leeds.ac.uk)
Date: Fri Jul 07 2006 - 17:54:18 MET DST

    Jerry,

    if you want to evaluate your morphological analyser,
    why not take a look at the MorphoChallenge website
    http://www.cis.hut.fi/morphochallenge2005/

    This was a PASCAL network challenge to develop morphological analysers
    for English, Finnish, Turkish, and the website has a standard
    evaluation set, and a Perl script to compare your results against this
    "gold standard" - so you can directly compare your
    precision/recall/F-score against other contestants (see results section).
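
    To give a rough idea of what such a comparison involves, here is a
    minimal Python sketch. It is not the challenge's Perl script; it
    assumes that scoring is done on morpheme boundary positions, that
    "+" marks a boundary, and that the morphs concatenate back to the
    surface word. The function names and the toy data (taken from the
    examples quoted below) are illustrative only.

```python
def boundaries(segmentation, sep="+"):
    """Return the character offsets at which a morpheme boundary falls.

    Assumes the morphs concatenate back to the surface word, so
    "wist+ful" has a single boundary at offset 4.
    """
    positions = set()
    offset = 0
    morphs = segmentation.split(sep)
    for morph in morphs[:-1]:  # no boundary after the final morph
        offset += len(morph)
        positions.add(offset)
    return positions


def score(predicted, gold, sep="+"):
    """Boundary precision, recall and F-score over all gold-standard words.

    A word missing from `predicted` is treated as left unsegmented.
    """
    tp = fp = fn = 0
    for word, gold_seg in gold.items():
        g = boundaries(gold_seg, sep)
        p = boundaries(predicted.get(word, word), sep)
        tp += len(g & p)   # boundaries found in both
        fp += len(p - g)   # proposed but not in the gold standard
        fn += len(g - p)   # in the gold standard but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Toy gold standard: the segmentations Jerry says he would like.
gold = {
    "wrongheadedness": "wrong+head+ed+ness",
    "wistful": "wist+ful",
    "whitening": "whit+en+ing",
}

# Hypothetical analyser output, made up for illustration.
predicted = {
    "wrongheadedness": "wrongheaded+ness",
    "wistful": "wistful",
    "whitening": "whit+en+ing",
}

precision, recall, f_score = score(predicted, gold)
print(f"P={precision:.2f} R={recall:.2f} F={f_score:.2f}")
```

    On this toy data the analyser proposes no wrong boundaries but finds
    only half of the gold ones, so precision is high and recall is low.
    A segmentation such as white+n+ing, whose morphs do not concatenate
    to the surface form, would need an alignment step that this sketch
    leaves out.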

    Eric Atwell, Leeds University

    PS for my attempt to CHEAT in the MorphoChallenge, listen to
    http://www.cis.hut.fi/morphochallenge2005/AtwellKurimo.ppt :-)

    On Mon, 3 Jul 2006, j_kurjian@hotmail.com wrote:

    > Hi all,
    > I was wondering if anyone had a revised celex list, in particular a revised
    > list of the celex words split by morpheme. I was planning to use celex as a
    > gold standard to test my morphological analyzer. However, when I extracted
    > the celex words split by morpheme, I found there were many cases that seem
    > inappropriate for my purpose, e.g.
    > wrongheadedness --> wrongheaded-ness
    > vs. what I'd like: wrong+head+ed+ness
    > wistful --> wistful
    > vs. wist+ful
    > whitening --> whitening
    > vs. white+n+ing or whit+en+ing
    >
    > Thanks!
    > Jerry
    >
    >
    >

    -- 
    Eric Atwell, Senior Lecturer, Language research group, School of Computing,
    Faculty of Engineering, University of Leeds, LEEDS LS2 9JT, England
    TEL: +44-113-3435430  FAX: +44-113-3435468  http://www.comp.leeds.ac.uk/eric
    


