Jerry,
if you want to evaluate your morphological analyser,
why not take a look at the MorphoChallenge website:
http://www.cis.hut.fi/morphochallenge2005/
This was a PASCAL network challenge to develop morphological analysers
for English, Finnish and Turkish. The website has a standard
evaluation set and a Perl script to compare your results against this
"gold standard", so you can directly compare your
precision/recall/F-score against the other contestants (see the results section).
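
To give a rough idea of what such a comparison involves, here is a minimal
Python sketch that scores predicted morpheme boundaries against a gold
standard. It is NOT the challenge's Perl script. The boundary-based scoring,
the "+" separator, the function names and the tiny word lists (taken from the
examples quoted below) are all assumptions made for illustration.

```python
def boundaries(segmentation, sep="+"):
    """Return the character offsets at which a morpheme boundary falls.

    Assumes the morphs concatenate back to the surface word,
    e.g. "whit+en+ing" for "whitening".
    """
    positions = set()
    offset = 0
    morphs = segmentation.split(sep)
    for morph in morphs[:-1]:          # no boundary after the last morph
        offset += len(morph)
        positions.add(offset)
    return positions


def score(gold, predicted, sep="+"):
    """Boundary precision, recall and F-score over all gold-standard words.

    A word missing from `predicted` is treated as left unsegmented
    (an assumption of this sketch).
    """
    hits = n_pred = n_gold = 0
    for word, gold_seg in gold.items():
        g = boundaries(gold_seg, sep)
        p = boundaries(predicted.get(word, word), sep)
        hits += len(g & p)             # boundaries found in both
        n_pred += len(p)
        n_gold += len(g)
    precision = hits / n_pred if n_pred else 0.0
    recall = hits / n_gold if n_gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Hypothetical data: gold follows the segmentations Jerry would like,
# predicted is a made-up analyser output.
gold = {
    "wrongheadedness": "wrong+head+ed+ness",
    "wistful": "wist+ful",
    "whitening": "whit+en+ing",
}
predicted = {
    "wrongheadedness": "wrongheaded+ness",
    "wistful": "wistful",
    "whitening": "white+ning",
}

if __name__ == "__main__":
    p, r, f = score(gold, predicted)
    print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

Counts are pooled over all words before dividing; averaging per-word scores
instead would be an equally plausible choice and gives different numbers.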
Eric Atwell, Leeds University
PS for my attempt to CHEAT in the MorphoChallenge, listen to
http://www.cis.hut.fi/morphochallenge2005/AtwellKurimo.ppt :-)
On Mon, 3 Jul 2006, j_kurjian@hotmail.com wrote:
> Hi all,
> I was wondering if anyone had a revised celex list, in particular a revised
> list of the celex words split by morpheme. I was planning to use celex as a
> gold standard to test my morphological analyzer. However, when I extracted
> the celex words split by morpheme, I found there were many cases that seem
> inappropriate for my purpose, e.g.
> wrongheadedness --> wrongheaded-ness
> vs. what I'd like: wrong+head+ed+ness
> wistful --> wistful
> vs. wist+ful
> whitening --> whitening
> vs. white+n+ing or whit+en+ing
>
> Thanks!
> Jerry
--
Eric Atwell, Senior Lecturer, Language research group,
School of Computing, Faculty of Engineering, University of Leeds,
LEEDS LS2 9JT, England
TEL: +44-113-3435430  FAX: +44-113-3435468
http://www.comp.leeds.ac.uk/eric