RE: [Corpora-List] sentence boundary detectors

From: Scott Songlin Piao (scott.piao@manchester.ac.uk)
Date: Mon Feb 19 2007 - 13:31:40 MET

    Hi Armin,

    I put my English sentence splitter on the website:
    http://text0.mib.man.ac.uk:8080/sentencebreaker/heuristic_tool

    It is a rule-based Java program and is downloadable.

    Cheers

    Scott Piao
    ----------------------------
    Text Mining
    School of Computer Science
    The University of Manchester
    UK

    -----Original Message-----
    From: owner-corpora@lists.uib.no [mailto:owner-corpora@lists.uib.no] On Behalf Of Armin Schmidt
    Sent: 17 February 2007 19:48
    To: corpora@uib.no
    Subject: [Corpora-List] sentence boundary detectors

    Dear list,

    I was wondering if you could point me to good sentence splitters for the
    following languages: German, Russian, Spanish, English. It would be
    great if they were stand-alone programs or modules for Python (Perl
    would be ok, too ... although I'm already aware of the respective
    CPAN-modules for English and German).

    Since I do have corpora in all the above-mentioned languages, I would
    also be very interested in available implementations (not papers) of any
    unsupervised learning methods for detecting sentence boundaries (or
    rather abbreviations).

    Thanks,
    Armin


