RE: [Corpora-List] English-language paraphrase corpora

From: Jayeeta Banerjee (Jayeeta.Banerjee@uce.ac.uk)
Date: Sun Feb 06 2005 - 20:31:00 MET

Next message: Claudia Sassen: "[Corpora-List] Final CfP: Constraints in Discourse"

Previous message: Hinrich Schuetze: "[Corpora-List] research position at the Institute for NLP, Stuttgart, Germany"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Olga,

We have developed an automated system called SHARES (System of Hypermatrix Analysis, Retrieval, Evaluation and Summarisation) that clusters related documents in a general news corpus and ranks them in order of similarity, using a method that builds on past work, including work on lexical cohesion. SHARES identifies topics at a series of levels, and is more linguistically refined in its approach than some other systems. There is a small demo online at www.rdues.uce.ac.uk/sharesguide/, along with a user guide. Such clustering techniques can be applied to a pair of general corpora from different sources to extract the kind of sets you are interested in.
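
As a rough illustration of the general idea only (this is not the SHARES method, whose internals are not described here), articles from two sources could be paired by plain bag-of-words cosine similarity. The function names and the 0.3 threshold below are assumptions chosen for the sketch, not values from any real system.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pair_comparable(source_a, source_b, threshold=0.3):
    """For each article in source_a, keep its best match in source_b
    when the similarity reaches the threshold (an assumed cut-off).
    Returns (index_a, index_b, score) tuples, most similar first."""
    vectors_b = [bag_of_words(doc) for doc in source_b]
    pairs = []
    for i, doc in enumerate(source_a):
        vec = bag_of_words(doc)
        scores = [cosine(vec, vb) for vb in vectors_b]
        if not scores:
            continue
        j = max(range(len(scores)), key=scores.__getitem__)
        if scores[j] >= threshold:
            pairs.append((i, j, scores[j]))
    return sorted(pairs, key=lambda p: p[2], reverse=True)
```

Calling pair_comparable(articles_from_source_1, articles_from_source_2) on two lists of article texts would give candidate pairs describing the same events. A real system would want at least stop-word removal and term weighting, and the threshold would need tuning on the corpora in question.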

Yours

Jay

Jay Banerjee

Research and Development Unit for English Studies

University of Central England, Birmingham

http://www.rdues.uce.ac.uk

        -----Original Message-----
        From: owner-corpora@lists.uib.no on behalf of Olga Shaumyan
        Sent: Mon 31/01/2005 23:06
        To: corpora@uib.no
        Cc:
        Subject: [Corpora-List] English-language paraphrase corpora



        Dear All,

        I am looking for English-language "comparable" corpora. That is, I want,
        for example, two collections of articles from different sources describing
        the same events.

        Alternatively, would anyone know off-hand how one would go about
        constructing such comparable collections?

        (This is to be used for automatic paraphrasing.)

        Any pointers greatly appreciated,

        Olga
        University of Sussex NLP group


This archive was generated by hypermail 2b29 : Sun Feb 06 2005 - 20:56:45 MET