Re: [Corpora-List] Web pages corpus

From: Jakob Halskov (jh.id@cbs.dk)
Date: Mon Mar 06 2006 - 13:21:01 MET


    Dear Imen,

    It is very easy to compile a web corpus on your own using one of the freely available web search APIs. See for example:

    http://developer.yahoo.net/search/index.html

    or

    http://www.google.com/apis/
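
    As a rough illustration of what compiling such a corpus could look like once a list of URLs is in hand, here is a minimal Python sketch. It does not call either search API above; the URL list is assumed to come from whichever API is used. The names `PageSummaryParser`, `extract_record`, `build_corpus` and `fetch_page` are hypothetical, and treating the page's meta description as its "abstract" is only an assumption that may or may not suit the evaluation.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class PageSummaryParser(HTMLParser):
    """Collects the <title>, the meta description, and the visible text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            # Assumption: the meta description stands in for an abstract.
            self.description = (attrs.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())


def extract_record(url, html):
    """Turn one HTML document into a corpus record."""
    parser = PageSummaryParser()
    parser.feed(html)
    parser.close()
    return {
        "url": url,
        "title": parser.title.strip(),
        "abstract": parser.description,
        "text": " ".join(parser.text_parts),
    }


def fetch_page(url, timeout=10):
    """Download a page and decode it, replacing undecodable bytes."""
    with urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def build_corpus(urls, fetch=fetch_page):
    """Keep only pages that could be fetched and that carry an abstract."""
    corpus = []
    for url in urls:
        try:
            html = fetch(url)
        except OSError:  # urllib's URLError is a subclass of OSError
            continue
        record = extract_record(url, html)
        if record["abstract"]:
            corpus.append(record)
    return corpus
```

    Passing the fetch function as a parameter keeps the sketch testable offline; pages without a meta description are simply skipped.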

    Best regards,

    Jakob Halskov

    --
    PhD student
    Dept. of Computational Linguistics
    Copenhagen Business School
    www.id.cbs.dk
    

    ----- Original Message -----
    From: "ismi.touati" <ismi.touati@laposte.net>
    Date: Monday, March 6, 2006 12:29 pm
    Subject: [Corpora-List] Web pages corpus

    > Dear all,
    >
    > I'm working on automatic summarization of web pages, and I'm looking
    > for a corpus of web pages (HTML documents) with their abstracts to
    > evaluate my system.
    > Does anyone know if such a corpus exists?
    >
    > Thanks in advance for the help.
    > Imen.
    >
    > ***********************************
    > Imen Touati
    > Master Student at Faculty of Economic Science and Management of
    > Sfax, Tunisia.
    > LARIS laboratory
    > Address: LARIS, FSEGS, BP 1088, 3018 Sfax, Tunisia


