Re: Corpora:English weather corpus

From: Steven Krauwer (Steven.Krauwer@let.uu.nl)
Date: Fri May 17 2002 - 16:53:03 MET DST

    > Katia Kermanidis wrote:
    >
    > Dear all,
    >
    > Could anyone please inform me if there is an English corpus of
    > weather domain context such as forecasts, announcements etc?

    If you are looking for a clean, annotated corpus I'm afraid
    I can't help you, but what I have on offer is a collection
    of (hourly downloaded) Irish TV teletext weather forecasts
    from October 1996 onwards. Straight, plain ascii downloads,
    including all the errors, typos, etc, but pretty easy to
    process.

    Some 20 million word tokens and some 5400 different word forms
    (isn't it amazing that you need so little to describe all weather
    conditions over a period of 6 years?).
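
For anyone who wants to check token and word-form figures like these on plain ascii downloads, a minimal sketch follows. It assumes a token is a lower-cased run of letters (with optional internal apostrophes); that definition, the function name `type_token_counts`, and the sample sentence are illustrative assumptions, not the method used to obtain the counts above.

```python
import re
from collections import Counter


def type_token_counts(text):
    """Return (number of word tokens, number of distinct word forms)."""
    # Assumption: a token is a run of ASCII letters, optionally with
    # internal apostrophes; case is folded so "Rain" and "rain" match.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    counts = Counter(tokens)
    return len(tokens), len(counts)


# Hypothetical sample text, not taken from the teletext collection.
sample = "Rain in the west. Rain spreading east; showers in the north."
n_tokens, n_types = type_token_counts(sample)
print(n_tokens, n_types)
```

A different tokenisation (keeping digits, splitting on hyphens, not folding case) would give different totals, so counts are only comparable when the same rule is used throughout.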

    Just have a look at http://www-sk.let.uu.nl/mydocs/ttpage.htm
    and contact me if you are interested or need more info.

    Steven

    ______________________________________________________________________
    Steven Krauwer, ELSNET / UiL OTS, Trans 10, 3512 JK Utrecht, Nederland
    phone: +31 30 2536050, fax: +31 30 2536000, email: s.krauwer@let.uu.nl
                        http://www-sk.let.uu.nl


