RE: [Corpora-List] Irish language corpora

From: Adam Kilgarriff (adam@lexmasterclass.com)
Date: Sat Nov 25 2006 - 08:40:42 MET


    Ronan,

    The NCI (New Corpus for Ireland) contains 30M words of Irish from a wide
    range of sources, with an emphasis on contemporary language. It was
    developed by our company, Lexicography MasterClass, and commissioned by
    Foras na Gaeilge (FnG, the "Board for the Irish Language") to support the
    development of a new English-Irish dictionary. I have forwarded your
    enquiry to FnG, who own the data and so decide who is given access.

    I'm just proofreading a paper on it - let me know if you want a copy.

    Adam Kilgarriff
    Lexicography MasterClass Ltd
    Lexical Computing Ltd
    University of Sussex

    -----Original Message-----
    From: owner-corpora@lists.uib.no [mailto:owner-corpora@lists.uib.no] On
    Behalf Of Mike Maxwell
    Sent: 24 November 2006 21:31
    To: CORPORA@UIB.NO
    Subject: Re: [Corpora-List] Irish language corpora

    fitzgerr@aston.ac.uk wrote:
    > I am looking for a corpus of Irish language for some research, but all I
    > seem to be able to find are corpora based on literary texts, predominantly
    > dated from before the 20th Century. For my research purposes, I need a
    > corpus that contains terminology that is as contemporary as possible.

    I presume you've looked at the NCI (New Corpus for Ireland), and that
    it doesn't meet your needs.

    Have you looked at Kevin Scannell's collection
    (http://borel.slu.edu/crubadan/index.html)? Looks like he has a 25M
    word corpus of Irish, which I believe he collected entirely off the web.

    -- 
    	Mike Maxwell
    	maxwell@ldc.upenn.edu
    


