Re: [Corpora-List] Irish language corpora

From: Ciarán Ó Duibhín (ciaran@oduibhin.freeserve.co.uk)
Date: Sun Nov 26 2006 - 00:08:42 MET

    Ronan Fitzgerald wrote:

    > I am looking for a corpus of Irish language for some research, but all I
    > seem to be able to find are corpora based on literary texts, predominantly
    > dated from before the 20th Century. For my research purposes, I need a
    > corpus that contains terminology that is as contemporary as possible.

    Interesting, Ronan. I wonder if your research will bear out my experience
    that contemporary Irish (say, Irish from the last quarter of the 20th
    century), being written overwhelmingly by people belonging to a tradition
    whose first language is English, is heavily based on English semantics, and
    is substantially different (lexicographically, in particular) from what may
    be called "continuity Irish", as transmitted - in three main dialects - by
    native speakers of Irish.

    I have about 3M words of literary continuity Irish from the first half of
    the 20th century (see http://www.smo.uhi.ac.uk/~oduibhin/tobar/index.htm for
    some information) and as you say there are some other corpora of this sort.
    But the differences between this and non-continuity Irish may well not be an
    aspect that concerns you.

    On computing terminology in Irish, I offer some thoughts in the light of the
    continuity/non-continuity divide in
    http://www.smo.uhi.ac.uk/~oduibhin/tearmai/index.htm.

    I hope this is some help, and in any case, good luck with your work.

    Ciarán Ó Duibhín.


