Re: [Corpora-List] Re: Minor(ity) Language (was: 'Standard European English' )

From: Chantal ENGUEHARD (Chantal.Enguehard@univ-nantes.fr)
Date: Wed Mar 08 2006 - 17:23:25 MET


    I use the term "under-resourced language" for languages that have few
    linguistic resources (dictionaries, grammars), and also for languages that
    are poorly supported by computers.
    The number of speakers is not taken into account at all.

    I get the impression that many different terms are appearing that do not
    designate exactly the same concepts. But they are often confused because some
    languages belong to several categories at the same time.
    For instance, an "endangered language" can also be a "minor language" and an
    "under-resourced language".

    Chantal Enguehard (please excuse my poor English)

    Note: [In 2004, Vincent Berment defined in his thesis* an evaluation grid
    for measuring precisely the degree of computerization of any language. This
    grid yields a score (a mark out of 20 points).
    If this score is less than 10 points, the language is said to be a
    pi-language (pi being the Greek letter p).
    If this score is more than 14 points, the language is said to be a
    tau-language (tau being the Greek letter t).
    Otherwise the language is said to be a mu-language (mu being the Greek
    letter m).]
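
    The three categories in the note above can be sketched as a small Python
    function. This is only an illustration of the thresholds as stated in this
    message: the function name and the assumption that scores run from 0 to 20
    are mine, and the evaluation grid that produces the score is not
    reproduced here.

```python
def classify_language(score: float) -> str:
    """Map a computerization score to a category name.

    Hypothetical helper: thresholds are taken from the note above;
    the 0-20 range check is an assumption about the scale.
    """
    if not 0 <= score <= 20:
        raise ValueError("score is assumed to lie between 0 and 20")
    if score < 10:
        # Less than 10 points
        return "pi-language"
    if score > 14:
        # More than 14 points
        return "tau-language"
    # Everything else, i.e. 10 to 14 points inclusive
    return "mu-language"


if __name__ == "__main__":
    for s in (6, 10, 14, 17.5):
        print(s, classify_language(s))
```

    Read literally, "less than 10" and "more than 14" place scores of exactly
    10 and exactly 14 among the mu-languages.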

    * Vincent Berment, "Méthodes pour informatiser des langues et des groupes de
    langues “peu dotées”", doctoral thesis, GETA, Laboratoire CLIPS, IMAG,
    Université Joseph Fourier, 18 May 2004.

    Chantal ENGUEHARD
    LINA
    2, rue de la Houssinière
    BP 92208
    44322 Nantes Cedex 03
    France


