RE: Corpora: Relative text length

From: Damon Davison (allolex@SDF.LONESTAR.ORG)
Date: Thu Apr 25 2002 - 22:12:36 MET DST


    Linguists interested in comparative word length are most likely
    interested in *written* language. In fact, most corpus research is
    based on writing, since with the current state of technology, corpus
    research on speech is difficult. As one example of why word length is
    useful, comparing word lengths across languages can provide a quick
    means of error-checking machine translation output. It is also possible, to
    a certain extent, to characterize languages typologically by word
    length. It may be obvious, but agglutinating languages tend to have
    longer words.
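
    To make the error-checking idea concrete, here is a minimal sketch in
    Python. Everything in it is my own illustration, not an established
    tool: the function names, the letters-only tokenizer, and the
    expected_ratio and tolerance parameters are all assumptions. The
    expected ratio would have to be estimated beforehand from trusted
    parallel text for the language pair in question.

```python
import re
from statistics import mean

# Assumption: a "word" is a run of Unicode letters. Hyphenated forms and
# forms with apostrophes are split, which is crude but language-neutral.
WORD_RE = re.compile(r"[^\W\d_]+")


def mean_word_length(text):
    """Return the mean word length in characters, or 0.0 if no words."""
    words = WORD_RE.findall(text)
    return mean(len(w) for w in words) if words else 0.0


def length_ratio(source, target):
    """Return target/source mean word length, or None if source has no words."""
    src = mean_word_length(source)
    if not src:
        return None
    return mean_word_length(target) / src


def looks_suspicious(source, target, expected_ratio, tolerance=0.5):
    """Flag a translation whose word-length ratio strays from the expected one.

    expected_ratio and tolerance are hypothetical parameters; tolerance is
    a fraction of expected_ratio.
    """
    ratio = length_ratio(source, target)
    if ratio is None:
        return True  # nothing to compare against, so flag it for review
    return abs(ratio - expected_ratio) > tolerance * expected_ratio
```

    A check like this can only flag output for a human to look at. A
    ratio far from the expected value suggests that something went wrong
    (untranslated text, truncation, the wrong target language), but a
    ratio near it says nothing about whether the translation is any good.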

    I don't think it would be wrong to say that linguistics is the study of
    language as a system. (Human beings seem to systematize things quite
    naturally.) Written language also belongs to the system of language.
    In fact, writing systems may even tell you more about the language than
    speech analysis, since written language often contains historical data
    that contributes to our understanding of current language use.

    Damon Davison

    -- 
    Damon Allen Davison
    http://allolex.lonestar.org
    allolex@sdf.lonestar.org
    



    This archive was generated by hypermail 2b29 : Thu Apr 25 2002 - 23:06:46 MET DST