Re: [Corpora-List] Query about nomenclature

From: Chris Brew (cbrew@acm.org)
Date: Sun Mar 06 2005 - 18:26:44 MET


    On Sun, Mar 06, 2005 at 02:56:29PM -0000, John Mckenny wrote:
    >
    > Dear CORPORA subscribers
    > Should it be N-gram/Ngram/n-gram/ngram? Is there a consensus about
    > which of these four to use? Is there a way to measure usage on the Web
    > via www.webcorp.org.uk or other meta-engines?
    >
    > What comes after bigrams, trigrams? Is it 4grams, 5grams etc. or
    > could it be something like quadrigrams, pentagrams, hexagrams? It was
    > pointed out to me that pentagram is a Satanist symbol.

    The sequence could have been

    monogram, digram, trigram, tetragram, pentagram, hexagram, ...

    with fairly uniform (Greek) etymology, but someone chose

    unigram, bigram, trigram, ...

    These look like Latin numerical prefixes, so my guess is that
    the intended extrapolation is

    quadrigram, quintagram, ...

    which replicates the mixed Latin/Greek etymology of bigram through
    the series. Pretty yukky...

    Chris


