[Corpora-List] question about Wordsmith tools (log-likelihood)

From: Luciana Diniz (esllsdx@langate.gsu.edu)
Date: Wed Sep 20 2006 - 22:50:07 MET DST

    Hello!

    I'm trying to make sense of the log likelihood formula (in the Wordsmith
    Tools manual), and I'm not sure what "d" means in:

    "d := frequency of pairs involving neither w1 nor w2"

    Does it mean the frequency of all possible collocates (with a span of
    1:1) minus the frequency of word 1 (isolated frequency) minus the
    frequency of word 2 (isolated frequency)?
    If this is the case, would "d" be very close to the total number of
    words in the corpus?

    Also, if this is the case, what if I choose a different span? Would this
    change the value of "d"?
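
    The question can be made concrete with a small sketch. It assumes the
    usual 2x2 contingency table behind the log-likelihood statistic, with
    a = pairs involving both w1 and w2, b = pairs with w1 but not w2,
    c = pairs with w2 but not w1, and d = everything else. The function
    names and the total_pairs parameter are illustrative and are not taken
    from the Wordsmith manual; how Wordsmith itself counts the total number
    of pairs for a given span is an assumption here.

```python
import math


def contingency(pair_freq, w1_freq, w2_freq, total_pairs):
    """Build the 2x2 cells (a, b, c, d) from raw counts.

    Assumption: total_pairs is the number of pairs counted in the corpus
    for the chosen span; it is a hypothetical parameter, not a Wordsmith name.
    """
    a = pair_freq                 # pairs involving both w1 and w2
    b = w1_freq - pair_freq       # pairs involving w1 but not w2
    c = w2_freq - pair_freq       # pairs involving w2 but not w1
    d = total_pairs - a - b - c   # pairs involving neither w1 nor w2
    return a, b, c, d


def log_likelihood(a, b, c, d):
    """G2 = 2 * sum(observed * ln(observed / expected)) over the four cells."""
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    total = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / n
                total += o * math.log(o / expected)
    return 2.0 * total


# Made-up counts, purely for illustration.
a, b, c, d = contingency(pair_freq=10, w1_freq=100, w2_freq=50, total_pairs=10000)
score = log_likelihood(a, b, c, d)
```

    Under this reading, d is the total minus a, b and c, so for rare words
    it is indeed close to the total, and it would change with the span only
    insofar as the span changes the counts that go into the table.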

    I'm very confused and I'd really appreciate it if somebody could help me
    :)

    Thank you!
    Luciana.


