Corpora: Statistics for multiword units

Su'ad Awab (s.awab@lancaster.ac.uk)
Sun, 21 Mar 1999 13:57:19 +0000 (GMT)

I'm looking for a statistic (or statistics) to test the significance of
multiword units within ONE corpus, not in comparison with another corpus.
I am aware of the log-likelihood formula for bigrams (Dunning 1993), but I
am interested in other n-grams as well.
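
For reference, the bigram case mentioned above can be sketched roughly as
follows. This is only a minimal illustration of the log-likelihood ratio
(G2) computed over a 2x2 contingency table of observed against expected
counts, under the assumption that the four counts have already been
extracted from the corpus. The function and argument names are
illustrative assumptions, not taken from Dunning's paper or from any
particular toolkit.

```python
import math


def _cell_term(observed, expected):
    # Contribution of one contingency-table cell: O * ln(O / E).
    # A cell with zero observations contributes nothing (0 * ln 0 := 0).
    return observed * math.log(observed / expected) if observed > 0 else 0.0


def bigram_log_likelihood(c12, c1, c2, n):
    """Log-likelihood ratio (G2) for a bigram "w1 w2" (hypothetical helper).

    c12: number of times w1 is immediately followed by w2
    c1:  number of bigrams whose first word is w1
    c2:  number of bigrams whose second word is w2
    n:   total number of bigrams in the corpus
    """
    # Observed 2x2 table, flattened.
    observed = [
        c12,                # w1 followed by w2
        c1 - c12,           # w1 followed by some other word
        c2 - c12,           # some other word followed by w2
        n - c1 - c2 + c12,  # neither w1 first nor w2 second
    ]
    # Expected counts if w1 and w2 occurred independently of each other.
    expected = [
        c1 * c2 / n,
        c1 * (n - c2) / n,
        (n - c1) * c2 / n,
        (n - c1) * (n - c2) / n,
    ]
    return 2.0 * sum(_cell_term(o, e) for o, e in zip(observed, expected))
```

A score of zero means the bigram occurs exactly as often as independence
would predict; larger scores indicate stronger evidence of association.
This sketch covers only the two-word case, so it does not by itself answer
the question for longer n-grams.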

Any help (formulas, pointers to the literature, etc.) would be greatly
appreciated.

Su'ad
------------------------------------------------------------------------

Su'ad Awab
Dept of Linguistics
Lancaster University
Lancaster LA1 4YT
England, UK.

Email: s.awab@lancaster.ac.uk
-------------------------------------------------------------------------