[Corpora-List] Re: Chi-Square

From: FIDELHOLTZ_DOOCHIN_JAMES_LAWRENCE (jfidel@siu.buap.mx)
Date: Sun Sep 17 2006 - 15:00:44 MET DST

    Hi, Crayton,

    I'm no expert in collocations, but obviously their significance is tested
    on the basis of a fairly large corpus, where many of the cells of the
    contingency table will have well over 100 occurrences. This is over the
    limit where chi squared gives useful results. This is basically because the
    formula for chi squared involves (oversimplifying a bit) a quantity squared
    (thus the name), namely the difference between the observed and the
    expected count, divided by the expected count, which increases only
    linearly. So if every count in the table grows tenfold while the
    proportions stay the same, chi squared grows tenfold too. To put it
    simply, where the cells contain numbers much over 100, you are virtually
    guaranteed that chi squared will produce 'significant' results (usually
    defined as p < .05; that is, if there were really no association, a table
    at least this lopsided would turn up by chance less than one time in
    twenty). Obviously, this makes this
    particular test of very little use, since almost everything you test for
    under those circumstances comes out 'significant'. Other, more
    sophisticated, statistical tests tend not to be affected by large numbers in
    the cells, in the sense of becoming more likely to produce 'significance',
    and therefore are more suitable for calculating significance in situations
    where the numbers are large.
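
    To see the effect of large cell counts concretely, here is a minimal
    sketch in Python. The counts are invented for illustration, the function
    name chi_squared_2x2 is hypothetical, and no continuity correction is
    applied. It computes the plain Pearson statistic for a 2x2 table and
    then multiplies every cell by 10 and by 100 without changing the
    proportions.

```python
def chi_squared_2x2(table):
    """Pearson chi-squared statistic for a 2x2 table (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Critical value of chi squared for p < .05 with 1 degree of freedom.
CRITICAL_05 = 3.841

# Invented counts: a 60/40 split in each row, mirrored between rows.
base = [[12, 8], [8, 12]]

for k in (1, 10, 100):
    scaled = [[k * x for x in row] for row in base]
    stat = chi_squared_2x2(scaled)
    print(k, round(stat, 2), stat > CRITICAL_05)
```

    The split between the cells stays at 60/40 throughout, yet the statistic
    goes from 1.6 (below the 3.841 cutoff) to 16 and then to 160, so the same
    proportions come out 'significant' as soon as the counts are large.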

    We hear a lot about chi squared because it is a relatively easy test to
    apply, and if the numbers are lowish (under 100) but not too low (over 4 or
    5), the test usually gives sensible results.

    Jim

    Crayton Walker wrote:

    > A simple question about statistical measures.
    >
    > Could someone explain in very simple terms why we don't normally use
    > Chi-square as a measure of collocational significance? We tend to use
    > t-score and MI and not Chi-square. Why not? I am not a mathematician so
    > would appreciate it if you could keep it simple.
    >
    > Many thanks
    >
    > Crayton Walker
    >
    > University of Birmingham
     

    James L. Fidelholtz
    Posgrado en Ciencias del Lenguaje, ICSyH
    Benemérita Universidad Autónoma de Puebla MÉXICO


