Re: [Corpora-List] Dice coefficient

From: Bengt Dahlqvist (bengt.dahlqvist@ling.uu.se)
Date: Mon Apr 24 2006 - 15:17:24 MET DST


    At 09:50 2006-04-19, Markus Saers wrote:
    >The only definition of the Dice coefficient that I have seen looks like this:
    >
    >Dice = 2 * p(ws, wt) / ( p(ws) + p(wt) )

    The Dice index can also be computed from a 2x2 contingency table:
             x=1   x=0
    y=1       a     b
    y=0       c     d

    Here the Dice (1945) index is defined as
    Dice = 2*a / (2*a + b + c)
    This form can be easier to compute when raw co-occurrence counts are at hand rather than probability estimates.
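
    To check that this agrees with the probability form quoted above, here is a short worked derivation. It assumes the table holds counts over N = a + b + c + d observations, and that x stands for ws and y for wt (this mapping is an assumption, not something stated in the thread):

        p(ws, wt) = a / N
        p(ws)     = (a + c) / N
        p(wt)     = (a + b) / N

        Dice = 2 * (a/N) / ( (a+c)/N + (a+b)/N )
             = 2*a / ( (a+c) + (a+b) )
             = 2*a / (2*a + b + c)

    N cancels, so the cell d never enters the result.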

    In the same manner, the Jaccard (1908) index is defined as
    Jaccard = a/(a+b+c)

    and, for example, the Ochiai (1957) index is defined as
    Ochiai = a / (sqrt(a+b) * sqrt(a+c))

    The literature gives a wealth of other indices as well.
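
    As a minimal sketch, the three indices above could be computed from the table cells as follows. The function name association_indices and the choice to return 0.0 when a = 0 are assumptions made for illustration, not part of any of the original definitions.

```python
import math


def association_indices(a, b, c):
    """Return Dice, Jaccard and Ochiai for 2x2 table cells a, b, c.

    The cell d (neither x nor y present) is not used by any of the three.
    """
    if a == 0:
        # Assumed convention: no co-occurrence gives 0.0 for every index.
        # This also avoids dividing by zero when a row or column is empty.
        return {"dice": 0.0, "jaccard": 0.0, "ochiai": 0.0}
    return {
        "dice": 2 * a / (2 * a + b + c),
        "jaccard": a / (a + b + c),
        "ochiai": a / (math.sqrt(a + b) * math.sqrt(a + c)),
    }


result = association_indices(a=10, b=5, c=5)
print(result)
```

    Dice and Jaccard always rank pairs in the same order, since Jaccard = Dice / (2 - Dice) follows directly from the two definitions.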

    --
    /Bengt Dahlqvist
    


