Re: [Corpora-List] comparing two IR systems using statistical tests?

From: Mark Sanderson (m.sanderson@sheffield.ac.uk)
Date: Tue Oct 11 2005 - 20:24:46 MET DST

    For statistical tests, I suggest you use a paired,
    two-tailed t-test on the per-topic scores of the two
    systems, measured with Average Precision (whose mean
    over topics is Mean Average Precision) or with Precision
    at rank 10. Other researchers use the Wilcoxon
    signed-rank test, which is also a good choice.
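
As a rough illustration of the suggestion above, here is a minimal Python sketch using only the standard library. It assumes the TREC result lists and relevance judgements have already been parsed into per-topic ranked lists of document IDs and sets of relevant IDs; the function names (`precision_at_k`, `average_precision`, `paired_t_test`) and the example scores are hypothetical and not taken from the paper. The p-value is obtained by numerically integrating the Student t density, which is a convenience for this sketch only; a statistics package (for example SciPy's `ttest_rel` and `wilcoxon`) would normally be used instead.

```python
import math
import statistics


def precision_at_k(ranked_doc_ids, relevant_ids, k=10):
    """Fraction of the top-k ranks holding a relevant document (always divides by k)."""
    top_k = ranked_doc_ids[:k]
    return sum(1 for doc_id in top_k if doc_id in relevant_ids) / k


def average_precision(ranked_doc_ids, relevant_ids):
    """Sum of precision at each relevant hit, divided by the number of relevant docs."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant_ids)


def _t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coeff = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coeff = math.exp(log_coeff) / math.sqrt(df * math.pi)
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def paired_t_test(scores_a, scores_b, steps=2000):
    """Return (t, df, two-tailed p) for paired per-topic scores of two systems.

    scores_a[i] and scores_b[i] must refer to the same topic.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems need one score per topic")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two topics")
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("all per-topic differences are identical")
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    df = n - 1
    # Simpson's rule for the area under the t density on [0, |t|];
    # steps must be even.
    upper = abs(t)
    h = upper / steps
    total = _t_pdf(0.0, df) + _t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * _t_pdf(i * h, df)
    area = total * h / 3
    p_two_tailed = max(0.0, 1.0 - 2.0 * area)
    return t, df, p_two_tailed


if __name__ == "__main__":
    # Made-up per-topic Average Precision scores for two systems (illustration only).
    system_a = [0.5, 0.6, 0.7, 0.8, 0.9]
    system_b = [0.4, 0.4, 0.6, 0.5, 0.8]
    t, df, p = paired_t_test(system_a, system_b)
    print(f"t = {t:.3f}, df = {df}, two-tailed p = {p:.4f}")
```

The key design point is the pairing: the test is run on the per-topic differences between the two systems, not on the two overall means, so both score lists must be in the same topic order.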

    I co-wrote a paper on statistical tests that
    appeared at this summer's SIGIR. You can get it here:

             http://dis.shef.ac.uk/mark/cv/publications/papers/my_papers/SIGIR2005.pdf

    At 15:09 11/10/2005, Chris Jordan wrote:
    >Not to plug my own thesis work or anything :P
    >
    >I have been exploring the use of automatic
    >query generation based on relative entropy to
    >manufacture controlled environments for
    >evaluating retrieval algorithms. I am preparing
    >a paper for ECIR on further controlled
    >environments that I have been experimenting with,
    >which have produced some interesting findings. I can
    >pass that along to you if it gets accepted
    >(knock on wood), but for now I can give you ready access to a PDF of my thesis.
    >
    >
    >Timad Kahena wrote:
    >
    >>Hi,
    >>Based on two (or more) result lists (in TREC
    >>format), how can we compare two IR systems
    >>using statistical tests? Could someone show me how to do that?
    >>Examples and tutorial links are welcome.
    >>
    >>Thank you,
    >>Timad

    ____________________________________________________________________
    Mark Sanderson, Room 303 Tel: +44 (0) 114 22 22648
    Department of Information Studies Fax: +44 (0) 114 27 80300
    University of Sheffield, Regent Court, mailto:m.sanderson@shef.ac.uk
    Portobello St, Sheffield, S1 4DP, UK http://dis.shef.ac.uk/mark/
    ____________________________________________________________________
    Good judgement is from experience, experience is from bad judgement


