Re: [Corpora-List] calculation problem

From: Marco Baroni (baroni@sslmit.unibo.it)
Date: Thu Oct 20 2005 - 19:20:41 MET DST


    Dear Alexander,

    I'm a bit confused...

    > if you assume that occurrences in your corpus are distributed uniformly
    > (actually the simplest probability distribution ever), you can take this 100
    > number
    >
    > Otherwise, if you use another distribution that better describes behaviour
    > of the occurrences it will influence the number of occurrences in the 1
    > million corpus and will probably not be 100.
    >

    Isn't the problem rather one of (non-random) sampling, and not a matter of
    the assumed distribution (which, as far as I can tell, is not assumed to be
    uniform)?
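
    To make the distinction concrete, here is a minimal sketch with a toy
    corpus. All names and numbers in it are made up for illustration and are
    not taken from the original question. The word "x" is deliberately
    clustered at the start of the text, so it is far from evenly spread; a
    random sample of tokens should still give a per-million rate close to the
    true one, while a contiguous (non-random) slice can miss the word
    entirely.

```python
import random


def rate_per_million(tokens, target):
    """Return the number of occurrences of `target` per million tokens."""
    return 1_000_000 * sum(t == target for t in tokens) / len(tokens)


def build_clustered_corpus(size=100_000, hits=1_000):
    # Hypothetical toy corpus: every occurrence of "x" sits at the very
    # beginning, so the word is not evenly spread over the text.
    return ["x"] * hits + ["other"] * (size - hits)


corpus = build_clustered_corpus()
rng = random.Random(0)  # fixed seed so the sketch is repeatable

random_sample = rng.sample(corpus, 10_000)  # random sample of tokens
contiguous_sample = corpus[-10_000:]        # non-random: last 10% only

true_rate = rate_per_million(corpus, "x")
random_rate = rate_per_million(random_sample, "x")
contiguous_rate = rate_per_million(contiguous_sample, "x")

print(f"true rate:        {true_rate:.0f} per million")
print(f"random sample:    {random_rate:.0f} per million")
print(f"contiguous slice: {contiguous_rate:.0f} per million")
```

    In this toy setup the extrapolation goes wrong because of how the sample
    was drawn, not because of any assumption about how the word is
    distributed over the text.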

    Regards,

    Marco


