> Chi-square is appropriate _if_ there is enough data about the
> events which you are interested in. Suppose that the base rate for
> an event e1 is p(e1), that the base rate for e2 is p(e2) and that you
> have a sample of N events. Chi-square is usually OK if all of these hold:
>
> p(e1)*p(e2)*N > 5
> (1-p(e1))*p(e2)*N > 5
> p(e1)*(1-p(e2))*N > 5
> (1-p(e1))*(1-p(e2))*N > 5
>
> In your example you are looking at all pairs of a form of "bring" and its
> object.
> e1 is the observation that the verb form is passive, e2 is the event that the
> object is "charges", N is the number of pairs involving any form of "bring"
> and any object. If you have 1000 pairs involving "brought" of which 300 are
> passive voice, p(e1) is 0.3 and (1-p(e1)) is 0.7. You might have 100 instances of
> "charges" as the object and 900 of some other object. Then p(e2) is 0.9
> and the expected cell counts are
I hope I don't sound horribly picky, but with 100 instances of "charges" out of
1000 pairs, you mean p(e2) is 0.1 and (1-p(e2)) is 0.9. Or don't you?
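
For what it's worth, here is a minimal sketch of the arithmetic with the
figures from the example, assuming p(e1) = 300/1000 and p(e2) = 100/1000.
The function names and the dictionary keys are my own illustrative choices,
not anything from the original post; the threshold of 5 is the rule of thumb
quoted above.

```python
def expected_cell_counts(p_e1, p_e2, n):
    """Expected counts for the four cells of a 2x2 table,
    assuming e1 and e2 are independent."""
    return {
        ("e1", "e2"): p_e1 * p_e2 * n,
        ("not e1", "e2"): (1 - p_e1) * p_e2 * n,
        ("e1", "not e2"): p_e1 * (1 - p_e2) * n,
        ("not e1", "not e2"): (1 - p_e1) * (1 - p_e2) * n,
    }


def chi_square_ok(p_e1, p_e2, n, threshold=5):
    """True if every expected cell count exceeds the threshold."""
    counts = expected_cell_counts(p_e1, p_e2, n)
    return all(count > threshold for count in counts.values())


# Figures from the example: 1000 pairs, 300 passive, 100 with "charges".
counts = expected_cell_counts(p_e1=300 / 1000, p_e2=100 / 1000, n=1000)
for cell, count in counts.items():
    print(cell, round(count, 2))
print("chi-square usually OK:", chi_square_ok(0.3, 0.1, 1000))
```

With p(e2) = 0.1 the expected counts come out at roughly 30, 70, 270 and 630,
so all four cells clear the threshold comfortably.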
Marco Rocha
Marco A E Rocha
University of Sussex
School of Cognitive and Computing Sciences
Falmer, Brighton
BN1 9QH - U.K.
tel.: +44 +01273 678052
fax: +44 +01273 671320
e-mail: marco@cogs.susx.ac.uk