Re: Corpora: T-score in collocational analysis

Tony Berber Sardinha (tony4@uol.com.br)
Fri, 10 Dec 1999 13:29:50 -0200

Hi

I can't find

> Clear, J 1995, 'COBUILD Bank of English explanation of stats'. Collins
> COBUILD Collocation Concordancer

on

> http://titania.cobuild.collins.co.uk/form.html

Any other addresses where this might be available?

cheers
tony.
-------------------------------------
Dr Tony Berber Sardinha
Catholic University of Sao Paulo, Brazil
tony4@uol.com.br
http://sites.uol.com.br/tony4/homepage.html
http://homepages.infoseek.com/~corpuslinguistics/homepage.html

-----Original Message-----
From: Gordon and Pam Cain <gpcain@rivernet.com.au>
To: Przemyslaw Kaszubski <przemka@main.amu.edu.pl>; Corpora-L
<corpora@hd.uib.no>
Sent: Thursday, 9 December 1999 08:38
Subject: Re: Corpora: T-score in collocational analysis

> Przemyslaw--
>
> Przemyslaw Kaszubski wrote:
> >
> > Regards to all the subscribers,
> >
> > Two questions:
> >
> > 1. Can anyone explain (or point to a Web source or otherwise easily
> > available source apart from Church, K.W., W. Gale, P. Hanks & D. Hindle,
> > "Using Statistics in Lexical Analysis", in Lexical Acquisition: Using
> > On-Line Resources to Build a Lexicon. Ed. Uri Zernik. Hillsdale:
> > Lawrence Erlbaum, 1991)
> > the use of the t-score statistic in collocation retrieval? I mean the
> > one used by Cobuild. How does the formula work? I am familiar with
> > MI and Z-scores but the t-score seems to be
> > in use only in the CobuildDirect service.
> >
>
> Try Jeremy Clear's explanation from the Cobuild site of the T-score (and
> the MI I think). The address I gave it in my biblio is:
>
> Clear, J 1995, 'COBUILD Bank of English explanation of stats'. Collins
> COBUILD Collocation Concordancer
> http://titania.cobuild.collins.co.uk/form.html
> (accessed 24th April, 1999).
> It's the most clear and accessible that I've found.
>
>
> Church and Hanks also wrote:
> Church, KW, and P Hanks 1990, 'Word association norms, mutual
> information, and lexicography', Computational Linguistics vol 16, no 1
> (March 1990), 22-29
>
>
> You might also try:
> Godby, J 1994(?), 'Two techniques for the identification of phrases in
> full text'
> http://www.oclc.org/oclc/research/publications/review94/part1/twotech.htm
> (Accessed 15th July, 1998).
>
> I don't remember much about it, but think it was related.
>
>
> > 2. Do you know of corpus analysis
> > packages available for researchers that employ this t-score?
>
> Am attaching part of a posting by Oliver Mason from earlier this year --
> I think it uses seven(!) different scores for collocations, and was
> developed by the Cobuild lot, so I'm sure it would offer the T-score!
>
> Oliver Mason wrote:
> . . .I am pleased to announce the release of a corpus browser called
> `Qwick', which is now available for download from our website at
>
> http://www.clg.bham.ac.uk/QWICK/index.html.
>
> Qwick allows you to
> construct a working corpus from a set of corpora available on the
> computer, retrieve concordance lines from this using a simple but
> powerful query language, and to compute collocations with a variety of
> adjustable parameter settings.
>
> Qwick is implemented in Java and thus is fully platform independent; it
> has been extensively tested on Windows and Solaris. . .
>
> >
> > I do small corpus research and I am basically after a tool with a statistic
> > that does not favour rare words as much as the MI does. So far TACT's z-scores
> > seem the best option.
> >
> > Przemek Kaszubski
> > ==========================================
> > Przemyslaw Kaszubski, M.A.
> > przemka@amu.edu.pl http://elex.amu.edu.pl/ifa/staff/kaszubski.html
> > MY (ENGLISH) (LEARNER) CORPORA PAGE: http://main.amu.edu.pl/~przemka
> > School of English, Adam Mickiewicz University
> > Al. Niepodleglosci 4, 61-874 Poznan, POLAND
> > tel: +48 61 8528820 fax: +48 61 8523103
> > ==========================================
>
> --
> Gordon Cain
> Teacher of ESOL
> TAFE International Education Centre
> Liverpool (Sydney) Australia
> gpcain@rivernet.com.au
>
>
>
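
The contrast raised in the thread, that MI favours rare words more than the t-score does, can be illustrated with a small sketch. It assumes the formulation commonly used in corpus linguistics: expected co-occurrence E = f(node) * f(collocate) * span / N, MI = log2(O / E), and t = (O - E) / sqrt(O). Whether CobuildDirect or Qwick compute exactly this is not confirmed anywhere in the thread, and the function names and the frequency figures below are hypothetical, chosen only for illustration.

```python
import math


def expected(f_node, f_coll, n_tokens, span=1):
    """Expected co-occurrence count if the two words were independent."""
    return f_node * f_coll * span / n_tokens


def mi_score(observed, f_node, f_coll, n_tokens, span=1):
    """Mutual information: log2 of observed over expected co-occurrences."""
    if observed <= 0:
        raise ValueError("observed must be positive")
    return math.log2(observed / expected(f_node, f_coll, n_tokens, span))


def t_score(observed, f_node, f_coll, n_tokens, span=1):
    """t-score: (observed - expected) scaled by sqrt(observed)."""
    if observed <= 0:
        raise ValueError("observed must be positive")
    e = expected(f_node, f_coll, n_tokens, span)
    return (observed - e) / math.sqrt(observed)


# Hypothetical counts in a 1,000,000-token corpus (not real data).
rare = dict(observed=4, f_node=1000, f_coll=5, n_tokens=1_000_000)
frequent = dict(observed=200, f_node=1000, f_coll=20000, n_tokens=1_000_000)

for label, counts in (("rare", rare), ("frequent", frequent)):
    print(label, round(mi_score(**counts), 2), round(t_score(**counts), 2))
```

With these made-up counts, MI ranks the rare pair above the frequent one, while the t-score ranks them the other way round, because dividing by sqrt(O) keeps a pair seen only a handful of times from scoring highly.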