> Within the past month or so I posted two queries to CORPORA, one dealing
> with creating a tagger using relational databases, and the other with
> the frequency of phrasal verbs in English. I received a number of
> replies, which are found at the following URL:
>
> http://davies-linguistics.byu.edu/responses/corpora1.htm
As a last-minute response to the question of phrasal verb frequency, I have
compiled a list of verb-particle constructions extracted from the written
portion of the BNC, along with frequencies and valence counts for each. The
list is downloadable from:
mwe.stanford.edu/resources/
along with a link to a slightly outdated description of the extraction
technique used. The frequencies are almost certainly underestimates, as I was
more interested in precision than recall when I put this data together, and
the valence judgements should be taken with a pinch of salt. For reference,
this is the data reported in:
Villavicencio, Aline (2003) Verb-Particle Constructions and Lexical Resources,
In Proceedings of the ACL-2003 Workshop on Multiword Expressions: Analysis,
Acquisition and Treatment, Sapporo, Japan.
Baldwin, Timothy, Colin Bannard, Takaaki Tanaka and Dominic Widdows (2003) An
Empirical Model of Multiword Expression Decomposability, In Proceedings of the
ACL-2003 Workshop on Multiword Expressions: Analysis, Acquisition and
Treatment, Sapporo, Japan, pp. 89-96.
Bannard, Colin, Timothy Baldwin and Alex Lascarides (2003) A Statistical
Approach to the Semantics of Verb-Particles, In Proceedings of the ACL-2003
Workshop on Multiword Expressions: Analysis, Acquisition and Treatment,
Sapporo, Japan, pp. 65-72.
I hope this helps,
Tim