Corpora: Summary: English Verb Valency Question

Wed, 2 Sep 1998 16:16:28 +0300

Thanks to everyone who answered my question.
----------------------------------------------------------------------------
Dear All

Can you tell me something about electronic dictionaries where I can find the
most complete information about English verb valency? I would really
appreciate it if you could point me in the right direction for my search.
----------------------------------------------------------------------------
----------------------------------------------------------------------------

Hi,

A good source to look at is the CELEX lexical database (available from LDC
or ELRA).
It contains 52000 lemmata with many morphological & syntactic features.

By the way, what kind of organization/company is 'invention-machine'??

Luc Mortier.

---------------------------------------------------------------------------
---------------------------------------------------------------------------
You may get very similar answers from everyone, but here's one quick
version:

1. FrameNet (Charles Fillmore et al. - UC Berkeley): encoding verbs (and
other POS) according to valency (actually Case) frames. This looks like it
will be very useful when it comes out, but it is still in the development
phase. They expect to release about 1000 verbal entries by the end of the
year, so that people in the NLP community can begin to play with them.
(See http://www.icsi.berkeley.edu/~framenet/index.html for details.)

2. COMLEX corpus (distributed by the Linguistic Data Consortium - LDC):
contains detailed syntactic information, and is available now. Of course,
there is a cost involved if your organization is not a member of the LDC.
For general information on COMLEX, see
http://cs.nyu.edu/cs/projects/proteus/comlex/. For LDC COMLEX pricing and
distribution information, see
http://www.ldc.upenn.edu/ldc/catalog/html/lexical_html/comlexesv.html.

The FrameNet people are talking about providing links from FrameNet to both
COMLEX and to WordNet, but that of course won't help until FrameNet is
released.

Hope this gives you some initial direction for your search.

Keith J. Miller

----------------------------------------------------------------------------
----------------------------------------------------------------------------

Alexander,

in the 80's, many English NLP researchers adopted the Longman Dictionary of Contemporary English (LDOCE) as an NLP lexical database, because it was designed for (human) English language learners, and many of its special design features were also thought useful for machine 'learning'. In particular, it had a very detailed word-class categorization scheme, especially for verbs, with separate categories:

D: ditransitive
I: intransitive
L: linking verb with complement
T: transitive verb with one object
V: verb with one object + verb form
X: verb with one object + something else
W: special cases
   Wv1 = be
   Wv2 = auxiliary v
   Wv3 = no schwa in -ing form
   Wv4 = used in -ing form as adj
   Wv5 = used in -ed form as adj
   Wv6 = not used in -ing form
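For anyone who wants to use these labels in a program, here is a minimal sketch of the scheme as a lookup table. It covers only the codes listed in this message, not the full LDOCE coding system; the names VERB_CODES, SPECIAL_CODES and describe_code are made up for illustration, and the way codes are actually stored in the electronic LDOCE may differ.

```python
# Verb categories as listed in this message (LDOCE, 1978 edition).
# This is not the complete LDOCE scheme, only what is described above.
VERB_CODES = {
    "D": "ditransitive",
    "I": "intransitive",
    "L": "linking verb with complement",
    "T": "transitive verb with one object",
    "V": "verb with one object + verb form",
    "X": "verb with one object + something else",
}

# The W "special cases" from the same list.
SPECIAL_CODES = {
    "Wv1": "be",
    "Wv2": "auxiliary verb",
    "Wv3": "no schwa in -ing form",
    "Wv4": "used in -ing form as adjective",
    "Wv5": "used in -ed form as adjective",
    "Wv6": "not used in -ing form",
}


def describe_code(code):
    """Return the description for one code, or None if it is not listed here."""
    code = code.strip()
    if code in SPECIAL_CODES:
        return SPECIAL_CODES[code]
    return VERB_CODES.get(code)


if __name__ == "__main__":
    # A made-up set of labels, not taken from a real dictionary entry.
    for label in ["T", "Wv5", "Q"]:
        print(label, "->", describe_code(label))
```

Unknown labels come back as None rather than raising an error, so the table can be extended gradually as more of the real coding scheme is checked against the dictionary itself.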

However, I gather that feedback from (human) English language learners was negative - they found these detailed labels confusing, so Longman decided to simplify the grammatical categories in later editions. The electronic version of LDOCE we have is of the 1978 edition; I also have a hard-copy version of the 3rd edition, 1995, but they have done away with the grammar-label details in this, leaving only the distinction between T (transitive) and I (intransitive).

You could try contacting Addison Wesley Longman (now Longman has been taken over!) direct to ask for an old 1978 version - presumably you want the electronic version? Cindy Leaney was the Electronic Dictionaries Publisher who gave me the 3rd Ed LDOCE back in 1996, try her on +44 1279 623623; or try the Director of ELT dictionaries, Della Summers, +44 01279 623463, della.summers@awl.co.uk - she gave me the LDOCE 1978 version originally, and probably knows where to get one now (and how much it will cost you...!)

good luck,

Eric Atwell, Senior Lecturer in Artificial Intelligence, SOCRATES Coordinator,
and Director, Centre for Computer Analysis of Language And Speech (CCALAS)
School of Computer Studies, University of Leeds, LEEDS LS2 9JT, England
EMAIL: eric@scs.leeds.ac.uk
TEL: (44)113-2335761
FAX: (44)113-2335468
WWW: http://www.scs.leeds.ac.uk/scs/public/staff/eric.html

PS - I've just found http://www.awl-elt.com/dictionaries/resldoce1.html - read this for how to get 1978 LDOCE, see links for other Longman electronic dictionaries