Has anyone seen any work on reducing dictionary-style definitions to
simple(r) glosses?
For example, the definition
    act or process of shrinking, esp in wood; shrinkage.
might reduce to 'shrinkage', and
    bother; disturbance or interruption.
might similarly reduce to any one of the three content words. In some
cases, more than one word might be output:
    to carry a canoe
should probably reduce to 'carry canoe', not just 'carry' or 'canoe'.
I can think of some heuristics, e.g. choose the least common word (in some
sense of 'common'), but if the chosen word is the object of a verb, retain
the verb as well. (That requires some parsing. Fortunately, I suspect that
verbs in English definitions are usually preceded by the word 'to', so
distinguishing verbs from nouns should not be all that difficult.)
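
To make the heuristic concrete, here is a rough sketch in Python. Everything
in it is an assumption for illustration: the frequency table TOY_FREQ is made
up (a real system would take counts from a corpus), the stopword list is
ad hoc, and the "parsing" is nothing more than checking whether the
definition starts with 'to'. It also treats whatever rare word it picks in a
'to' definition as the verb's object, which will certainly be wrong some of
the time.

```python
import re

# Hypothetical frequencies, invented for this sketch; higher = more common.
TOY_FREQ = {
    "act": 500, "process": 400, "shrinking": 5, "wood": 80, "shrinkage": 2,
    "bother": 30, "disturbance": 20, "interruption": 15,
    "carry": 200, "canoe": 10,
}

# Ad hoc list of function words and dictionary abbreviations to ignore.
STOPWORDS = {"a", "an", "the", "of", "or", "and", "in", "to", "esp"}


def gloss(definition, freq=TOY_FREQ):
    """Reduce a dictionary-style definition to a one- or two-word gloss."""
    tokens = re.findall(r"[a-z]+", definition.lower())
    words = [t for t in tokens if t not in STOPWORDS]
    if not words:
        return ""

    # Heuristic 1: pick the least common content word.
    # Words missing from the table count as frequency 0, i.e. rarest.
    rarest = min(words, key=lambda w: freq.get(w, 0))

    # Heuristic 2: a definition starting with 'to' is taken to be verbal,
    # and the word right after 'to' is taken to be the verb.
    if len(tokens) >= 2 and tokens[0] == "to":
        verb = tokens[1]
        if rarest != verb:
            # Assume the rare word is the verb's object; keep the verb too.
            return verb + " " + rarest
    return rarest


print(gloss("act or process of shrinking, esp in wood; shrinkage."))
print(gloss("bother; disturbance or interruption."))
print(gloss("to carry a canoe"))
```

With the toy table above, the three example definitions come out as
'shrinkage', 'interruption' and 'carry canoe'. The second result depends
entirely on the invented counts; any of the three content words would have
been acceptable.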
I suppose this may be related to text summarization work.
Mike Maxwell
LDC
maxwell@ldc.upenn.edu