Re: Corpora: Ergo's Parsing Contest

Ted E. Dunning (ted@aptex.com)
Mon, 22 Feb 1999 13:41:20 -0800 (PST)

Mr. Bralich,

The early MUC competitions (MUC-2, which dealt with telegraphic
situation reports, in particular) dealt with language which is
far from the newspaper-like text which later MUCs have analyzed.

The primary motives for this shift have been two-fold:

a) first and foremost, the people with the money thought that the
automatic analysis of newswire and similar text was of great
importance.

b) also quite important, newswire is readily available and is
unclassified. Both of these properties are quite important.

As far as control applications are concerned, the ATIS experiments and
related efforts were quite informative and were well designed.
Notable characteristics of the original ATIS efforts which are lacking
from your (Mr. Bralich's) "contest" include:

a) the ATIS task was designed in consultation with the researchers and
the ultimate consumers of the technology.

b) the data and tasks were based on an analysis of a real-world
problem rather than invented to suit a pre-existing set of software.

c) the evaluation was objective rather than conducted by a small,
self-appointed, obviously biased and highly bombastic group.

Finally, lack of interest in your "contest" does not imply lack of
confidence on the part of members of the NLP community. Most of the
researchers you are trying to taunt do not plan their research efforts
to try to impress you. My guess, based on knowing a number of them, is
that most of them have long since outgrown this sort of double-dare,
too-scared-to-try rhetoric, if not by the end of third
grade, then at least by the end of their post-graduate studies.

The fact is, the deafening lack of response that you are noting is an
indication that nobody feels compelled to provide you with free
consulting efforts and education in return for uninformed slander.

pb> The only problem with the contests you mention is that they
pb> are based on searching huge corpii of unrestricted text
pb> which of coures is a valid area of research but which
pb> does nothing for the more immediate problems of improving
pb> navigation and control devices, q&a dialoging and so
pb> forth. The existence of these contests also does not
pb> explain why you are incapable of handling the far simpler
pb> sentences that are in the Ergo Contest. The MUC and
pb> TREC contests require huge dictionaries to work with
pb> language such as found in the New York Times and Wall
pb> Street Journal but does not contribute a bit to the
pb> other areas of NLP that are quite primitive yet need
pb> further research.

...

pb> And if you actually had the tools you are talking about you
pb> could and would put me in my place by displaying their
pb> effectiveness contrary to ours. Don't think you can fool
pb> me or anyone else with charges of unprofessionalism.

pb> ... show me once and for all the error of my ways.