1. articles directly addressing the problem of _Machine Translation_ of
prepositional phrases - in particular, previously proposed solutions and/or
implementations/evaluations of those solutions.
2. articles relating to evaluation of the current state of the art in PP
translation. I would appreciate both a) pointers to articles and
b) suggestions for handling the possibility that there are no
numeric (hard, factual) metrics demonstrating that PP translation
is an area of MT that needs improvement.
I would like to thank all who responded:
Francis Bond bond@cslab.kecl.ntt.co.jp
Ted Dunning tdunning@aptex.com
David Farwell david@crl.nmsu.edu
Stephen Helmreich shelmrei@crl.nmsu.edu
John Hutchins J.Hutchins@uea.ac.uk
Marion Kee Marion_Kee@gotham.mt.cs.cmu.edu
Marc Picard picard@vax2.concordia.ca
Irina Reyero-Sans Irina.Reyero@bcn.gms.es
Deborah D K Ruuskanen druushan@cc.helsinki.fi
Ralf Steinberger ralf.steinberger@jrc.it
Arturo Trujillo ARTURO@fs1.ccl.umist.ac.uk
Their responses are incorporated into the summary below.
I. Articles addressing the Machine Translation of Prepositional Phrases
With respect to the first request above, as John Hutchins
notes, "there are few treatments specifically devoted to
prepositional phrases in MT. However, this problem area will
have been treated in the context of valency, case grammar,
and general syntactic and semantic analysis -- particularly for
languages such as Japanese, German and Russian." I have
certainly found this to be the case. Nonetheless, he and a number
of others did uncover references pertaining specifically
to the MT of PPs. These are listed below. In particular, two
Ph.D. theses have recently been completed that deal with
prepositions in machine translation:
Reyero-Sans (1998) and Trujillo (1995).
In addition to specific articles, I received a couple of
pointers to projects/departments:
There is work at the University of Helsinki on translating
Finnish case endings into English prepositions.
I was referred to the "Catalyst Project" at CMU, where detailed
empirical work was done on the semantic categorization of English
PPs, starting from the discussion of PP usage in Quirk et al. and
then arranging the resulting categories into a hierarchy. Similar
analyses were done for the target languages.
II. Evaluation of the current state of the art in PP translation
Unfortunately, with respect to the second request, the jury seems
still to be out. This is not surprising, given the lack of
agreement surrounding MT evaluation methodologies. Several people
did cite internal evaluations of systems, often stating that
performance on PPs was not broken out specifically. One respondent
gave a personal impression that the system correctly translates
75-80% of PPs, based on human inspection of the output.
As a way of handling the fact that there are no 'benchmark' numbers
in the literature for the translation of PPs, it was suggested
that Systran be used as the gold standard - comparing the rate of
correct translations of PPs by Systran to that of whatever
approach(es) I suggest in my work, in order to determine an upper
bound on the improvement expected based on my methods. It was
further suggested that I could perform a direct comparison of my
methods to available MT systems, using a human translation as the
standard. I am interested to hear any reactions (positive or
negative) to this methodology.
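
To make the proposed comparison concrete, here is a minimal sketch in
Python of how the per-system rate of correct PP translations and the
gain over a baseline system might be computed. Everything in it is an
assumption made for illustration: the system names, the function
names, and the toy judgments (one True/False per PP, as a human judge
comparing the output against a human translation might record them).
None of it is data from any actual evaluation.

```python
def pp_accuracy(judgments):
    """Return the fraction of PPs judged correctly translated.

    `judgments` holds one boolean per PP in the test set: True if a
    human judge accepted the translation, False otherwise.
    """
    if not judgments:
        raise ValueError("no PP judgments supplied")
    return sum(judgments) / len(judgments)


def gains_over_baseline(judgments_by_system, baseline):
    """Return each system's accuracy minus the baseline's accuracy."""
    base = pp_accuracy(judgments_by_system[baseline])
    return {
        name: pp_accuracy(judgments) - base
        for name, judgments in judgments_by_system.items()
        if name != baseline
    }


# Hypothetical toy judgments over the same four PPs (not real results).
judgments_by_system = {
    "baseline_mt": [True, True, False, True],
    "proposed": [True, True, True, True],
}

accuracies = {
    name: pp_accuracy(judgments)
    for name, judgments in judgments_by_system.items()
}
gains = gains_over_baseline(judgments_by_system, "baseline_mt")

for name, accuracy in accuracies.items():
    print(f"{name}: {accuracy:.0%} of PPs judged correct")
for name, gain in gains.items():
    print(f"{name}: {gain:+.0%} relative to baseline_mt")
```

A real comparison would need the same set of PPs judged for every
system, and enough of them for the difference to be meaningful.
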
Thanks again to all who responded.
----- Keith J. Miller
(references included below sig.)
