Corpora: US company claims patent

Nenad Koncar (N.Koncar@tranexp.com)
Thu, 16 Oct 1997 21:38:36 +0100

It has been interesting to read about the US company’s patent claim and
about the different opinions and suggestions on the matter.

If I may, I would like to add to the discussion an additional
point of view.

Any Interlingua (or artificial language) approach to MT can work,
but only to a level of accuracy that is far from what people expect
from a good MT system. The reason is that it is close to impossible
to build an Interlingua that preserves the finest nuances of any
source text in any language. Those nuances can be lost both in the
translation from the source language into the Interlingua and in the
translation from the Interlingua into the target language. Any
additional level of transformation inevitably increases the
probability of losing valuable information, and that loss reduces
the quality of the translated text.

On the other hand, any additional processing used to extract the
finest nuances of meaning from the source text, so that they can be
kept in the Interlingua or artificial language, leads down the path
of Knowledge Based MT. That path was thoroughly researched at
Carnegie Mellon University in the USA [Goodman & Nirenberg 1991]
[Nirenberg et al 1992], and their conclusions were that it is all
very nice but at the end of the day does not produce a viable MT
system. I say it inevitably leads down the path of Knowledge Based MT
because you need some kind of knowledge base (or Interlingua or
artificial language) in which to store all the information that is
extracted from any source text in any language. This universal
language must by definition:

a) be expressive enough to encompass ALL that ALL known languages
express, and

b) allow any sentence from any language to be transformed to/from
this universal language without any loss of information, for it to
be useful.

Neither a) nor b) is truly satisfied by anything I have seen so far
from the research done in this area, and therefore the approach cannot
produce good quality translations on any text in any language (it might
be fine for some language pairs on some texts - beware of prepared texts!).

What I am saying here is obvious to multilingual speakers but might
look like an attempt to overcomplicate things to a monolingual or
even a bilingual speaker. Natural languages are simply so rich,
especially when you look at languages that come from completely
different roots (e.g. Japanese and Russian), that it is a more than
formidable task to formulate an Interlingua or artificial language
that captures ALL the finest nuances of ALL languages. One can easily
put together a toy Interlingua system, but not one that really works
on any text in any language.

It is a nice idea that would save huge amounts of effort for those who
wish to have MT systems that translate from any language to any other
language with the minimum of time and effort invested in their creation.
If there is a research group or company that has succeeded in creating a
viable Interlingua or artificial language based MT system then I would
like to congratulate them and I would like to see it with my own two eyes
working on any text in any language.
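
To put a rough number on the effort that the idea promises to save,
here is a minimal sketch in Python. It assumes (my assumption, not a
claim about any particular system) that a transfer approach needs one
module per translation direction per language pair, while an
Interlingua approach needs one analyser and one generator per
language. The function names are made up for illustration.

```python
def transfer_modules(n: int) -> int:
    """Modules for n languages if every ordered pair needs its own transfer module."""
    return n * (n - 1)


def interlingua_modules(n: int) -> int:
    """Modules for n languages with one analyser and one generator per language."""
    return 2 * n


if __name__ == "__main__":
    # Print the two counts side by side for a few language-set sizes.
    for n in (2, 3, 5, 10, 20):
        print(f"{n:>2} languages: transfer={transfer_modules(n):>3}  "
              f"interlingua={interlingua_modules(n):>2}")
```

Under these assumptions the two counts are equal at three languages
and the gap grows quadratically after that, which is the attraction
of the idea. The count says nothing about whether each module can be
built to the required quality, which is the point at issue here.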

In practice, shallow analysis and transfer MT systems such as SYSTRAN
have captured the business world simply because they can be fine-tuned
through years of use to produce acceptable translations with an
acceptable amount of work on a given language pair.


References
----------

[Goodman & Nirenberg 1991] K. Goodman and S. Nirenburg (eds.).
The KBMT Project: A Case Study in Knowledge-Based Machine Translation.
Morgan Kaufmann Publishers, San Mateo, California, 1991.

[Nirenberg et al 1992] S. Nirenburg, J. Carbonell, M. Tomita, K. Goodman.
Machine Translation: A Knowledge-Based Approach. Morgan Kaufmann Publishers,
San Mateo, California, 1992.

Nenad Koncar

-------------------------------------------------------------
Dr. Nenad Koncar Translation Experts Ltd.
N.Koncar@tranexp.com http://www.tranexp.com