Re: [Corpora-List] QM analogy and grammatical incompleteness

From: Rob Freeman (lists@chaoticlanguage.com)
Date: Tue Dec 20 2005 - 04:31:16 MET

    On Tuesday 20 December 2005 15:13, Dominic Widdows wrote:
    > > Let us have a clear statement of their limitations, if limited they
    > > are.
    > >
    > > In short, do you believe there is a limitation on knowledge
    > > analogous to the Uncertainty Principle of QM, which applies
    > > to the simultaneous characterization of text in terms of
    > > grammatical qualities (defined distributionally)?
    >
    > Can you find two "observables" in grammar that can't in principle be
    > measured together? Two observables that are measured in such a way that
    > the measuring of one interferes directly with the measuring of the
    > other? I think that is the question you should ask if you want to find
    > a really convincing analogy, or alternatively discover that the model
    > isn't really appropriate.

    Almost any clustering of word associations into abstract grammatical
    categories will break the same usage down in more than one way.

    The point is that, in general, some of the same data is used in both
    analyses but cannot be part of both at once, so the analyses conflict. This
    will always occur in a distributional analysis, except where the
    distribution is fundamentally parameterized by rules.

    An example from my own experience is:

    He-PRON came-V only-DET yesterday-N
    (lined up with "He came just yesterday"?)

    He-PRON came-V only-ADV yesterday-N
    (lined up with "He came early yesterday"?)

    This is an example of an actual tagging dispute between myself and a colleague
    some years ago. Others on the list may have intuitions one way or another.
    That does not matter. The point is we have conflicting intuitions.

    In truth I think "only" in this sentence contributes both to our intuitions
    of what makes an adverb and to our intuitions of what makes a determiner
    (and to other analyses besides). The two analyses represent two alternative
    ways of ordering word combinations in the language. Neither order is more
    true, but you can't have both orders at once.
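
    To make the conflict concrete, here is a minimal sketch in Python. The
    frame sets are invented toy data, not corpus counts, and the names
    `frames`, `shared` and `jaccard` are hypothetical. It only illustrates how
    one observed usage can be evidence for two different groupings at once.

```python
# Toy data (an assumption for illustration, not real corpus counts):
# each word maps to the set of (left, right) frames it was seen in.
frames = {
    "only":  {("came", "yesterday"), ("for", "one"), ("arrived", "today")},
    "just":  {("came", "yesterday"), ("for", "one")},
    "early": {("came", "yesterday"), ("arrived", "today")},
}

def shared(w1, w2):
    """Frames attested for both words."""
    return frames[w1] & frames[w2]

def jaccard(w1, w2):
    """Overlap of two words' frame sets, between 0 and 1."""
    return len(shared(w1, w2)) / len(frames[w1] | frames[w2])

# In this toy data "only" is equally similar to "just" and to "early" ...
sim_just = jaccard("only", "just")
sim_early = jaccard("only", "early")

# ... and this frame supports both groupings at the same time.
contested = shared("only", "just") & shared("only", "early")

# Evidence left unexplained if "only" is forced into a single class.
lost_if_with_just = shared("only", "early") - shared("only", "just")
lost_if_with_early = shared("only", "just") - shared("only", "early")

print(sim_just, sim_early)
print(contested)
print(lost_if_with_just, lost_if_with_early)
```

    Whichever class a hard partition picks for "only", the frame it shares
    with the other class goes unexplained, while "came _ yesterday" counts
    toward both.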

    > The Uncertainty Principle goes way further than just stating that you
    > can't know everything at once - it makes very precise statements about
    > what you can't know, based on what you've already measured. It thus
    > appears to be a very special kind of "knowledge incompleteness"
    > argument, and I don't know if it has linguistic counterparts.

    I think almost any distributional analysis will have this property, unless
    the underlying distribution is produced by rules. If language were produced
    by rules, I accept there would be no overlap, and thus no conflict, between
    the categories. But we do see conflict.

    The failure of linguistics over the last 50 years to find objective rules
    describing language is something we could take as evidence of this. People
    have even measured it. As I said, Ken Church claimed as many as 3% of tagging
    decisions were disputed, even after negotiation between the taggers, in a
    study he published in 1992.
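
    A dispute rate of that kind can be counted along these lines. This is only
    a sketch: the two tag sequences are the ones from the "only" example
    earlier in this message, `disagreement_rate` is a hypothetical helper, and
    it is not a description of how Church's figure was computed.

```python
def disagreement_rate(tags_a, tags_b):
    """Fraction of positions where two taggers chose different tags."""
    if len(tags_a) != len(tags_b):
        raise ValueError("tag sequences must align token by token")
    disputed = sum(a != b for a, b in zip(tags_a, tags_b))
    return disputed / len(tags_a)

# "He came only yesterday", as tagged by two annotators.
tagger_1 = ["PRON", "V", "DET", "N"]
tagger_2 = ["PRON", "V", "ADV", "N"]

rate = disagreement_rate(tagger_1, tagger_2)
print(rate)
```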

    Finally, I think we have even more venerable evidence. You may have seen the
    thread a couple of months back where I raised Chomsky's observations from
    back in the '50s that grammar could not be learned distributionally.
    Chomsky's observations were of just the kind we seek. Bizarrely, the
    conclusion he drew from them was that _distributional methods_ were wrong,
    rather than that the premise (that the distributions were parameterized by
    rules) was wrong. Even more bizarrely, linguistics went for it, and threw
    out distributional methods, and corpora, for 20 years. But the experimental
    results (indeterminacy in distributionally learned grammar) can be
    interpreted either way.

    -Rob


