Re: [Corpora-List] Incidence of MWEs

From: Rob Freeman (lists@chaoticlanguage.com)
Date: Fri Mar 17 2006 - 06:39:39 MET

    On Thursday 16 March 2006 23:03, David Brooks wrote:
    > ...
    > My interest being in syntax, I'm interested in the implications of MWE
    > for evaluating parsers. That is to say, if you get something like "light
    > pen" in a corpus, it may be tagged as an N-bar, with either a compound
    > <N N> or an <Adj N>, but in principle the *syntax* will remain the same
    > (tag differences aside).
    >
    > I would imagine this is not the case for "of course", which doesn't
    > strike me as a natural prepositional-phrase; likewise "kick the bucket"
    > is /syntactically/ a transitive verb-phrase, but, and here is the core
    > of my original (underspecified) question, would it be tagged as a
    > transitive verb-phrase, or would it be tagged as an MWE - perhaps an
    > intransitive verb-like MWE?

    No disrespect intended, David; you are not saying anything different to the
    other posts in this thread. It is just that your post presents the common
    misconception most clearly.

    As the old maxim goes, answers are easy; the difficult part is to find the
    right questions.

    How do we tag MWE's? Surely the question is whether tags are sensible
    parameters of language in the first place.

    Yet again tags are causing us problems. Why are we so married to tags?

    When will we see that the real answer is that the idea of tags for language
    just does not fit? Tags require us to imagine there are two kinds of
    language, regular (parametrized by tags) and irregular (enumerated in the
    lexicon). And then we have a third kind of language, MWE's, mysterious
    because it falls between the two.

    So we posit two distinct models, and then agonize over the mystery that real
    language instead displays properties which properly belong to neither.

    And the problem with that, we insist, is not that real language (MWE's) has
    properties of neither model, but rather that we can't extend our models to
    fit real language. As if the goal of linguistics is to fit language to
    existing models rather than to find models which explain language.

    Why bother with two models which don't work, and a separate (unknown) model of
    MWE's which fits neither?

    It is much simpler to imagine there is one kind of language, MWE's. Forget
    tags, explain MWE's and you no longer have a problem.

    And MWE's are easy to explain. You can model them as generalizations over
    usage. More frequent generalizations look like lexicon, less frequent
    generalizations look like syntax.
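
As a rough illustration of "generalizations over usage", and not anything proposed in this thread, one could simply count every word sequence in a corpus and rank the sequences by frequency. The function names, the toy corpus, and the choice of plain n-gram counts as the "generalization" are all assumptions made for this sketch; a real model would need a much richer notion of generalization than contiguous strings.

```python
from collections import Counter


def ngram_counts(tokens, max_n=3):
    """Count every contiguous word sequence of length 1..max_n."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def rank_sequences(tokens, max_n=3, min_len=2):
    """Return multi-word sequences sorted by descending frequency.

    Ties are broken alphabetically so the ordering is deterministic.
    """
    counts = ngram_counts(tokens, max_n)
    multi = [(seq, c) for seq, c in counts.items() if len(seq) >= min_len]
    return sorted(multi, key=lambda item: (-item[1], item[0]))


# Hypothetical toy corpus, invented purely for illustration.
toy_corpus = (
    "of course he kicked the bucket and of course she kicked the ball"
).split()

ranked = rank_sequences(toy_corpus)

# Sequences that recur sit at the top of the ranking (lexicon-like);
# sequences seen only once sit in the long tail (syntax-like).
for seq, count in ranked[:4]:
    print(" ".join(seq), count)
```

On this view nothing marks where "lexicon" ends and "syntax" begins: there is only a single ranking, and any cut-off between the two is a choice made by whoever reads it.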

    Instead of worrying where MWE's start and stop, let's accept that MWE's cover
    all of language. All language is an MWE. Explain MWE's in terms of
    generalizations over usage and let's start thinking about how we can use
    these generalizations over usage, rather than worrying about defining where
    MWE's stop or start, and how they should be tagged.

    Or we can continue to debate the "problem" of MWE's.

    -Rob


