Re: [Corpora-List] Incidence of MWEs

From: Mike Maxwell (maxwell@ldc.upenn.edu)
Date: Sun Mar 19 2006 - 16:55:01 MET


    Rob Freeman wrote:
    > Surely the question is are tags sensible parameters of language in
    > the first place.

    I am not sure what you mean by "parameters". I believe the original
    idea about tags, which you will find in textbooks, is that tags
    allow us to make approximations to real language, approximations that
    are useful in certain kinds of computation. If we had a complete
    understanding of the mental processing underlying language (and
    arguably, pragmatics and everything else), and maybe much more computing
    power, we wouldn't need tags. (But I think we would need syntax and
    morphology and a host of other things that linguists have traditionally
    studied.) I don't believe most researchers would consider tags to be a
    theoretical construct--they're an engineering construct.

    Having said that, tags frequently bear an obvious relationship to parts
    of speech (aka categories, such as noun, verb...) and morphosyntactic
    features (past tense, plural subject...). And these are
    linguistic/scientific notions (although one may of course argue how well
    motivated any particular one is). They allow us to draw generalizations.

    > Instead of worrying where MWE's start and stop, let's accept that
    > MWE's cover all of language. All language is an MWE.

    Except for this, which isn't an MWE. And except for your posting, which
    isn't an MWE either (at least not one that I've ever seen before).

    > Explain MWE's in terms of generalizations over usage and let's start
    > thinking about how we can use these generalizations over usage

    Uh, let's see. Here's a generalization over usage: the MWE "kick the
    bucket" has a distribution much like the MWE "fire off a shot", which
    has a distribution much like the MWE "pick up the pace", etc. Let's
    make up a label for these MWEs that obey this generalization--I dunno,
    maybe "VP".

    Then we notice the generalization that there are lots of variants of
    each of these MWEs where the first word has an 's' (or 'es') on the end,
    or an 'ing' on the end, or a 'd' (or 'ed'). Let's call that word a "V",
    and the things that go on the end "verbal suffixes". And we may also
    notice the generalization that a 'V' can be immediately followed by a
    'verbal suffix'.

    Oh, but those Vs we noticed also take part in other MWEs--and for that
    matter, in things that don't look particularly like MWEs, in the sense
    that there's not much repeated at the word level, only at the category
    level. So we'll call all those VPs, too.
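
    The suffix generalization above can be sketched in a few lines of
    Python. This is only an illustration under stated assumptions: the
    suffix list is the one named in this message, the stem list and the
    function names (split_verb, group_variants) are made up for the
    example, and real morphology (e.g. "firing", with its dropped 'e')
    is deliberately ignored.

```python
from collections import defaultdict

# Suffixes named in the message: 's' (or 'es'), 'ing', 'd' (or 'ed').
# Longer suffixes are tried first; a match only counts if the remainder
# is a known stem, so "fired" is analysed as "fire" + "d".
SUFFIXES = ("ing", "es", "ed", "s", "d")


def split_verb(word, stems):
    """Return (stem, suffix) if word is a known stem plus an optional
    verbal suffix, otherwise None. `stems` is a hypothetical lexicon."""
    if word in stems:
        return word, ""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and word[: -len(suffix)] in stems:
            return word[: -len(suffix)], suffix
    return None


def group_variants(phrases, stems):
    """Group phrases that differ only in the suffix on their first word.

    Maps each base form ("kick the bucket") to the set of suffixes seen
    on its first word; phrases whose first word is not a known stem
    (with or without a suffix) are skipped.
    """
    groups = defaultdict(set)
    for phrase in phrases:
        first, _, rest = phrase.partition(" ")
        parsed = split_verb(first, stems)
        if parsed is not None:
            stem, suffix = parsed
            groups[f"{stem} {rest}"].add(suffix)
    return dict(groups)


if __name__ == "__main__":
    stems = {"kick", "fire", "pick"}
    observed = [
        "kick the bucket", "kicks the bucket", "kicked the bucket",
        "kicking the bucket", "fired off a shot", "picks up the pace",
    ]
    for base, suffixes in group_variants(observed, stems).items():
        print(base, sorted(suffixes))
```

    Note that the moment you write SUFFIXES and a stem list, you have in
    effect introduced the categories "verbal suffix" and "V", which is
    the point of the paragraphs above.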

    I am reminded of a story I saw more years ago than I care to say, in a
    caving newsletter (this was before the days of blogs, which gives you an
    idea of how ancient it was). The idea was that there was a danger in
    using a single rope (this was for pit caves, where you have a free
    descent): the rope might rub across a rock, and fray. So it would be
    safer to use two ropes, so that if one broke while you were halfway up
    (or down), you'd still have the other. But if two ropes are twice as
    safe as one, three ropes would add another 50% safety margin. And so forth.

    But of course the chance of multiple ropes failing at the same time is
    very small, and a large number of ropes is heavy. So you could reduce
    the diameter (and therefore the weight) of each rope, while still
    maintaining an adequate safety margin.

    But then, all those cords get difficult to manage--they tangle. Ah, but
    you could overcome that problem by braiding the cords together!

    As you've probably guessed, there is a moral to this, which happens to
    be an MWE: what the left hand takes away, the right hand gives back.
    (But I find the non-MWE version far more entertaining.) Anyway, I
    suspect that an MWE version of language will end up looking an awful lot
    like some existing theories of syntax and morphology--HPSG, maybe.

        Mike Maxwell


