Re: [Corpora-List] Incidence of MWEs

From: Will Fitzgerald (will.fitzgerald@gmail.com)
Date: Wed Mar 15 2006 - 17:40:53 MET


    The thing is, the various meanings of 'pencil sharpener', 'crayon
    sharpener' and 'stick sharpener' are all predictable, just not from
    their immediate lexical items. I think that either a 'tool for Verbing
    Noun' or a 'tool for Verbing, shaped like a Noun' reading will apply in
    Noun Verb-er expressions. Certainly, because there is a greater need for
    pencil sharpeners, pencil sharpeners tend to have standard shapes &
    components, but a pencil sharpener that worked via laser beams would
    still be a pencil sharpener. And imagine a tool for sharpening knives
    that had a graphite core; in the proper context, 'pencil sharpener'
    (or maybe even 'pencil knife sharpener') is ok.

    The point is that general real-world knowledge, plus rules of phrasal
    combination, create predictable meanings for some expressions that are
    not predictable based on the lexical meanings.
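
    As a rough illustration of the 'rules of phrasal combination' half of
    this, the two readings mentioned above can be generated mechanically.
    This is only a sketch in Python: the function name candidate_glosses
    and the naive '-er' to '-ing' morphology are assumptions made for
    illustration, not an established algorithm.

```python
def candidate_glosses(compound: str) -> list[str]:
    """Return the two template readings for a 'Noun Verb-er' compound."""
    noun, agentive = compound.split()
    if not agentive.endswith("er"):
        raise ValueError(f"expected a Verb-er head, got {agentive!r}")
    # Naive morphology (an assumption): 'sharpener' -> 'sharpen' -> 'sharpening'.
    gerund = agentive[:-2] + "ing"
    return [
        f"tool for {gerund} {noun}s",                # 'tool for Verbing Noun'
        f"tool for {gerund}, shaped like a {noun}",  # 'tool for Verbing, shaped like a Noun'
    ]


for gloss in candidate_glosses("stick sharpener"):
    print(gloss)
```

    The templates only enumerate the candidate readings; choosing between
    them for a given compound is where the real-world knowledge comes in.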

    Oh, by the way, here is a 'pencil pencil sharpener':
    <http://www.shop-eds.com/ProductDetail.aspx?prntdid=1810&did=1828&pid=23623>

    On 3/15/06, Amsler, Robert <Robert.Amsler@hq.doe.gov> wrote:
    > I have found published dictionaries' judgments as to what constitutes an MWE
    > to be both dated and biased against declaring MWEs to exist. Until I
    > actually went through a number of texts to extract MWEs by hand and
    > compared the MWEs I found against those listed in dictionaries, I used
    > to think that the lexicographic coverage was adequate and that the rule
    > "if you can predict its meaning from its constituent parts, it
    > doesn't need a separate entry" was correct. What I found was that not
    > only didn't the rule seem to be applied consistently, but that MWEs
    > appeared to be a much neglected area of lexicography with many more
    > undocumented MWEs being used in text than were in the dictionaries. It
    > was as though dictionaries reviewed their MWE entries far less often and
    > less diligently than they did their isolated word entries.
    >
    > There are probably good reasons against dictionary publishers declaring
    > MWEs to exist. Namely, MWEs greatly increase the size of a dictionary
    > for a small gain in clarity, perhaps only useful to Speakers of English
    > as a Foreign Language (and practitioners of computational linguistics,
    > information retrieval and artificial intelligence). The "prediction"
    > rule used to discount MWEs needing entries seems to beg the question of
    > what algorithm can predict these and what does that algorithm predict.
    > There is a big difference between believing you are excluding MWEs
    > because they are understandable without definitions and having an
    > algorithm that can generate the definition you would have written from
    > the separate dictionary entries for the component words.
    >
    > Take an MWE such as "pencil sharpener". Most dictionaries don't define
    > this since according to the prediction rule, it could be assumed to be
    > just "a sharpener for pencils". However, that denies the fact that we
    > all know pencil sharpeners are a specific category of manufactured
    > product and if you look for a photo of a pencil sharpener it will have
    > one of several distinct models. We also know details about how pencil
    > sharpeners work. In contrast, things like a "stick sharpener" or a
    > "crayon sharpener" are novel creations without long-standing precedent
    > (I just checked the web, and, sigh, they both exist, but a "stick
    > sharpener" isn't a tool for sharpening sticks, it is a knife sharpener
    > whose shape resembles a stick, i.e., a thin cylindrical file.)
    >
    > A pencil sharpener would be defined as something like "an electrical, mechanical or
    > manual device with sharpened blades into which pencils can be inserted
    > and which when operated creates a tapered conical pointed tip on the
    > pencil which initializes or renews its ability to be used as a writing
    > implement".
    >
    > Here is where I would say computational linguistics has to take its
    > leave of lexicography (or at least published lexicographic practice) and
    > declare "pencil sharpener" to be a useful and necessary MWE. I would
    > even go so far as to say that every MWE for which an explicit definition
    > can be written, should have an explicit definition and that ONLY when
    > the explicit definitions show no differentiation should they be
    > eliminated in favor of entries for the separate word elements. That is,
    > REVERSE the "prediction" rule to assume you cannot predict the meaning
    > of an MWE until you fail to find anything to say in its definition that
    > is not formulaic.
    >
    > I don't believe published dictionaries contain sufficient information to
    > correctly understand the MWEs they fail to explicitly list. I don't
    > believe published dictionaries actually think about MWEs consistently or
    > conscientiously.

    --
    Will Fitzgerald
    weblog: <http://www.entish.org/willwhim>
    


