Re: [Corpora-List] Using MTurk for markup tasks (was Cost of part of speech tagging)

From: Alexandre Rafalovitch (arafalov@gmail.com)
Date: Tue Dec 26 2006 - 22:51:21 MET


    On 12/26/06, Mike Maxwell <maxwell@umiacs.umd.edu> wrote:
    > Alexandre Rafalovitch wrote:
    > > An interesting approach would be to use Amazon Mechanical Turk for
    > > this kinds of tasks.
    > > ...
    > > Has anybody else given a thought to this?
    >
    > Don't know what languages you're interested in. I have thought about
    > "wikifying" other sorts of projects (like finding and keeping track of
    > on-line computational resources, or building bilingual text collections)
    > for "low density" languages.

    Actually, wikification is a different, though also interesting, idea.
    Wikification would be about content presentation and markup, while
    MTurk would be about the workflow and process of actually marking up
    the text. I think using a generic wiki for POS marking may not be very
    efficient. A specialised programme that lets annotators do it quickly
    would be more effective. Such programmes exist as standalone
    applications, but not as an online interface and certainly not as an
    MTurk interface yet (AFAIK). Obviously, if workflow and presentation could be combined
    into one interface, the benefits would compound.

    > My guess is that "wikification" (including the Amazon Mechanical Turk
    > under this) will work best for languages where there are a substantial
    > number of speakers with idle time, sufficient income to afford the
    > computer and network connection, and sufficient education for the
    > specific annotation task.

    My proposed target is students and research assistants in the fields
    of Linguistics and Computational Linguistics. They (should) have the
    training, access to a network connection (through their
    universities) and a need to earn income in a time-flexible fashion
    around their other duties/studies. Languages are obviously an issue,
    but with the already distributed nature of MTurk, it might be possible
    to reach the language speakers where they are rather than where you
    would need them to be with a centralised architecture.

    Regards,
      Alex.


