Re: [Corpora-List] Google Books, copyrights, and corpora

From: Mark P. Line (mark@polymathix.com)
Date: Thu Jun 15 2006 - 17:48:04 MET DST

    Doug Cooper wrote:
    >
    > The parallel to Napster is also hard to see. Taking a work apart,
    > then providing an automatic process to put it back together again,
    > clearly tries to make an end run around the law. But quite simple
    > limitations on corpus sample-serving (e.g. not allowing samples to
    > run over paragraph boundaries, and/or not identifying samples with
    > their specific sources) would make it impossible for any number of
    > 14-year-old Python scripters to reconstitute the original texts.

    Yes. One of my points was exactly that. If limitations are designed into
    the online service, then there might not be any exposure. If such
    limitations are lacking, then not only the provider but even individual
    users of the service might find themselves in deep kimchee.
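
    A minimal sketch of the two limitations mentioned in the quoted text
    (samples never cross a paragraph boundary, and samples are not
    identified with their sources). The function names, the `corpus`
    mapping, and the `window` parameter are hypothetical, chosen only for
    illustration; this is not a description of any existing service.

```python
import re


def split_paragraphs(text):
    """Split raw text into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def serve_samples(corpus, query, window=40):
    """Return concordance-style samples for `query`.

    `corpus` is an assumed mapping of source identifier -> full text.
    Each sample is sliced from within a single paragraph, so it cannot
    run over a paragraph boundary, and the source identifier is
    deliberately left out of the result.
    """
    samples = []
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    for _source_id, text in corpus.items():
        for paragraph in split_paragraphs(text):
            for match in pattern.finditer(paragraph):
                # Clip the context window to the enclosing paragraph.
                start = max(0, match.start() - window)
                end = min(len(paragraph), match.end() + window)
                samples.append(paragraph[start:end])
    return samples
```

    Calling `serve_samples(corpus, "cat")` on such a mapping returns only
    the bare text fragments. Whether limits of this kind are enough to
    avoid exposure is the legal question under discussion, which the
    sketch does not settle.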

    > Bottom line, establishing that research applications of text corpora
    > are fair use is not a matter of 'snippet' defenses, and won't rise or fall
    > with Google. Rather, it's that our use and citation of text samples for
    > analytical purposes has little or nothing to do with the protection they
    > are given as creative literary works.

    Another of my points was that this can change practically overnight with a
    single appellate court decision. If you're Google, you can then go off and
    pursue other business while working towards the establishment of
    counter-precedents in other jurisdictions.

    Finally, I guess I should mention that it would _not_ be considered fair
    use to make copies of copyrighted works and give them away for free --
    commercial use is not a necessary criterion. I was trying to make the
    point that even this kind of philanthropic action can be considered
    criminal infringement fairly easily.

    IANAL. TINLA.

    -- Mark

    Mark P. Line
    Polymathix
    San Antonio, TX


