Re: [Corpora-List] Query on the use of Google for corpus research

From: Przemek Kaszubski (przemka@amu.edu.pl)
Date: Thu Jun 02 2005 - 20:10:43 MET DST


    Linda,

    You might want to try WCopyFind from
    http://plagiarism.phys.virginia.edu/Wsoftware.html
    It's free.

    Przemek

    Linda Bawcom wrote (2005-06-02 14:02):

    > Dear Nancy,
    >
    > Although I have not been following this thread too closely since I am
    > not using the web as a corpus, your reference to a tool for near
    > duplicate detection caught my eye as my small corpus (approx. 120,000
    > running words) is taken from Lexis Nexis newspaper articles. Although
    > I tried to avoid it as much as possible, a number of the articles are
    > compilations from newswire services and/or major newspapers (and who
    > knows how many others are too, though perhaps without citing them).
    >
    > While I realize that through my concordance lines I can detect some
    > duplication, due to the kind of research I'm doing, it would be
    > extremely helpful to know exactly how much has been borrowed. (For
    > that reason, I was even entertaining the thought of running them
    > through a program that is used to identify plagiarism, but
    > unfortunately such programs are rather pricey and the University of
    > Houston doesn't (to the best of my knowledge) have one, at least
    > not one available to adjuncts!)
    >
    > Truthfully, in the end, it may not be of any importance, but I would
    > like to cover the possibility. Any advice or suggestions you or anyone
    > else who has used newspaper articles for their corpus might have would
    > be very much appreciated.
    >
    > Best wishes,
    > Linda Bawcom (currently at the University of Liverpool)
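
For a rough idea of how much text is shared between articles, one option is to compare word n-grams directly. The following is a minimal sketch, not a description of how WCopyFind works internally: the function names, the 6-word window, and the 0.2 threshold are all illustrative assumptions, and `articles` is assumed to be a dict mapping a file name to its plain text.

```python
import re
from itertools import combinations


def shingles(text, n=6):
    """Return the set of word n-grams (shingles) in text, lowercased."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def containment(a, b, n=6):
    """Fraction of a's n-grams that also occur in b (0.0 if a has none)."""
    sa, sb = shingles(a, n), shingles(b, n)
    return len(sa & sb) / len(sa) if sa else 0.0


def borrowed_pairs(articles, n=6, threshold=0.2):
    """Yield (name_a, name_b, score) for article pairs sharing many n-grams.

    The score is the larger of the two containment values, so a short
    article copied wholesale into a long compilation is still flagged.
    The window size and threshold are assumptions to tune per corpus.
    """
    for (name_a, text_a), (name_b, text_b) in combinations(articles.items(), 2):
        score = max(containment(text_a, text_b, n), containment(text_b, text_a, n))
        if score >= threshold:
            yield name_a, name_b, score
```

Calling `list(borrowed_pairs(articles))` on a corpus of this size (about 120,000 running words) compares every pair of articles, which should be manageable; the flagged pairs can then be checked by hand or with a dedicated tool.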

    -- 
    Dr Przemyslaw Kaszubski
    +48 61 8293515
    http://elex.amu.edu.pl/ifa/staff/kaszubski.html
    

    PICLE LEARNER CORPUS ONLINE: http://www.staff.amu.edu.pl/~przemka/picle.html

    COMPREHENSIVE CORPORA BIBLIOGRAPHY: http://www.staff.amu.edu.pl/~przemka

    MY SEMINARS: http://www.staff.amu.edu.pl/~przemka/seminars.htm

    ACADEMIC WRITING PAGE (FULL-TIME PROGRAMME): http://www.staff.amu.edu.pl/~przemka/IFA_writing

    =======================================
    School of English (IFA)
    Adam Mickiewicz University
    http://elex.amu.edu.pl/ifa
    =======================================


