RE: [Corpora-List] Another "Search Inside" tool: Google Print...

From: Ute Römer (ute.roemer@anglistik.uni-hannover.de)
Date: Thu Jun 16 2005 - 13:40:40 MET DST


    Dear all,

    David's message (Thanks, David! I didn't know about the update) reminded me
    of a related search tool which might also be of interest to some of you
    (maybe you know about it already, but I only discovered it a few weeks ago):
    Google Print (check http://print.google.com/ and
    http://print.google.com/googleprint/about.html). The system allows you to
    search the full text of a huge number of books (apparently, they collaborate
    with publishers and libraries; they don't say how many books have been
    scanned and uploaded so far though) and gives you selected pages from those
    books which contain your search string. It's not so much a concordancing
    facility as a new way of doing (literature) research.

    A search for "corpus linguistics", for instance, retrieves 3,040 hits (with
    the Biber/Conrad/Reppen 1998 textbook topping the list):
    http://print.google.com/print?ie=UTF-8&q=%22corpus+linguistics%22&btnG=Search
    You can then follow a link and separately search within each of the
    "corpus linguistics" books. For example, you find that "register variation"
    occurs on 45 different pages in Biber/Conrad/Reppen 1998, and there are
    links that take you to the scanned image of each of the relevant pages (with
    the search item highlighted). That option is also very useful when you need
    to check the page number of a quote and don't have the book at hand. You can
    also see which library near you has this book -- and, of course, where you
    can buy it.

    Best wishes... Ute

    ********************************************

    Ute Römer
    English Department
    University of Hanover
    Königsworther Platz 1
    30167 Hannover
    Germany
     
    Phone: +49 (0)511 762 2997
    Fax: +49 (0)511 762 2996
    E-mail: ute.roemer@anglistik.uni-hannover.de
    http://www.uteroemer.de
    http://www.fbls.uni-hannover.de/angli/
     
    > -----Original Message-----
    > From: owner-corpora@lists.uib.no [mailto:owner-corpora@lists.uib.no] On
    > Behalf Of David Oakey
    > Sent: Thursday, June 16, 2005 12:06 PM
    > To: CORPORA@UIB.NO
    > Subject: [Corpora-List] Additions to amazon.com "Search Inside" feature
    >
    > Apologies if I'm reporting something that everyone already knows
    > about except me, but Amazon.com's "Inside this book" feature now
    > provides - for all books in its "Search Inside" scheme - a concordance
    > (in the sense of a frequency list rather than KWIC citations), text
    > statistics, and statistically improbable phrases (SIPs). A SIP works a
    > bit like an n-gram version of a keyword in Wordsmith Tools, with the
    > reference corpus being all the books in Amazon's "Search Inside" corpus.
    > If Amazon finds "a phrase that occurs a large number of times in a
    > particular book relative to all Search Inside books, that phrase is a
    > SIP in that book." On the shopping page for the book "Into the void with
    > Ace Frehley," (the notoriously spaced former guitarist in the rock band
    > KISS) for example, the SIP they list is "black nail polish". This is
    > impressive - and not at all improbable - if you know much about the
    > career of Ace Frehley.
    >
    > The concordance results are presented alphabetically, with more frequent
    > words shown in a larger font size. Text statistics include standard
    > readability indices (the Fog Index seems apt here) and they have a "fun
    > stats" section where they calculate words per dollar and words per ounce
    > (words per pound and words per kilo on amazon.co.uk). More information
    > on the Amazon site about the number of books in the scheme (yes, 120,000
    > books, 33 million pages etc., but that was nearly 2 years ago), their
    > subject areas, authorship details etc. would of course be useful. While
    > this is intended as a marketing feature (it "allows you to search
    > millions of pages to find exactly the book you want to buy"), I believe
    > it would be interesting to corpora list members in itself.
    >
    > Best wishes,
    >
    > David Oakey
    > ------------------------------
    > Lecturer in English Language
    > English for International Students Unit
    > University of Birmingham, UK
    > phone: + 44 121 4145703
    > email: d.j.oakey@bham.ac.uk
    > http://www.eisu.bham.ac.uk/staff/oakeydavid.htm
    > ------------------------------
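
The SIP idea in David's message (an n-gram that is frequent in one book relative to a reference corpus of all the other books) can be sketched roughly as below. This is only an illustration: the message does not say how Amazon actually scores phrases, so the add-one smoothed frequency ratio, the function names `ngrams` and `improbable_phrases`, and the toy texts are all assumptions. A keyword tool would more typically use log-likelihood or chi-square.

```python
import re
from collections import Counter


def ngrams(text, n):
    """Return the list of lowercase word n-grams found in text."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def improbable_phrases(book, reference_books, n=3, top=5):
    """Rank the n-grams of `book` by how over-represented they are
    relative to the pooled reference texts.

    The score is an add-one smoothed ratio of relative frequencies.
    This is an assumed scoring, not Amazon's documented method.
    """
    book_counts = Counter(ngrams(book, n))
    ref_counts = Counter()
    for text in reference_books:
        ref_counts.update(ngrams(text, n))

    book_total = sum(book_counts.values()) or 1
    ref_total = sum(ref_counts.values()) or 1

    scores = {}
    for phrase, count in book_counts.items():
        book_rate = count / book_total
        # Add-one smoothing so phrases unseen in the reference get a finite score.
        ref_rate = (ref_counts[phrase] + 1) / (ref_total + 1)
        scores[phrase] = book_rate / ref_rate

    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda p: (-scores[p], p))[:top]


# Toy data, invented purely for illustration.
book = "black nail polish and more black nail polish on the stage"
reference = ["the band played on the stage", "and more songs on the stage"]
top_phrase = improbable_phrases(book, reference, n=3, top=1)
```

With these toy texts, a phrase that recurs in the book but never in the reference corpus rises to the top, while a phrase common to all texts ("on the stage") sinks to the bottom.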

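The other statistics David mentions (an alphabetical frequency list, a readability index, words per dollar) are simple to approximate. The sketch below uses the standard Gunning Fog formula, 0.4 * (words per sentence + 100 * share of words with three or more syllables). The vowel-group syllable counter is a crude heuristic, and the function names and sample text are assumptions made for illustration, not anything taken from Amazon.

```python
import re
from collections import Counter


def words_of(text):
    """Split text into word tokens (letters and apostrophes only)."""
    return re.findall(r"[A-Za-z']+", text)


def frequency_list(text):
    """Alphabetical (word, count) pairs: a 'concordance' in the
    frequency-list sense rather than KWIC citations."""
    return sorted(Counter(w.lower() for w in words_of(text)).items())


def syllables(word):
    """Rough syllable estimate: the number of vowel groups (heuristic)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def fog_index(text):
    """Gunning Fog index, using the heuristic syllable counter above."""
    words = words_of(text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    complex_words = [w for w in words if syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences)
                  + 100 * len(complex_words) / len(words))


def words_per_dollar(text, price_usd):
    """The 'fun stat': number of words divided by the price in dollars."""
    return len(words_of(text)) / price_usd


# Toy sample, invented for illustration.
sample = "The cat sat. The dog ran."
freqs = frequency_list(sample)
fog = fog_index(sample)
wpd = words_per_dollar(sample, 3.0)
```

A display like the one David describes could then scale each word's font size by its count in `freqs`.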

