Re: [Corpora-List] Re: problems with Google

From: Pascal Soucy (pascal.soucy.1@ulaval.ca)
Date: Thu Mar 17 2005 - 16:07:37 MET


    Google does that with all stopwords. If you search for:

    what does "the" "the" mean, you'll get the same behavior. Google ignores
    stopwords (and * seems to be handled as a stopword).

    Both queries:

    what does "*" mean

    and

    what does "*" "*" mean

    return about the same list of documents. The difference between the two
    arises in the ranking process. The ranking algorithm likely uses term proximity
    to better match the query as written, and it keeps the positions of the
    stopwords in the query to do that.
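
    To make the idea concrete, here is a minimal sketch of how that could work.
    It is only a guess at the general technique, not Google's actual algorithm:
    the stopword list, the function names (parse_query, matches, gap_penalty)
    and the scoring rule are all assumptions made up for illustration.

```python
# Hypothetical stopword list; "*" is included because it seems to be
# handled like a stopword. This is an assumption, not Google's real list.
STOPWORDS = {"the", "a", "an", "of", "*"}


def parse_query(query):
    """Return (position, term) pairs for the non-stopword terms.

    Stopwords are dropped, but the surviving terms keep the positions
    they had in the original query, so the gaps are remembered.
    """
    tokens = query.lower().replace('"', " ").split()
    return [(i, t) for i, t in enumerate(tokens) if t not in STOPWORDS]


def matches(query, doc):
    """Retrieval step: the document must contain every non-stopword term."""
    doc_tokens = set(doc.lower().split())
    return all(term in doc_tokens for _, term in parse_query(query))


def gap_penalty(query, doc):
    """Ranking step: penalize documents whose term spacing differs from
    the spacing in the query. Assumes matches(query, doc) is True.
    A lower penalty means a better match.
    """
    terms = parse_query(query)
    doc_tokens = doc.lower().split()
    penalty = 0
    for (p1, t1), (p2, t2) in zip(terms, terms[1:]):
        expected = p2 - p1  # distance in the query, stopwords included
        pos1 = [i for i, tok in enumerate(doc_tokens) if tok == t1]
        pos2 = [i for i, tok in enumerate(doc_tokens) if tok == t2]
        penalty += min(abs((j - i) - expected) for i in pos1 for j in pos2)
    return penalty


docs = ["what does lol mean", "what does carpe diem mean"]
one_star = 'what does "*" mean'
two_stars = 'what does "*" "*" mean'

# Both queries retrieve the same documents...
retrieved_one = [d for d in docs if matches(one_star, d)]
retrieved_two = [d for d in docs if matches(two_stars, d)]

# ...but the remembered gaps order them differently.
ranked_one = sorted(retrieved_one, key=lambda d: gap_penalty(one_star, d))
ranked_two = sorted(retrieved_two, key=lambda d: gap_penalty(two_stars, d))
```

    Under these assumptions, both queries retrieve the same two documents, but
    the one-asterisk query ranks the document with a one-word gap first and the
    two-asterisk query ranks the one with a two-word gap first.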

    Pascal Soucy
    Coveo

    According to John Milton <lcjohn@ust.hk>, 17.03.2005:

    > I just discovered that Google seems to have retained some use of the
    > wildcard for words if you use double quotes with the asterisk. A search
    > for "what does "*" mean" and "what does "*" "*" mean" results MAINLY in
    > any one and two words respectively. If anyone else is using web searches
    > as language learning/teaching resources, this also looks promising:
    > http://www.findforward.com/
    >
    > John Milton
    > Hong Kong University of Science & Technology


