Re: [Corpora-List] Corpus Benevolence

From: Alexander Osherenko (osherenko@gmx.de)
Date: Sat Feb 10 2007 - 11:38:42 MET

    Hi Diana,

    Thank you for your comments. I was starting to think I was going mad
    with my ideas. :)

    >- on the contrary, if you want to look for the best corpus to test
    >something that you have developed and are not sure holds water in other
    >conditions, you'd better choose the most different corpus possible (from
    >your initial one)
    >
    >
    >
    I do know that the results would be suitable if I took a different
    corpus for testing, since there are always plenty of reasons to explain
    away bad results. ;-) Perhaps I can explain my idea as follows: I have
    a small corpus, which is why I want to extend it. When do I stop
    extending it? When the corpus is big enough, but what does "big enough"
    mean? In my case, opinion mining, does "big enough" correspond to the
    number of "opinionated" expressions?
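
    A rough sketch of how "big enough" could be tracked by the number of
    "opinionated" expressions. Everything here is an assumption made for
    illustration: the tiny word list OPINION_WORDS stands in for a real
    opinion lexicon, and the function name count_opinionated is
    hypothetical.

```python
import re

# Hypothetical mini-lexicon of opinion-bearing words (an assumption);
# a real study would use a proper opinion lexicon or annotated expressions.
OPINION_WORDS = {"good", "bad", "great", "awful", "love", "hate"}


def count_opinionated(texts, lexicon=OPINION_WORDS):
    """Return (total tokens, opinionated tokens, distinct opinionated words)."""
    total = 0
    hits = 0
    distinct = set()
    for text in texts:
        # Very crude tokenisation: lowercase runs of letters and apostrophes.
        for token in re.findall(r"[a-z']+", text.lower()):
            total += 1
            if token in lexicon:
                hits += 1
                distinct.add(token)
    return total, hits, len(distinct)


# Toy corpus, invented for this example.
corpus = [
    "I love this phone, the screen is great.",
    "The battery is awful and I hate the charger.",
    "It arrived on Monday.",
]

total, hits, distinct = count_opinionated(corpus)
print(f"tokens={total} opinionated={hits} distinct={distinct}")
```

    Recomputing these counts each time the corpus is extended would show
    whether new material still contributes opinionated expressions. Where
    to set the stopping threshold remains the open question.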

    >I think your "general" measure has to be a
    >specifically-related-to-opinion-mining measure...
    >
    >
    >
    That is probably the same thing I meant in my previous comment.

    >I have also written something on the subject of validating corpus-based
    >results, see
    >
    >Santos, Diana & Signe Oksefjell. "Using a Parallel Corpus to Validate
    >Independent Claims", Languages in contrast, Vol. 2(1), 1999, pp.117-132.
    >[tell me if you want me to send it to you]
    >
    >
    >
    Could you please send it to me?

    >Hope this is useful,
    >
    >
    It was very useful.

    Alexander

    >---------------
    >Diana Santos
    >www.linguateca.pt
    >Linguateca, SINTEF ICT
    >Pb 124 Blindern, N-0314 Oslo, Norway
    >
    >
    >
    >
    >>-----Original Message-----
    >>From: owner-corpora@lists.uib.no
    >>[mailto:owner-corpora@lists.uib.no] On Behalf Of Alexander Osherenko
    >>Sent: 8. februar 2007 10:00
    >>To: corpora@hd.uib.no
    >>Subject: [Corpora-List] Corpus Benevolence
    >>
    >>Hello!
    >>
    >>Are there any measures that provide a general estimate of the
    >>benevolence of a corpus? The problem is this: there are several
    >>corpora, domain-specific or not, and I want to find a general
    >>measure or general hints for choosing one over another. How can
    >>I decide which corpus to take, other than by calculating result
    >>measures, whatever they are, and comparing them for every
    >>corpus previously chosen by chance?
    >>Something like size, number of sentences, genre...
    >>
    >>Best,
    >>Alexander
    >>
    >>
    >>
    >>
    >
    >
    >


