Re: [Corpora-List] Corpus Benevolence

From: Alexander Osherenko (osherenko@gmx.de)
Date: Sat Feb 10 2007 - 11:38:42 MET

Next message: Adam Kilgarriff: "RE: [Corpora-List] Corpus Benevolence"

Previous message: Lesley Carmichael: "[Corpora-List] Punctuation standards for speech transcription?"
Next in thread: Adam Kilgarriff: "RE: [Corpora-List] Corpus Benevolence"
Reply: Adam Kilgarriff: "RE: [Corpora-List] Corpus Benevolence"

Hi Diana,

Thank you for your comments. I was beginning to think I was going mad
with my ideas. :)

>- on the contrary, if you want to look for the best corpus to test
>something that you have developed and are not sure holds water in other
>conditions, you'd better choose the most different corpus possible (from
>your initial one)
>
>
>
I do know that the results would be suitable if I took a different
corpus for testing, since there are always plenty of reasons to explain
away bad results. ;-) Perhaps I can explain my idea as follows: I have a
small corpus, which is why I want to extend it. When do I stop extending
it? When the corpus is big enough. But what does "big enough" mean? In
my case, opinion mining, does "big enough" correspond to the number of
"opinionated" expressions?
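
One way to make the "big enough" question concrete, purely as a sketch
and not a method proposed anywhere in this thread: extend the corpus in
batches and stop once new batches no longer contribute previously unseen
"opinionated" expressions. The lexicon, the toy batches, the function
names and the thresholds (window, max_new) below are all assumptions
made up for illustration.

```python
from typing import Iterable, List, Set


def new_expressions_per_batch(
    batches: Iterable[List[str]], opinion_lexicon: Set[str]
) -> List[int]:
    """For each batch, count opinionated expressions not seen in earlier batches."""
    seen: Set[str] = set()
    counts: List[int] = []
    for batch in batches:
        # Opinionated tokens in this batch (single words only, for simplicity).
        found = {tok.lower() for tok in batch if tok.lower() in opinion_lexicon}
        fresh = found - seen
        counts.append(len(fresh))
        seen |= fresh
    return counts


def is_big_enough(counts: List[int], window: int = 2, max_new: int = 0) -> bool:
    """True if each of the last `window` batches added at most `max_new` new expressions."""
    if len(counts) < window:
        return False
    return all(c <= max_new for c in counts[-window:])


# Toy data, invented for illustration only.
lexicon = {"good", "bad", "great", "awful"}
batches = [
    ["the", "film", "was", "good"],
    ["bad", "plot", "but", "great", "cast"],
    ["a", "good", "and", "great", "evening"],
    ["bad", "sound", "good", "picture"],
]
counts = new_expressions_per_batch(batches, lexicon)
print(counts, is_big_enough(counts))
```

Such a saturation check only says that the chosen lexicon has stopped
finding anything new; it says nothing about whether the corpus is
representative, and a real study would need a task-specific notion of
"opinionated expression".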

>I think your "general" measure has to be a
>specifically-related-to-opinion-mining measure...
>
>
>
That is probably the same as what I meant in my previous comment.

>I have also written something on the subject of validating corpus-based
>results, see
>
>Santos, Diana & Signe Oksefjell. "Using a Parallel Corpus to Validate
>Independent Claims", Languages in Contrast, Vol. 2(1), 1999, pp. 117-132.
>[tell me if you want me to send it to you]
>
>
>
Could you please send it to me?

>Hope this is useful,
>
>
It was very useful.

Alexander

>---------------
>Diana Santos
>www.linguateca.pt
>Linguateca, SINTEF ICT
>Pb 124 Blindern, N-0314 Oslo, Norway
>
>
>
>
>>-----Original Message-----
>>From: owner-corpora@lists.uib.no
>>[mailto:owner-corpora@lists.uib.no] On Behalf Of Alexander Osherenko
>>Sent: 8. februar 2007 10:00
>>To: corpora@hd.uib.no
>>Subject: [Corpora-List] Corpus Benevolence
>>
>>Hello!
>>
>>Are there any measures that give a general estimate of the
>>benevolence of a corpus? The problem is this: there are several
>>corpora, domain-specific or not, and I want to find a general
>>measure or general hints for choosing one over another. How can
>>I decide which corpus to take, other than by calculating result
>>measures, whatever they are, and comparing them for every corpus
>>previously chosen by chance?
>>Something like size, number of sentences, genre...
>>
>>Best,
>>Alexander
>>
>>
>>
>>
>
>
>


This archive was generated by hypermail 2b29 : Sat Feb 10 2007 - 11:36:01 MET