[Corpora-List] a new member

From: Mai Zaki (MZ106@mdx.ac.uk)
Date: Sat Feb 26 2005 - 05:23:52 MET


    Hello everyone,
     
    It is a pleasure to join your group.
     
    I am a PhD student at Middlesex University and I am just starting my research in order to put together a formal proposal. My aim is to do a corpus-based study of repetition, comparing fiction and non-fiction, and written and spoken text categories, all within the framework of Relevance Theory. I am something of a beginner in corpus linguistics. So far I have done only a small-scale corpus-based study of the modals for my MA thesis, using a corpus I compiled myself and a concordancer. Now I am hoping to use one of the big English corpora, such as ICE-GB or the BNC. However, I am worried about the range of examples that a one-million-word corpus, or a corpus made up of 2,000-word text samples, would generate. I was also wondering whether it would be feasible for such a study simply to go through the whole corpus looking for repeated words or phrases, since no search tool would be particularly useful, and whether the layout of the data in either corpus would allow me to detect cases of repetition easily at both the sentence and discourse levels. I would really appreciate any useful information in this regard, especially from those who have actually worked with these corpora before. Recommendations of other corpora suitable for such a study would also be most welcome.
     
    Thank you all.
     
    Mai Zaki


