[Corpora-List] Co-occurrence stats from BNC

From: MCUSSHS (harold.somers@manchester.ac.uk)
Date: Fri Mar 17 2006 - 11:43:47 MET

    Sorry if this is a dumb question: for a student project, we would like
    to get the following stats based on the BNC:
    (1) frequency (or probability) of all trigrams
    (2) co-occurrence stats for all word pairs (NOT bigrams, note) based on
    co-occurrence within the same sentence
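
    To make the two counts concrete, here is a minimal sketch in Python.
    It assumes the corpus has already been split into sentences and
    tokenised into lists of words (BNC parsing is not shown), that
    trigrams do not cross sentence boundaries, and that each unordered
    word pair is counted once per sentence. The function names are
    illustrative and not taken from any existing BNC tool.

```python
from collections import Counter
from itertools import combinations


def trigram_counts(sentences):
    """Count contiguous word trigrams, never crossing a sentence boundary."""
    counts = Counter()
    for tokens in sentences:
        # zip stops at the shortest slice, so sentences with fewer than
        # three tokens contribute nothing.
        counts.update(zip(tokens, tokens[1:], tokens[2:]))
    return counts


def cooccurrence_counts(sentences):
    """Count unordered word pairs that appear in the same sentence.

    Assumption: a pair is counted once per sentence, however many times
    each word occurs in it, and a word is not paired with itself.
    """
    counts = Counter()
    for tokens in sentences:
        # Sorting the distinct words gives every pair a canonical order.
        types = sorted(set(tokens))
        counts.update(combinations(types, 2))
    return counts


def relative_frequencies(counts):
    """Turn raw counts into maximum-likelihood relative frequencies."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: value / total for key, value in counts.items()}


# Toy input standing in for tokenised BNC sentences.
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat"],
]

trigrams = trigram_counts(sentences)
pairs = cooccurrence_counts(sentences)
trigram_probs = relative_frequencies(trigrams)
```

    On a corpus the size of the BNC the pair table grows much faster
    than the trigram table, so in practice one would probably restrict
    the vocabulary or write the counts to disk rather than hold them
    all in memory.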

    I assume that this is easy to compute, though time-consuming; and of
    course I understand that the data will be relatively sparse.

    So my question is: is this data available somewhere (e.g. has someone
    already done it), or, if not, what is the easiest way to do it?

    Harold Somers


