Re: [Corpora-List] Treebank 2 and 3

From: Ann Bies (bies@ldc.upenn.edu)
Date: Thu Oct 19 2006 - 02:16:28 MET DST

    Dear Don,

    As I recall, the parsed portions of Brown that were included in the
    Treebank 2 release were not in the Treebank II/PTB-2 annotation style
    (i.e., the data and annotation were carried over in the older style
    from the earlier Treebank 1 release, with some technical errors fixed,
    but were not reparsed in the newer annotation style).

    The portions of Brown that were included in the Treebank 3 release were
    newly annotated/parsed in the newer, more detailed Treebank II
    annotation style, but time and budget constraints prevented the
    re-annotation of the entire previous Brown corpus with the Treebank II
    style -- so only the portion of Brown that had been re-annotated in the
    new style was released in Treebank 3.

    Please do not hesitate to contact me if you have any further questions.

    Thanks,

    Ann

    Ann Bies
    Linguistic Data Consortium
    bies@ldc.upenn.edu

    Donald E Hardy wrote:
    >
    > I'm doing some work with Treebank 3, especially with the parsed
    > Switchboard and the parsed portions of Brown that are included in
    > Treebank 3. I noticed today that Treebank 2 has all of the Brown parsed
    > texts while Treebank 3 has only some of the Brown parsed texts. Does
    > anyone know why Treebank 3 includes only some of the parsed Brown texts
    > while Treebank 2 includes them all?
    >
    > Many thanks,
    >
    > Don
    >
    > Donald E. Hardy
    > Professor
    > Department of English/098
    > University of Nevada, Reno
    > Reno, Nevada 89557
    > DonHardy@unr.edu
    > http://textant.engl.unr.edu


