Re: [Corpora-List] Perl reader for Treebank parse trees?

From: Nitin Madnani (nmadnani@gmail.com)
Date: Sat Apr 15 2006 - 01:50:42 MET DST

    Steven beat me to it! I was just about to post that I have been
    using NLTK for a while now and it has the functionality Philip needs.
    Maybe it's finally time to switch to Python, Philip? :)

    Nitin

    On 4/14/06, Steven Bird <sb@csse.unimelb.edu.au> wrote:
    > On 4/15/06, Philip Resnik <resnik@umiacs.umd.edu> wrote:
    > >
    > > Does anyone have a convenient perl subroutine or module that will
    > > convert Treebank parse trees into internal perl data structures?
    >
    > Note that NLTK provides this functionality for Python programmers.
    > Here's how easy it is to use (for the treebank sample in
    > NLTK-Corpora).
    >
    > --snip--
    > >>> from nltk_lite.corpora import treebank, extract
    > >>> print extract(0, treebank.parsed())
    > (S:
    >   (NP-SBJ:
    >     (NP: (NNP: 'Pierre') (NNP: 'Vinken'))
    >     (,: ',')
    >     (ADJP: (NP: (CD: '61') (NNS: 'years')) (JJ: 'old'))
    >     (,: ','))
    >   (VP:
    >     (MD: 'will')
    >     (VP:
    >       (VB: 'join')
    >       (NP: (DT: 'the') (NN: 'board'))
    >       (PP-CLR:
    >         (IN: 'as')
    >         (NP: (DT: 'a') (JJ: 'nonexecutive') (NN: 'director')))
    >       (NP-TMP: (NNP: 'Nov.') (CD: '29'))))
    >   (.: '.'))
    > --snip--
    >
    > Get NLTK from http://nltk.sourceforge.net/
    >
    > For those still wedded to Perl for NLP, consider the following Perl
    > program to find all words in a text ending in "ing". Note the
    > 'magic', the bits of syntax like <>, (split), my, $, =~, which reduce
    > readability:
    >
    > while (<>) {
    >     foreach my $word (split) {
    >         if ($word =~ /ing$/) {
    >             print "$word\n";
    >         }
    >     }
    > }
    >
    > Here's the Python version, which contains far less magic:
    >
    > import sys
    > for line in sys.stdin.readlines():
    >     for word in line.split():
    >         if word.endswith('ing'):
    >             print word
    >
    > -Steven Bird
    >
    >
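
For readers who want to see what "converting Treebank parse trees into internal data structures" involves without installing anything, here is a minimal sketch in plain Python 3. It is not NLTK's implementation. The names read_tree and leaves are made up for illustration, and it assumes the standard bracketed format, e.g. (S (NP ...) (VP ...)), rather than the colon-decorated printout shown above. It does no error recovery on unbalanced brackets.

```python
import re

# A token is an opening bracket, a closing bracket, or a run of other
# non-whitespace characters (a node label or a word).
TOKEN = re.compile(r"\(|\)|[^\s()]+")


def read_tree(text):
    """Parse one bracketed tree into nested (label, children) tuples.

    Children are either words (str) or further (label, children) tuples.
    """
    tokens = TOKEN.findall(text)
    pos = 0

    def parse():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        # Treebank files often wrap each tree in an unlabeled outer
        # bracket, "( (S ...))"; give that node an empty label.
        if tokens[pos] == "(":
            label = ""
        else:
            label = tokens[pos]
            pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1  # consume the closing bracket
        return (label, children)

    tree = parse()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the tree")
    return tree


def leaves(tree):
    """Return the words of a tree in left-to-right order."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, str):
            words.append(child)
        else:
            words.extend(leaves(child))
    return words


if __name__ == "__main__":
    sample = ("(S (NP-SBJ (NNP Pierre) (NNP Vinken)) "
              "(VP (MD will) (VP (VB join) (NP (DT the) (NN board)))) (. .))")
    tree = read_tree(sample)
    print(tree[0])
    print(" ".join(leaves(tree)))
```

The same recursive-descent idea carries over to Perl with nested array references; tuples are used here only because they are the simplest nested structure in Python.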

    --
    Got Blog?
    http://greenideas.blogspot.com
    


