On 4/15/06, Philip Resnik <resnik@umiacs.umd.edu> wrote:
>
> Does anyone have a convenient perl subroutine or module that will
> convert Treebank parse trees into internal perl data structures?
Note that NLTK provides this functionality for Python programmers.
Here's how easy it is to use (for the treebank sample in
NLTK-Corpora).
--snip--
>>> from nltk_lite.corpora import treebank, extract
>>> print extract(0, treebank.parsed())
(S:
  (NP-SBJ:
    (NP: (NNP: 'Pierre') (NNP: 'Vinken'))
    (,: ',')
    (ADJP: (NP: (CD: '61') (NNS: 'years')) (JJ: 'old'))
    (,: ','))
  (VP:
    (MD: 'will')
    (VP:
      (VB: 'join')
      (NP: (DT: 'the') (NN: 'board'))
      (PP-CLR:
        (IN: 'as')
        (NP: (DT: 'a') (JJ: 'nonexecutive') (NN: 'director')))
      (NP-TMP: (NNP: 'Nov.') (CD: '29'))))
  (.: '.'))
--snip--
Get NLTK from http://nltk.sourceforge.net/
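
For readers who want to see what such a conversion involves without
installing anything, here is a minimal sketch of a bracket parser. It is
not NLTK's implementation and does not use the nltk_lite API; the names
parse_tree, leaves and _read_node are made up for this illustration. It
assumes well-formed input in the usual Treebank bracketing, "(LABEL
child ...)", where a child is either a word or another bracketed node,
and it returns nested (label, children) tuples. An outer wrapper with no
label, as in "( (S ...))", is given the empty string as its label.

```python
import re


def _read_node(tokens, pos):
    """Read one "(LABEL child ...)" node starting at tokens[pos].

    Returns ((label, children), next_pos). Assumes balanced brackets.
    """
    if tokens[pos] != "(":
        raise ValueError("expected '(' at token %d" % pos)
    pos += 1
    label = ""
    # A node may lack a label, e.g. the outer wrapper in "( (S ...))".
    if tokens[pos] not in ("(", ")"):
        label = tokens[pos]
        pos += 1
    children = []
    while tokens[pos] != ")":
        if tokens[pos] == "(":
            child, pos = _read_node(tokens, pos)   # nested constituent
        else:
            child = tokens[pos]                    # a word (leaf)
            pos += 1
        children.append(child)
    return (label, children), pos + 1              # skip the closing ")"


def parse_tree(text):
    """Parse a single bracketed tree into nested (label, children) tuples."""
    # A token is "(", ")", or a run of characters with no space or bracket.
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    tree, pos = _read_node(tokens, 0)
    if pos != len(tokens):
        raise ValueError("trailing tokens after the first tree")
    return tree


def leaves(tree):
    """Return the words of the tree in left-to-right order."""
    label, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


sample = ("(S (NP-SBJ (NNP Pierre) (NNP Vinken)) "
          "(VP (MD will) (VP (VB join) (NP (DT the) (NN board)))) (. .))")
tree = parse_tree(sample)
print(tree[0])
print(leaves(tree))
```

The same recursive-descent idea carries over to Perl with nested array
references in place of tuples; the sketch makes no attempt to handle
traces, comments, or malformed files.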
For those still wedded to Perl for NLP, consider the following Perl
program to find all words in a text ending in "ing". Note the
'magic': bits of syntax like <>, (split), my, $, and =~, which reduce
readability:
while (<>) {
    foreach my $word (split) {
        if ($word =~ /ing$/) {
            print "$word\n";
        }
    }
}
Here's the Python version, which contains far less magic:
import sys
for line in sys.stdin.readlines():
    for word in line.split():
        if word.endswith('ing'):
            print word
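
To try the same logic without piping a file through standard input, the
test can be wrapped in a small function and run on a string. This is
only a sketch: the name words_ending_in and the sample sentence are
invented for illustration. Like both programs above, it splits on
whitespace only, so a word followed by punctuation ("dancing.") does not
count as ending in "ing".

```python
def words_ending_in(text, suffix="ing"):
    """Return the whitespace-separated words in text that end with suffix."""
    return [word for word in text.split() if word.endswith(suffix)]


sample_text = "She was singing and dancing.\nThe king kept walking"
print(words_ending_in(sample_text))
```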
-Steven Bird