Do you mean JavaScript?
I use vilistextum:
http://bhaak.dyndns.org/vilistextum/
and it seems to do a good job of removing JavaScript and HTML code.
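
For readers who just want to see the general idea, here is a minimal sketch in Python using only the standard library's html.parser. It is not how vilistextum works internally; the names TextOnly and strip_markup are made up for this illustration, and real pages will need more care (encodings, comments, malformed markup).

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>.

    Hypothetical helper for illustration; not part of vilistextum.
    """

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []       # text fragments we decided to keep
        self.skip_depth = 0   # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.parts.append(data)


def strip_markup(html):
    """Return the text of an HTML string with tags and scripts removed."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    # Fragments are joined with a space, so a tag in the middle of a word
    # would split that word; whitespace runs are then collapsed.
    return " ".join(" ".join(parser.parts).split())


page = ("<html><head><script>var x = 1;</script></head>"
        "<body><p>Hello <b>world</b></p></body></html>")
print(strip_markup(page))
```

This only strips markup; it makes no attempt to decide which of the remaining text is worth keeping.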
Also, BTE (part of the Hyppia project), recommended to me on this list,
tries to guess what the "interesting" content of a page is, and removes
everything else (thus, not only HTML and JavaScript, but any text it
believes to be boilerplate). If your goal is
precision rather than recall (i.e., it's ok to occasionally throw away
good content as long as what you keep is consistently good content), it
does an excellent job. It's a bit slow, though.
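
To make the precision/recall trade-off concrete, here is a small sketch in Python. The paragraph labels and the two "cleaner" outputs are invented purely for illustration; they are not measurements of BTE or of any other tool, and precision_recall is a hypothetical helper name.

```python
def precision_recall(kept, good):
    """Score a cleaner's output against a hand-labelled set of good items.

    precision = fraction of kept items that are really good content
    recall    = fraction of good content that was kept
    """
    kept, good = set(kept), set(good)
    true_pos = len(kept & good)
    precision = true_pos / len(kept) if kept else 0.0
    recall = true_pos / len(good) if good else 0.0
    return precision, recall


# Invented example: a page with four real paragraphs plus boilerplate.
good = ["p1", "p2", "p3", "p4"]

# A strict cleaner drops one real paragraph but keeps no boilerplate.
strict = precision_recall(["p1", "p2", "p3"], good)

# A lenient cleaner keeps everything, including menu and footer text.
lenient = precision_recall(["p1", "p2", "p3", "p4", "nav", "footer"], good)

print("strict :", strict)    # precision 1.0, recall 0.75
print("lenient:", lenient)   # precision 4/6, recall 1.0
```

The strict cleaner is the kind of behaviour described above: some good text is lost, but everything that survives is good content.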
Regards,
Marco