Is anyone aware of free software that will process PDF documents into
text streams? PDF2HTML (which has an XML option) will create
page-centric versions, but it does not really separate text from
formatting. I want to ignore (or be able to treat separately) such things
as headers, footnotes, tables, figures, and equations. (Note that even
Google retains the page-centric view.)
Thanks,
Ken
-- 
Ken Litkowski
CL Research
9208 Gue Road
Damascus, MD 20872-1025 USA
TEL.: 301-482-0237
EMAIL: ken@clres.com
Home Page: http://www.clres.com