[Corpora-List] PDF Conversion

From: Ken Litkowski (ken@clres.com)
Date: Tue Mar 28 2006 - 17:35:03 MET DST


    Is anyone aware of free software that converts PDF documents into
    text streams? There is PDF2HTML (with an XML option), which creates
    page-centric versions, but it does not really distinguish body text
    from formatting. I want to ignore (or be able to treat separately)
    such things as headers, footnotes, tables, figures, and equations.
    (Note that even Google retains the page-centric view.)

    Thanks,
            Ken

    -- 
    Ken Litkowski                     TEL.: 301-482-0237
    CL Research                       EMAIL: ken@clres.com
    9208 Gue Road
    Damascus, MD 20872-1025 USA       Home Page: http://www.clres.com
    


