From: Gavin McKenzie (gavin.mckenzie_at_gmail.com)
Date: Thu Oct 28 2004 - 06:37:25 PDT
Hendrik,
On Thu, 28 Oct 2004 11:19:59 +0200, Hendrik Lipka <hendrik.lipka_at_gmx.de> wrote:
>[snip] ...I will try some libraries
> for extracting text, and the best one (or maybe multiple ones) will
> then get included. ...[snip]
Have you tried using Adobe's accessibility web/email interface, which
converts PDF to HTML, as a method of text extraction?
http://www.adobe.com/products/acrobat/access_onlinetools.html
Have you found other libraries or methods for PDF->text conversion to
be better than the Adobe web interface?
Gavin.
P.S. I wish the Adobe web interface let you upload a file directly to
the converter, rather than having to provide a web-addressable PDF or
resort to submitting the PDF by email. I should ask Adobe why file
upload isn't offered as an option.