HP3000-L Archives

May 2006, Week 3

HP3000-L@RAVEN.UTC.EDU

Subject:
From: Gehan Gehale
Date: Mon, 15 May 2006 20:24:36 -0500
Hello tech friends,
 I have a friend who's looking for a way to read PDF files into a database,
and I told him I'd ask around. He has a bunch of PDF files that contain
tables of accounting data. He'd like to read this data into a database, or
into a text file that he can then import into a database. But he's not sure
how to go about parsing the PDFs, and neither am I. I opened a few of the
PDF files as plain text and couldn't make out how the data is laid out. I
also tried copying the data out with the text select tool, but it always
comes out very disorganized.

 If anyone has any experience with this, or any information, it would be
much appreciated. We both have plenty of programming experience, so even if
all you have is an API for building PDFs, I'm sure we could work backwards
from there and develop something in Java that reads them out into tables.
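
 To make the goal concrete, here is a rough sketch of the last step only:
turning already-extracted text into rows that a database could import. It
does not read a PDF at all. It assumes, and this is only a guess about how
the files would come out, that each table row ends up on one line of text
with columns separated by two or more spaces or by tabs. The class and
method names (PdfTextRows, splitColumns, toCsvLine) are made up for
illustration and do not come from any PDF library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: converts lines of extracted text into CSV rows.
// Nothing here parses PDF; getting the text out is a separate problem.
final class PdfTextRows {

    // Assumption: columns are separated by two or more whitespace
    // characters (or by tabs), while a single space may occur inside a cell.
    static List<String> splitColumns(String line) {
        List<String> cells = new ArrayList<>();
        for (String cell : line.trim().split("\\s{2,}|\\t+")) {
            if (!cell.isEmpty()) {
                cells.add(cell);
            }
        }
        return cells;
    }

    // Joins cells with commas, quoting any cell that contains a comma,
    // a double quote, or a line break (embedded quotes are doubled).
    static String toCsvLine(List<String> cells) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < cells.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            String c = cells.get(i);
            if (c.contains(",") || c.contains("\"") || c.contains("\n")) {
                sb.append('"').append(c.replace("\"", "\"\"")).append('"');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up sample rows, not taken from the actual files.
        String[] sample = {
            "Account            Amount      Date",
            "Office supplies    1,204.50    2006-04-30"
        };
        for (String line : sample) {
            System.out.println(toCsvLine(splitColumns(line)));
        }
    }
}
```

 If the copied text does not keep wide gaps between columns, this approach
will not work as written, and the column positions would have to be
recovered some other way.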


Hoping for a long shot,
Gehan Gehale

* To join/leave the list, search archives, change list settings, *
* etc., please visit http://raven.utc.edu/archives/hp3000-l.html *
