Gossamer Forum

any module for PDF->html, DOC->html?

Hi all,

Is there a Perl module for converting PDF and DOC files to HTML? I could not find one on CPAN.

Thanks.

Long

Re: [long327] any module for PDF->html, DOC->html?
The easiest thing is probably to post to the pre-made Adobe script:

http://access.adobe.com:8088/ads-cgi/convert.pl

Re: [Paul] any module for PDF->html, DOC->html?
Hi Paul,

Thank you for the reply, but I need to use the module in my own code, and they don't provide the source code.

Long

Re: [long327] any module for PDF->html, DOC->html?
>>
I need to use the module in my code.
<<

Why?

Re: [Paul] any module for PDF->html, DOC->html?
I am working on a plugin that caches web pages for the links in a LinkSql database. It's essentially what Google does.

Long

Re: [long327] any module for PDF->html, DOC->html?
There's a command-line utility called 'pdftotext' (on Linux) that might do the job. I am not aware of any Perl solutions.
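
A minimal sketch of how that could look from Perl, assuming pdftotext is installed and on the PATH. It reads the extracted text from pdftotext's standard output (the trailing '-' argument) and wraps it in a bare HTML page. The helper names escape_html, text_to_html and pdf_to_text, and the file name in the usage comment, are made up for illustration. The result is plain text inside <pre>, not a real layout-preserving conversion.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Wrap plain text in a minimal HTML page; <pre> keeps the line breaks.
sub text_to_html {
    my ($title, $text) = @_;
    return "<html><head><title>" . escape_html($title) . "</title></head>\n"
         . "<body><pre>\n" . escape_html($text) . "</pre></body></html>\n";
}

# Run pdftotext and return the extracted text.
# The list form of open() bypasses the shell, so no quoting is needed.
sub pdf_to_text {
    my ($pdf_file) = @_;
    open(my $fh, '-|', 'pdftotext', '-layout', $pdf_file, '-')
        or die "Cannot run pdftotext: $!";
    my $text = do { local $/; <$fh> };   # slurp all output
    close($fh) or die "pdftotext failed on $pdf_file (status $?)";
    return $text;
}

# Usage (hypothetical file name):
# print text_to_html('cached.pdf', pdf_to_text('cached.pdf'));
```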

Ivan
-----
Iyengar Yoga Resources / GT Plugins

Re: [long327] any module for PDF->html, DOC->html?
pdf2html

http://www.google.com/...r=&ie=ISO-8859-1

pdf2doc

http://www.google.com/...p;btnG=Google+Search

- wil

Re: [long327] any module for PDF->html, DOC->html?
I don't know of a pure-Perl solution. The only thing I can suggest is to install something like this:

ftp://atrey.karlin.mff.cuni.cz/...ocal/clock/pdf2html/

...and then you can execute it with a system command in your script.
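
Something along these lines might work as a starting point. It is only a sketch: run_converter is a made-up helper, and the 'pdf2html' command name and its arguments in the usage comment are assumptions, so check the tool's own documentation for the real invocation. The point is to call system with a list so the shell is not involved, and to check $? afterwards.

```perl
use strict;
use warnings;

# Run an external command without going through the shell and
# return its exit code; die if it could not be started or was killed.
sub run_converter {
    my @cmd = @_;
    system { $cmd[0] } @cmd;
    die "Could not start $cmd[0]: $!" if $? == -1;
    die sprintf("%s died from signal %d", $cmd[0], $? & 127) if $? & 127;
    return $? >> 8;
}

# Usage (hypothetical command name and arguments):
# my $status = run_converter('pdf2html', 'report.pdf');
# warn "conversion failed with exit code $status" if $status != 0;
```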

Re: [long327] any module for PDF->html, DOC->html?
Quote:
I am working on a plugin that caches web pages for the links in a LinkSql database. It's essentially what Google does.

That's fair enough, but I'm not sure why you need the source code to be included in your script.