Gossamer Forum
PDF Plugin
Hi Folks

Is there a PDF plugin available for Links SQL? As in, can the system produce individual PDF output of every 'detailed' page?

If not, would GT, or anyone else be interested in writing one for me? If you would, drop us a note => wil@stephens.org

Thanks!

- wil
Re: [Wil] PDF Plugin
Ouch.. :)

I'd check the GNU/Open Source sites first, to see if there are any such projects in the works. If such a project exists, and is close to completion, it should be fairly easy to format Links output and pipe it in.

_BUT_ developing a PDF plugin from scratch is -- I think I'm safe to say -- above and beyond the call of duty here :)

At one time PDF was Adobe's exclusive domain, and I don't know if the format is now in the public domain, or if it still has licensing restrictions of some sort.

Adobe has an on-line PDF system, where you can submit documents.

For an annual fee they will give unlimited service.

This might be a solution to your needs. If you need something on the fly from them, ask them whether their system can take input from a script and pass back, via script, the information on how to obtain the results.


PUGDOG® Enterprises, Inc.

The best way to contact me is to NOT use Email.
Please leave a PM here.
Re: [Wil] PDF Plugin
Your solution may be here: http://www.easysw.com/htmldoc/
Re: [Wil] PDF Plugin
Hi,

I have wanted to do this for a long time but have not had the time yet. I guess it should not be that complicated if you look at http://www.cpan.org, e.g. at http://search.cpan.org/search?dist=PDF-Create

Regards

Niko
Re: [YoYoYoYo] PDF Plugin
Thanks for the information, folks.

I understand this is way beyond the realms of any 'free' plugin from anyone. I was more enquiring about the possibility of such a plugin.

I'm saying this because we are managing a directory online, or will be when we get Links up and running. We would also like to make a printed publication of the directory, and one simple way of doing this is to create PDFs of every page with a pre-defined template to send to the printers.

This could be exactly what I'm looking for:

http://www.easysw.com/htmldoc/pdf-o-matic.php

- wil
Re: [el noe] PDF Plugin
Thanks for the link to CPAN.

Using a Perl module, it shouldn't be difficult at all (hopefully). When I get my copy of Links up and running, creating a PDF plugin will be my highest priority. I'll let you know how I get on.

- wil
Re: [Wil] PDF Plugin
I can come up with something for you but it would be easier if you are open to installing something on your server.

Also depending on the number of links it may slow down the build quite a bit.

How about dynamic generation?

For example next to links you could have:

Detailed Page - PDF Page

...then I could get the script to generate the PDF if one isn't detected for that ID, or go straight to the existing one if it has already been generated.
Re: [Paul] PDF Plugin
Hi Paul

Yes, this would definitely be something I will be installing on my server; basically it looks like I just need to grab that module off CPAN.

I don't want this to be a dynamic thing. Basically the idea is that we will produce a printed book out of the contents of this directory annually, so I just need something that will trawl through all the records in the database and create PDFs based on a given template. Shouldn't be too difficult, I hope.

The book in question is 'The Wales Yearbook' (http://www.walesyearbook.co.uk/), which we are hoping to move to Links. The reason for this is that we can use a central database for both the web side and the printed side of it. In theory this should reduce our workload quite a bit.

- wil
Re: [Wil] PDF Plugin
I have something else in mind other than the CPAN module. I don't have the URL offhand but I installed it the other day for a client and it works nicely.

It is a program that needs compiling and can be run from a script or over ssh/telnet.
Re: [Paul] PDF Plugin
Interesting. But I think this module will do everything I need it to. I'll just have to write a script to extract the info from the database, fill in the blanks on my template, and then run it through the PDF parser in batches. I don't want them all to go at the same time, as it's probably quite server-intensive. Hm, now a parallel solution would be good here :-)
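
For illustration, the extract / fill-in / batch idea could look roughly like the sketch below. It is written in Python rather than Perl purely to show the shape of the loop; the template, the field names (title, description) and the batch size are all made up, and the records list stands in for rows fetched from the Links database.

```python
from string import Template
from typing import Iterable, Iterator

# Hypothetical page template; the real one would be the print layout.
PAGE_TEMPLATE = Template("<h1>$title</h1>\n<p>$description</p>")


def fill_template(record: dict) -> str:
    """Fill in the blanks of the template with one database record."""
    return PAGE_TEMPLATE.substitute(record)


def batches(records: Iterable[dict], size: int) -> Iterator[list]:
    """Yield successive lists of at most `size` records."""
    batch = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter, batch
        yield batch


# Stand-in for rows fetched from the database (field names are assumed).
records = [
    {"title": f"Entry {i}", "description": f"Description {i}"}
    for i in range(1, 8)
]

for number, batch in enumerate(batches(records, size=3), start=1):
    pages = [fill_template(record) for record in batch]
    # Each batch of filled-in pages would be handed to the PDF generator here.
    print(f"batch {number}: {len(pages)} pages")
```

Processing one batch at a time keeps only a handful of pages in memory and leaves room for a pause between batches if the server is busy.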

- wil
Re: [Wil] PDF Plugin
Hi Wil,

I just stumbled across this thread of yours. In case you are still looking for a solution: my PageBuilder Plugin, which will be released soon, uses htmldoc to produce PDF pages. It can produce detailed PDF pages for all links.

Ivan
-----
Iyengar Yoga Resources / GT Plugins
Re: [yogi] PDF Plugin
Sounds good. Can you give me more information on this 'htmldoc' business? How well would it cope with converting 16,000 records to PDF pages?

- wil
Re: [Wil] PDF Plugin
Hi Wil

I tested the building of PDF documents using my PageBuilder plugin in conjunction with htmldoc. As a testing ground, I used my own database, consisting of 400 links. I built one full PDF page per link. It took about 50 seconds to build the links. For the building of HTML pages with the same content, it takes about 18 seconds, so as a rule of thumb, it takes about three times longer to build PDF pages than building HTML pages.

Your 16,000 links are 40 times my 400, so it would take about 40 * 50 = 2000 seconds, or roughly 33 minutes (on my machine).
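
The same back-of-the-envelope calculation as a small Python sketch. It assumes that build time grows linearly with the number of links, which is only an assumption; the function name is made up for the example.

```python
def estimate_seconds(measured_seconds: float, measured_links: int, target_links: int) -> float:
    """Scale a measured build time linearly to a larger number of links."""
    return measured_seconds * target_links / measured_links


pdf_seconds = estimate_seconds(50, 400, 16000)
html_seconds = estimate_seconds(18, 400, 16000)

print(pdf_seconds, round(pdf_seconds / 60, 1))    # 2000.0 33.3
print(html_seconds, round(html_seconds / 60, 1))  # 720.0 12.0
print(round(50 / 18, 1))                          # 2.8 (PDF vs HTML build time)
```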

These figures are of course only very rough estimates, but you can see it's not so much slower than producing HTML pages.

Ivan
-----
Iyengar Yoga Resources / GT Plugins
Re: [yogi] PDF Plugin
Things slow down as the directory grows larger. 16,000 pages in one directory could slow things down a lot.

Just a thought.


PUGDOG® Enterprises, Inc.

The best way to contact me is to NOT use Email.
Please leave a PM here.
Re: [pugdog] PDF Plugin
Are you saying that building time does not grow linearly, but faster?

Ivan
-----
Iyengar Yoga Resources / GT Plugins
Re: [yogi] PDF Plugin
Thanks for the info, yogi.

Can you give me more information about this htmldoc program, though? Is it a program? Is it free? Or is it a Perl module? What alternatives, if any, did you try?

Cheers!

- wil
Re: [Wil] PDF Plugin
http://google.yahoo.com/bin/query?p=htmldoc&hc=0&hs=21
Re: [Wil] PDF Plugin
Hi Wil

htmldoc website: http://www.easysw.com/htmldoc/

It's a C program, licensed under the GPL, so you can use it for free. It runs well under Linux; I compiled it on my server with no problems. There is also a free Windows executable available, but it lacks the command line interface that the plugin needs. That means that on Windows servers you have to build your own executable with command line support.

I chose this program because it is the only solution I found that supports template-based PDF creation, meaning that you can use a normal HTML template (as you use it with Links SQL), parse it with the GT template parser, and then produce the PDF output. The translation from HTML to PDF is good: I tested many HTML tags, and the corresponding PDF output was as expected.
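
A rough sketch of that pipeline, in Python for illustration only. Python's string.Template stands in for the GT template parser, the template and the field names (Title, Description) are assumptions, and the helper names are made up. The htmldoc call uses the --webpage and -f options from the htmldoc command line.

```python
import subprocess
from string import Template

# Hypothetical HTML template standing in for a Links SQL detailed-page template.
DETAIL_TEMPLATE = Template(
    "<html><body><h1>$Title</h1><p>$Description</p></body></html>"
)


def render_html(link: dict) -> str:
    """Fill the HTML template with the fields of one link."""
    return DETAIL_TEMPLATE.substitute(link)


def htmldoc_command(html_path: str, pdf_path: str) -> list:
    """Build the htmldoc call: --webpage treats the input as a plain web
    page, and -f names the output file."""
    return ["htmldoc", "--webpage", "-f", pdf_path, html_path]


def build_pdf(link: dict, html_path: str, pdf_path: str) -> None:
    """Write the rendered HTML to disk, then convert it with htmldoc."""
    with open(html_path, "w", encoding="utf-8") as handle:
        handle.write(render_html(link))
    # Raises CalledProcessError if htmldoc exits with an error.
    subprocess.run(htmldoc_command(html_path, pdf_path), check=True)
```

A call such as build_pdf({"Title": "Example", "Description": "Text"}, "19032.html", "19032.pdf") would need htmldoc installed and on the PATH.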

There is also the PDF::Template module, which takes an XML template and produces PDF output. The disadvantage of this solution is that people would have to learn a new template language, and also that you need to have pdflib installed.

Other Perl modules (i.e. pure Perl solutions) have much lower-level interfaces: you actually have to write your own code to produce PDF output. While this is fine for many situations, it is not what I was looking for in this case.

Ivan
-----
Iyengar Yoga Resources / GT Plugins
Re: [yogi] PDF Plugin
Thanks for the info, Ivan. Most helpful. I'm actually looking at a very strong level of control for the style and layout of these PDF documents which will be very different from the look of the HTML output. I *think* the best way forward for me would be to feed XML data out of my Links database into a template parser which would then create the PDF templates.

- wil
Re: [yogi] PDF Plugin
As a directory gets more entries, there is a larger look-up table. Most operating systems work best when the number of files is under 100, and certainly under 1000. Once you start to get into tens of thousands, you start to notice problems.

I don't have any specific numbers, but this is the reason for 'hashing' the directories where huge volumes of lookup data are stored.

like: /path/to/data/a/ab/absolute.html

or similar,

or for links:

/path/to/data/1/19/190/19032.html

That's sort of inefficient, so you'd most likely do:

/path/to/data/19/1903/19032.html

(it's easier to do that, than pick off 19/03/19032.html, if things are critical on performance).

That would give you a directory of 100 subdirectories, each with 100 subdirectories, each containing however many files you'd get when you divide your total-links / (100*100)

That would be 10,000 directories, and might be overkill for directories with under 50,000 links. <G> For those you could use something like:

/path/to/data/19/0/19032.html

which would give you 1,000 directories, and would work well for link directories of 100,000 links or less <G>
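
As a sketch, both layouts can be computed from the link ID like this (Python, for illustration). The function name and the zero-padding of short IDs to five digits are assumptions, not something Links does out of the box.

```python
import posixpath


def hashed_path(base: str, link_id: int, wide: bool = True, width: int = 5) -> str:
    """Return the hashed location for a link's page.

    wide=True  -> base/19/1903/19032.html (100 x 100 directories)
    wide=False -> base/19/0/19032.html    (100 x 10 directories)
    """
    digits = str(link_id).zfill(width)  # assumed: pad short IDs with zeros
    second = digits[:4] if wide else digits[2]
    return posixpath.join(base, digits[:2], second, f"{link_id}.html")


print(hashed_path("/path/to/data", 19032))              # /path/to/data/19/1903/19032.html
print(hashed_path("/path/to/data", 19032, wide=False))  # /path/to/data/19/0/19032.html
print(hashed_path("/path/to/data", 42))                 # /path/to/data/00/0004/42.html
```

Taking the leading digits keeps the slicing simple, but IDs with more than five digits would land in different top-level directories than shorter ones, so the width should be chosen for the largest ID expected.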

Anyway.... I don't have the actual performance data, but it's been brought up as an issue before on the "build" directories.


PUGDOG® Enterprises, Inc.

The best way to contact me is to NOT use Email.
Please leave a PM here.
Re: [Wil] PDF Plugin
PDF generation might be a good candidate for on-demand generation with caching. In a nutshell, send the user to a CGI script which determines whether there is a valid cached copy of the PDF document. If there is, send it; if there is none, or the previous version has expired, generate-cache-regurgitate.
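
A minimal sketch of the generate-cache-regurgitate step, in Python for illustration. The function names, the one-file-per-link-ID cache layout and the 30-day expiry are all assumptions; generate is whatever routine actually produces the PDF bytes.

```python
import os
import tempfile
import time


def cached_pdf(link_id, cache_dir, generate, max_age_seconds=30 * 24 * 3600):
    """Return the path of a valid cached PDF, generating it first if needed."""
    path = os.path.join(cache_dir, f"{link_id}.pdf")
    fresh = (
        os.path.exists(path)
        and time.time() - os.path.getmtime(path) < max_age_seconds
    )
    if not fresh:
        data = generate(link_id)           # generate
        with open(path, "wb") as handle:   # cache
            handle.write(data)
    return path                            # the caller sends this file


# Demonstration with a fake generator that records how often it is called.
calls = []


def fake_generate(link_id):
    calls.append(link_id)
    return b"%PDF-1.4 placeholder for link " + str(link_id).encode()


with tempfile.TemporaryDirectory() as cache_dir:
    cached_pdf(19032, cache_dir, fake_generate)
    cached_pdf(19032, cache_dir, fake_generate)  # served from the cache
    print(len(calls))  # 1
```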
Re: [Aki] PDF Plugin
Definitely. I couldn't agree more.

However, for my particular needs, the PDF generation will be done on a monthly basis -- once and once only. The PDFs will not be available via CGI online. They are to go towards a published book.

Cheers

- wil