Gossamer Forum

Multiple HTML Pages

I have to get multiple HTML pages into a database. I have the DBMan demo installed and working.

The pages I am working on are simple archived articles. Currently I am cutting and pasting the articles into the "description" field in the add form.

Is there a way to get them into the database without cutting and pasting each one?

Thanks...
Re: Multiple HTML Pages
Hi again,

Well, you've got dbMan up and running ok, but it seems to me that you're trying to run before you can walk. Before you can go any further, you need to set up your config file with the fields you'll want.

I can't tell you what you'll want, but to get you started, I would suggest:

ID: Same as in standard DB, can be handy for referencing later.

ShortName: A short name referencing the article will be handy for jumping direct to an article with the system you want.

LongName: The actual title of the article.

Author: The eh... author of the article.

Date: The original release date of the article.

And anything else you think will be needed.

Now, I'll try and explain how the system works. You create a directory on your server to contain the articles. I would suggest putting it under your cgi-bin, since these pages don't need to be reachable directly from the Internet, say /cgibin/directory/articles/. I'll tell you how to protect that directory from prying eyes later.

Now, in that directory, you name the articles according to a field in the database. I would suggest using ShortName for simplicity. In fact, I see from your website you already do this: for instance, the file containing the article "Education Summit attracts overflow crowd" is called educ_summit.html. So the ShortName field for this article would be "educ_summit". dbMan can then take that value, add on the .html, check that the file exists and is readable, read the contents into Perl and deliver them back to the browser.
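The lookup described above can be sketched roughly as follows. This is only an illustration in Python rather than dbMan's Perl, and it is not code from dbMan itself: the function name load_article, the ARTICLES_DIR constant and the name check are all assumptions made for the example. The directory path is the one suggested earlier in this post.

```python
import re
from pathlib import Path
from typing import Optional

# Directory suggested in the post; adjust to wherever the articles really live.
ARTICLES_DIR = Path("/cgibin/directory/articles")

# Assumption: ShortName values use only letters, digits, underscores and hyphens.
# Rejecting anything else stops a value like "../../etc/passwd" from escaping the directory.
SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+")


def load_article(short_name: str, articles_dir: Path = ARTICLES_DIR) -> Optional[str]:
    """Return the contents of <articles_dir>/<short_name>.html, or None if unavailable."""
    if not SAFE_NAME.fullmatch(short_name):
        return None
    path = articles_dir / f"{short_name}.html"  # add on the .html
    if not path.is_file():  # check that the file exists
        return None
    try:
        # Reading fails with OSError if the file is not readable.
        return path.read_text(encoding="utf-8", errors="replace")
    except OSError:
        return None
```

With a record whose ShortName is "educ_summit", load_article("educ_summit") would return the text of educ_summit.html, and the script would print that inside its own page template. Returning None for every failure keeps the calling code simple: it only has to decide what to show when there is no article.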

There are other factors to be taken into consideration. For instance, will you be editing the files to remove the HTML that won't be needed, such as the HTML, TITLE and BODY tags, navbars and the like? You can skip this by adding marker comments to the file, like <!-- Content --> and <!-- /Content -->, so that the script only picks up what lies between them, but it will slow the system down a little.
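As a rough sketch of the marker idea, again in Python rather than Perl and with a made-up function name (extract_marked_content), pulling out the part between the two comments could look like this. It assumes each file contains one pair of markers.

```python
import re
from typing import Optional

# Matches everything between <!-- Content --> and <!-- /Content -->.
# DOTALL lets "." span line breaks; the non-greedy (.*?) stops at the first closing marker.
CONTENT_RE = re.compile(
    r"<!--\s*Content\s*-->(.*?)<!--\s*/Content\s*-->",
    re.DOTALL | re.IGNORECASE,
)


def extract_marked_content(html: str) -> Optional[str]:
    """Return the text between the content markers, or None if they are missing."""
    match = CONTENT_RE.search(html)
    return match.group(1).strip() if match else None
```

The extra cost mentioned above comes from scanning the whole file on every request; for article-sized pages that is usually small, but it is work that pre-trimmed files would not need.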

And if you want to get cute, a system can be set up to make the pages delivered by the database indexable by the search engines, using a little mod_rewrite trickery.

But when all is said and done, you have to decide if this will be worth all the effort in the end. Might it just be easier to set up dbMan as a links directory, linking to articles in an archive directory on your server? Something like this is really only useful on a truly dynamic site, where the headers, footers, navbars, etc. will be constantly changing. If that's the case, maybe you should think about installing a system to handle your whole website, archiving automatically, etc...

Sorry if I'm building you up and then knocking you down, but people seem to jump into adding systems to their websites quite often and then realise later that sticking with regular HTML might have been easier from day one...

I'll help you any way I can, but maybe you should have a think about your aims and targets before you continue...

adam
Re: Multiple HTML Pages
You can skip putting them in the database altogether, put them in a common directory, name them according to one of the fields in the database and have the script "Include" them. But you would still have to edit the files to remove all the <HTML>, <HEAD>, <BODY>, etc. tags, defeating the purpose.

I have something like this working on a website of mine (not using dbMan though), but my pages have <!-- Content --> and <!-- /Content --> tags in them that I inserted when I wrote them, to make it easier to cut this stuff out when parsing the file.

However, if someone would like to post a regex that will recognise <BODY> (obviously ignoring the rest of the content in the tag) and </BODY> tags, I'll post the code here for you, and try and help you install it.
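One possible answer to the regex request, sketched in Python rather than Perl (the pattern itself should carry over to Perl with the equivalent case-insensitive and dot-matches-newline flags). The function name extract_body is made up for the example, and the pattern assumes no attribute value inside the BODY tag contains a ">" character.

```python
import re

# <body, a word boundary, then any attributes up to the closing ">",
# then the page content (non-greedy), then </body>.
# IGNORECASE covers <BODY> and <body>; DOTALL lets "." match newlines.
BODY_RE = re.compile(r"<body\b[^>]*>(.*?)</body\s*>", re.IGNORECASE | re.DOTALL)


def extract_body(html: str) -> str:
    """Return what sits between the BODY tags, or the input unchanged if there are none."""
    match = BODY_RE.search(html)
    return match.group(1).strip() if match else html
```

This only strips the outer HTML, HEAD and BODY wrapper. Navbars, headers and footers that sit inside the body would still come through, which is where the content markers are more precise.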

Cheers,
adam
Re: Multiple HTML Pages
Thanks for the help, dahamsta. Could you do me another favor and explain these steps to me? I am new at this, and it was a chore for me to get the script up in the first place. Could you explain how to do things like creating a "common directory", naming the files according to one of the fields in the database, and having the script "Include" them?

Thanks again for your help...
Re: Multiple HTML Pages
Okey-dokey,

First off, can you copy your config file over to a web-accessible directory for me? Just copy the file locally and rename it with a .txt extension. So if it's default.cfg, rename it to default.cfg.txt. Now upload it to a directory on your server that I'll be able to reach from the net.

Then post the URL here and we'll be able to get started.

Cheers,
adam
Re: Multiple HTML Pages
I'

[This message has been edited by Explorer (edited May 23, 1999).]
Re: Multiple HTML Pages
dahamsta,
Thanks for the quick response. I know it seems I am trying to bite off more than I can chew. I left some important factors out in my first post.

First, the articles that are currently on the server were placed there by another party. As you can see, they each have their own page. I am handling the current articles now (about 25 every other week). I want to use DBMan to handle the new articles because I am simply cutting and pasting them from a Quark document.

I want to get the old documents into the database also. I probably should suck it up and start cutting and pasting them in by hand, but there are so many and I have limited time. That is why I was wondering if there was an easier way. If you could walk me through it I would appreciate it. If it is more trouble than it is worth, that's ok too. I appreciate the time you have already taken to help me out, you guys are great!

...Explorer
Re: Multiple HTML Pages
As I said, if you want to go for it, I'll help you out as best I can. However, you say:

Quote:
I want to get the old documents into the database also. I probably should suck it up and start cutting and pasting them in by hand, but there are so many and I have limited time.

If you're in a real hurry to do this, I would suggest cutting and pasting, because a decent system like this can take some time to build, and you'll *still* have to modify each page so the script can read it. You could always come back to this later, by which time you will have been working with dbMan for a while and be more comfortable with it.

One other important factor. If you generate all of your articles with Perl, you're pretty much telling the search engines to take a hike, because most of them will ignore CGI output. As I said, you can avoid this with a little mod_rewrite trickery (if you're on Apache), but that just complicates matters even more.

I'm just in the closing stages of creating a dynamic website that calls pages in this manner, and it took me two weeks to do it. The system I use *doesn't* use dbMan, so integrating it could make it even more complicated.

Taking all that into account, if you want to go for it, start setting up your database with the correct fields and we'll get started.

Again, sorry if I seem pessimistic, I'm just telling you the realities of it all... :)

adam
Re: Multiple HTML Pages
Thanks for all your help. I will do it the old-fashioned way. I want to do this thing right, and sometimes shortcuts make things harder in the long run. I will probably call on your expertise after I get this first task out of the way.

Thanks...
Re: Multiple HTML Pages
No problem. You can always remove the description field and data later and do it this way.

Good luck!

adam