Gossamer Forum

Re: New SQL version

If the category build stats are no longer needed, and the stats are recorded to the database in real time so that the links maintain their current counts, it would be possible to have a build-changed option.

Right now, what changes?

The count on the links -- and thus the detailed page.

The category page the link is listed on.

The Top-rated, popular and any other generated pages.

So, you'd need to have jump.cgi write to a table similar to build_update, that would track which links were changed (not how many times it was changed) and then, on a partial build it takes the links, tallies up which categories have changed, and creates a "to do" list of sub-sections to rebuild.

Logic could be added to decide whether the Top or Pop pages have changed, but since each is a single select and a single write, it's probably simpler just to rebuild them every time.

Most of the time is spent writing out the detailed pages, even if they haven't changed.

Next is the category pages.

Most likely, even on fairly active sites, most of the category pages will change between builds, so deciding which have changed and which haven't becomes a trade-off between processing time and writing time.

_MOST_ speed gains would be from deciding which detailed pages have changed, and rebuilding only those. That would be fairly easy.

Complex sites might also benefit from selectively rebuilding the category pages, but that will be a much smaller speed gain.

For instance, I could rebuild the site without detailed pages in about 15-20 minutes. With detailed pages, it took over 3 hours.

Now, some things that _could_ be done:

1) GT has threaded modules used for the spider and the link verification. That logic could be used to spawn jobs to do sequential or parallel builds.

2) Categories could be built in parts from the top down -- starting with each main category, and if you have a big site, 2nd level categories.

3) This would have to be called from the browser, the way it was done in Links 2.0, in order to get around the time limit/CPU limit many ISPs impose. Each successive step would have to be a new process call, not a child process. It could also be fired from a cron job, but this is trickier, and would need to be done at off hours.
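To make points 2 and 3 a bit more concrete, here is a minimal sketch of a build that runs one batch of categories per process call and remembers where it left off. It is an illustration only, in Python rather than the Perl that Links uses; `run_build_step`, the JSON state file, and the `build_category` callback are all made-up names, not anything from Links 2.0 or Links SQL:

```python
import json
import os

def run_build_step(categories, state_path, batch_size, build_category):
    """Build the next batch of categories, then stop.

    Each call is meant to be its own process (one browser request or
    one cron run), so progress is kept in a small state file rather
    than in memory. Returns True once every category has been built.
    """
    # Pick up where the previous call left off, if there was one.
    done = 0
    if os.path.exists(state_path):
        with open(state_path) as fh:
            done = json.load(fh)["done"]

    # Build only the next slice, top-level categories first if the
    # list is ordered that way.
    batch = categories[done:done + batch_size]
    for category in batch:
        build_category(category)
    done += len(batch)

    finished = done >= len(categories)
    if finished:
        # Remove the checkpoint so the next build starts from scratch.
        if os.path.exists(state_path):
            os.remove(state_path)
    else:
        with open(state_path, "w") as fh:
            json.dump({"done": done}, fh)
    return finished
```

A browser-driven build would call this once per request and redirect to itself until it returns True; a cron job could simply run it every few minutes during off hours. The batch size would need tuning so each step stays under the host's time limit.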


Anyway......

Digging into the logic, it _can_ be done. It's probably already part of the next release, or it could be fitted into the existing design easily enough -- although it would probably be complex to program.

My growing concern is that I've gotten so familiar with the parts of Links 1.11 that I'm using: what sort of learning curve will there be in adjusting to the next release?

http://www.postcards.com
FAQ: http://www.postcards.com/FAQ/LinkSQL/

Subject                 Author   Views   Date
New SQL version         Robert   3986    Jul 8, 2000, 11:02 AM
Re: New SQL version     pugdog   3882    Jul 8, 2000, 12:09 PM
Re: New SQL version     jsu      3845    Jul 8, 2000, 6:06 PM
Re: New SQL version     pugdog   3835    Jul 9, 2000, 11:19 AM