Gossamer Forum

Re: [webslicer] DMOZ Wizard

>> Just skipping the record all the time is not good, as there is no log file made of skipped records.
>> Just adding it will wipe out existing data.

I'm really missing something here. We are not looking at a spider seeking out new, unknown links. Every link in a DMOZ import is either already in your database (from an earlier import) or not yet imported.

Skipped records already exist in your database, so what is the issue? You want to be notified that the record you got from DMOZ is in DMOZ? If you do a 5,000 record import and there is one new record, you'll have 4,999 "skipped" records, which is pretty pointless to log.

If you are looking for changed records, that is a double-edged sword. DMOZ drops good links for all sorts of reasons, and keeps bad ones around way longer than it should. You are better off running "Validate Links" and keeping your own record of still-live sites. It will be more accurate than DMOZ, which _rarely_ seems to prune dead links. At least not as often as it should.


I'm pretty hard on this kind of tool, and I've been beating it to death with loads of imports the past 2-3 weeks. I really don't see what you are asking for.

If you have something specific in mind, for your needs, maybe you need a custom job. But, I don't see what added functionality you are trying to get.


The only thing I can see in all this is something I've wanted for pruning duplicate links:

1) The import runs and checks whether the link (URL) already exists in the database. If it does, it checks whether the Title & Description match.
2) If they match, the link is skipped (in the duplicate database, it's deleted, after a cat_links addition). At most, keep a count of skipped links; there is _no_ point in doing anything else.
3) If they don't match, insert the link into a duplicates database.
4) If the link exists but the category is different: a) ignore it, b) add a cat_links record if the category exists, or c) add it to a suggestion database for adding a cat_links record, or creating a new title if the category doesn't exist.
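
To make the four steps concrete, here is a rough sketch of the flow in Python. It is not how the DMOZ Wizard or Gossamer Links actually works; the names `Link`, `ImportResult` and `import_links` are made up for illustration, the "databases" are plain in-memory lists and dicts, and it assumes a link that isn't in the database yet simply gets inserted. For step 4 it takes options b) and c) rather than a).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Link:
    url: str
    title: str
    description: str
    category: str


@dataclass
class ImportResult:
    skipped: int = 0                                  # step 2: only a count is kept
    inserted: list = field(default_factory=list)      # links not seen before (assumed behaviour)
    duplicates: list = field(default_factory=list)    # step 3: same URL, different text
    cat_links: list = field(default_factory=list)     # step 4b: (url, category) pairs
    suggestions: list = field(default_factory=list)   # step 4c: category does not exist yet


def import_links(incoming, existing, categories):
    """Classify incoming links against `existing` (dict: url -> Link).

    `categories` is the set of category names that already exist.
    """
    result = ImportResult()
    for link in incoming:
        current = existing.get(link.url)
        if current is None:
            # Not in the database yet: a normal import.
            existing[link.url] = link
            result.inserted.append(link)
            continue

        # Steps 1-3: URL exists, so compare Title & Description.
        if (link.title, link.description) == (current.title, current.description):
            result.skipped += 1
        else:
            result.duplicates.append(link)

        # Step 4: same URL filed under a different category.
        if link.category != current.category:
            if link.category in categories:
                result.cat_links.append((link.url, link.category))
            else:
                result.suggestions.append((link.url, link.category))
    return result
```

The point of keeping `skipped` as a bare counter is the same as above: a 5,000 record import with one new link gives you one number, not 4,999 log lines.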

This adds a slight bit of functionality that I just haven't been able to allocate the time for. The current tools allow you to do all this, just not in a simple/integrated manner.


PUGDOG® Enterprises, Inc.

The best way to contact me is to NOT use Email.
Please leave a PM here.