Gossamer Forum

Re: [webslicer] DMOZ Wizard

>> Just skipping the record all the time is not good, as there is no log file made of skipped records.
>> Just adding it will wipe out existing data.

I must be missing something here. This is not a spider seeking out new, unknown links. An import deals with two kinds of links: those already imported into your database from DMOZ, and those not yet imported.

Skipped records already exist in your database, so what is the issue? Do you want to be notified that the record you got from DMOZ is still in DMOZ? If you do a 5,000-record import and there is one new record, you'll have 4,999 "skipped" entries, which are pretty pointless.

If you are looking for changed records, that is a double-edged sword. DMOZ drops good links for all sorts of reasons, and keeps bad ones around way longer than it should. You are better off running "Validate Links" and keeping your own record of still-live sites. It will be more accurate than DMOZ, which _rarely_ seems to prune dead links, or at least not as often as it should.


I'm pretty hard on this kind of tool, and I've been beating it to death with loads of imports over the past 2-3 weeks. I really don't see what you are asking for.

If you have something specific in mind for your needs, maybe you need a custom job. But I don't see what added functionality you are trying to get.


The only thing I can see in all this is something I've wanted for pruning duplicate links.

1) The import runs and checks whether the link (URL) already exists in the database. If it does, it compares the Title & Description to see if they match.
2) If they match, the link is skipped (when working from a duplicates database, it's deleted after the cat_links addition). At most, keep a count of skipped links; there is _no_ point in doing anything else.
3) If they don't match, insert the link into a duplicates database.
4) If the link exists but the category is different: a) ignore it, b) add a cat_links record if the category exists, or c) add it to a suggestion database, either for adding a cat_links record or for creating a new category if it doesn't exist.
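
To make the four steps concrete, here is a rough sketch in Python. It is not Gossamer Links code and does not use its API; the in-memory `ImportState`, the `Link` record, and the `import_record` function are all hypothetical names made up for illustration. It assumes option b) from step 4 when the category exists and option c) when it doesn't, and it assumes a brand-new link is simply added along with its category.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    url: str
    title: str
    description: str
    category: str


@dataclass
class ImportState:
    # Hypothetical in-memory stand-ins for the real tables.
    links: dict = field(default_factory=dict)        # url -> Link already in the database
    cat_links: set = field(default_factory=set)      # (url, category) pairs
    categories: set = field(default_factory=set)     # known category names
    duplicates: list = field(default_factory=list)   # changed records, kept for review
    suggestions: list = field(default_factory=list)  # links whose category is missing
    skipped: int = 0                                  # only a count, no per-record log


def import_record(state: ImportState, rec: Link) -> str:
    existing = state.links.get(rec.url)

    # Not in the database yet: a normal import (assumption: category is created too).
    if existing is None:
        state.links[rec.url] = rec
        state.categories.add(rec.category)
        state.cat_links.add((rec.url, rec.category))
        return "added"

    # Step 4: the URL exists, but not in this category.
    if (rec.url, rec.category) not in state.cat_links:
        if rec.category in state.categories:
            state.cat_links.add((rec.url, rec.category))  # option b
        else:
            state.suggestions.append(rec)                 # option c

    # Steps 1 and 2: same Title & Description means skip, and just count it.
    if (existing.title, existing.description) == (rec.title, rec.description):
        state.skipped += 1
        return "skipped"

    # Step 3: same URL but different text goes to the duplicates list.
    state.duplicates.append(rec)
    return "duplicate"
```

Calling `import_record` once per DMOZ record leaves you with a skipped count, a duplicates list to review by hand, and a suggestions list for categories that don't exist yet.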

This adds a slight bit of functionality that I just haven't been able to find the time for. The current tools allow you to do all this, just not in a simple, integrated manner.


PUGDOG Enterprises, Inc.

The best way to contact me is to NOT use Email.
Please leave a PM here.