Gossamer Forum

2,000,000 plus duplicates

Due to my SSH connection timing out during two imports of the two largest DMOZ categories (and having to start over), I now have 2,000,000 plus duplicates.

I was under the impression that duplicates could be deleted automatically, but now I see that I have to tick a check box 2 million times.

Ah...there must be a better way.

I could do this through an SQL statement fairly easily; however, I wanted to do it through Links SQL 2, as I assume it takes care of certain things behind the scenes that I might miss.

In the event I have to do this manually, are there any gotchas I have to watch out for? In particular, which tables need to be dropped or truncated?

Keep in mind that my "Browse" selection doesn't work in the Admin panel.

Code is always appreciated!
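For reference, a minimal sketch of what the plain-SQL route mentioned above might look like, rehearsed on a throwaway in-memory SQLite table rather than the live MySQL database. The two-column table is a hypothetical stand-in: it assumes that Links has an ID and a URL column and that "duplicate" means "same URL", neither of which is confirmed in this thread. It does not touch any other tables that may reference link IDs, which is exactly the behind-the-scenes work a raw statement would miss.

```python
import sqlite3

# Hypothetical stand-in for the Links table: only ID and URL are modelled.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Links (ID INTEGER PRIMARY KEY, URL TEXT)")
conn.executemany(
    "INSERT INTO Links (URL) VALUES (?)",
    [
        ("http://a.example/",),
        ("http://b.example/",),
        ("http://a.example/",),
        ("http://a.example/",),
        ("http://b.example/",),
    ],
)
conn.commit()

# Step 1: count the rows that are surplus copies of an already-present URL.
surplus = conn.execute(
    "SELECT COUNT(*) - COUNT(DISTINCT URL) FROM Links"
).fetchone()[0]

# Step 2: keep the lowest ID for each URL and delete every other row.
# The extra derived table (keepers) is there because MySQL typically
# refuses a subquery that reads directly from the table being deleted from.
conn.execute(
    """
    DELETE FROM Links
    WHERE ID NOT IN (
        SELECT keep_id FROM (
            SELECT MIN(ID) AS keep_id FROM Links GROUP BY URL
        ) AS keepers
    )
    """
)
conn.commit()

remaining = conn.execute("SELECT ID, URL FROM Links ORDER BY ID").fetchall()
print("surplus rows:", surplus)
print("remaining:", remaining)
```

Running the count in step 1 on its own is harmless and gives a figure to compare against the 2,000,000 estimate before anything is deleted.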
Re: [takacsj] 2,000,000 plus duplicates
Sounds like you left out --rdf-update when you ran the import ;) That would have stopped the duplicates.

Unfortunately, the only way I know to get rid of them all is to totally clear out the links tables and start the DMOZ import from scratch :|


Andy (mod)
Re: [Andy] 2,000,000 plus duplicates
If the links you imported second (or first) have SOME field in common, you could run a MySQL statement like:

DELETE FROM Links WHERE Add_Date > '2003-05-16';

BE VERY CAREFUL! Often, when I run big delete statements like that, I will run a "Select *" first with the same WHERE clause to make sure there are no surprises!
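The check-first habit described above can be sketched as follows. This is only a rehearsal on an in-memory SQLite table, not the real MySQL database: the three-column table and its sample rows are made up for illustration, and only the Links and Add_Date names come from the statement above. The same WHERE clause is used for the preview and for the delete, and the delete is only committed if it removed exactly as many rows as the preview counted.

```python
import sqlite3

# Made-up miniature of the Links table; only Links/Add_Date come from the thread.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Links (ID INTEGER PRIMARY KEY, URL TEXT, Add_Date TEXT)"
)
conn.executemany(
    "INSERT INTO Links (URL, Add_Date) VALUES (?, ?)",
    [
        ("http://a.example/", "2003-05-10"),
        ("http://b.example/", "2003-05-16"),
        ("http://a.example/", "2003-05-17"),
        ("http://b.example/", "2003-05-18"),
    ],
)
conn.commit()

# One WHERE clause, reused verbatim for the preview and for the delete.
where = "Add_Date > ?"
cutoff = ("2003-05-16",)

# Preview: how many rows would go, plus a small sample to eyeball.
to_delete = conn.execute(
    f"SELECT COUNT(*) FROM Links WHERE {where}", cutoff
).fetchone()[0]
sample = conn.execute(
    f"SELECT ID, URL, Add_Date FROM Links WHERE {where} LIMIT 5", cutoff
).fetchall()
print("would delete:", to_delete, "sample:", sample)

# Delete inside a transaction; keep it only if the count matches the preview.
deleted = conn.execute(f"DELETE FROM Links WHERE {where}", cutoff).rowcount
if deleted == to_delete:
    conn.commit()
else:
    conn.rollback()

remaining = conn.execute("SELECT ID, Add_Date FROM Links ORDER BY ID").fetchall()
print("remaining:", remaining)
```

Note that the comparison is strict, so a row dated exactly on the cutoff day stays. Whether rollback is available on the real server depends on the table type in use there, so a backup before the real delete is still the safer assumption.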
