Hi there,
I've got a LinksSQL installation using DMOZ data from a fairly large category (over 300K links). Since I've run more than one import/update, I'm noticing many cases (thousands) where the same link, sometimes with a slightly different title, exists more than once in the same category. I have no problem with duplicate URLs existing in different categories, but the same link should never appear more than once in the same category.
I tried using the "check duplicates" tool in the admin panel, but it would take hours and hours to go through it all manually.
To cut to the chase: I've been trying to come up with an SQL query that finds every case where the same URL exists more than once in the same category and deletes all but one of those links (preferably keeping the one with the highest link ID). Easier said than done. Has anyone dealt with this before? Is there a way to do it with an SQL query, or do I need to write a script to do it for me?
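To make the goal concrete, here is a rough sketch of the logic, run against a throwaway in-memory SQLite database from Python. Everything about the schema is an assumption for illustration: a single hypothetical `links` table with `id`, `category_id`, `url` and `title` columns, which is not necessarily how LinksSQL stores things. The statement is also written for SQLite; MySQL may refuse a subquery on the same table it is deleting from, so it would probably need adapting (and a backup first).

```python
import sqlite3


def remove_duplicate_links(conn):
    """Delete links that share (category_id, url) with a higher-ID link.

    Assumes a hypothetical table: links(id, category_id, url, title).
    Returns the number of rows deleted.
    """
    cur = conn.execute(
        """
        DELETE FROM links
        WHERE id NOT IN (
            -- one survivor per (category, URL): the highest link ID
            SELECT MAX(id) FROM links GROUP BY category_id, url
        )
        """
    )
    conn.commit()
    return cur.rowcount


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE links ("
        "id INTEGER PRIMARY KEY, category_id INTEGER, url TEXT, title TEXT)"
    )
    conn.executemany(
        "INSERT INTO links VALUES (?, ?, ?, ?)",
        [
            (1, 10, "http://example.com/", "Example"),
            (2, 10, "http://example.com/", "Example Site"),  # same category, same URL
            (3, 20, "http://example.com/", "Example"),       # other category: allowed
            (4, 10, "http://example.org/", "Other"),
            (5, 10, "http://example.com/", "Example!"),      # highest ID of the duplicates
        ],
    )
    deleted = remove_duplicate_links(conn)
    remaining = [row[0] for row in conn.execute("SELECT id FROM links ORDER BY id")]
    return deleted, remaining


if __name__ == "__main__":
    print(demo())
```

Titles are ignored on purpose, since the duplicates sometimes differ only in title; the match is on category plus exact URL string, so URLs that differ by a trailing slash or letter case would not be caught.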
Thanks for any ideas.
Fractured Atlas :: Liberate the Artist
Services: Healthcare, Fiscal Sponsorship, Marketing, Education, The Emerging Artists Fund