Gossamer Forum

Duplicate Listing Removal from Search Results Suggestion

Hi All,

I am new to LinksSQL but I already love it. I have been programming in Perl for a while, but on this project I decided to buy instead of build, and I have to say I am not sorry.

I am allowing multiple listings of the same URL in different categories but I don't want a matching keyword search to report that URL more than once. I am familiar with modifying the directory using global subs but think I may need a plugin for this one.

My initial thought was to create a search_results plugin that would take the output and for each link do something like this with a %seen variable:

$seen{$url}++;   # $url is the URL of the current link; the count starts at 1 on first sight

After this command, if the value is >1, just move on and don't print that result. The reason for incrementing the value is so that I can adjust link_hits accordingly before printing results.
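A minimal sketch of the %seen idea in plain Perl, outside Links SQL. It assumes each search result is a hash ref with a URL key; the dedupe_links name and the sample data are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first link for each URL plus the
# number of duplicates that were skipped.
sub dedupe_links {
    my @links = @_;
    my (%seen, @unique);
    for my $link (@links) {
        # Post-increment returns the old count: 0 (false) on first sight.
        next if $seen{ $link->{URL} }++;
        push @unique, $link;
    }
    my $dupes = @links - @unique;
    return (\@unique, $dupes);
}

# Made-up sample data: the same URL listed in two categories.
my @results = (
    { ID => 1, URL => 'http://example.com/' },
    { ID => 2, URL => 'http://example.org/' },
    { ID => 3, URL => 'http://example.com/' },
);

my ($unique, $dupes) = dedupe_links(@results);
printf "%d unique, %d duplicate(s)\n", scalar @$unique, $dupes;
```

The duplicate count is what would be subtracted from the hit total before the results are printed.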

Does this sound like the best way to do this? I looked into link_results_loop in search_results.html but I don't think I can do everything I want there. Any feedback is appreciated.

Jason
Re: [shunsoft] Duplicate Listing Removal from Search Results Suggestion
Yeah... your best bet would be to add a search_results POST hook, then grab $SEARCHED and edit it to suit your needs. The main problem is going to be getting an even number of links per page: simply modifying $SEARCHED will probably give uneven numbers. Also, when going on to the next page, it would be hard to keep duplicate links from showing up again...
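To illustrate the uneven-page problem in plain Perl (no Links SQL calls; the URLs, the page size, and the dedupe_per_page name are made up): if duplicates are filtered after the results have already been split into pages, each page can shrink by a different amount, and a URL shown on one page can come back on the next because the filter only sees one page at a time.

```perl
use strict;
use warnings;

# Hypothetical helper: split a list of URLs into pages of $size, then
# de-duplicate each page on its own, as a per-page filter would.
sub dedupe_per_page {
    my ($size, @urls) = @_;
    my @pages;
    while (my @page = splice @urls, 0, $size) {
        my %seen;    # reset for every page, so it knows nothing of earlier pages
        push @pages, [ grep { !$seen{$_}++ } @page ];
    }
    return @pages;
}

# Made-up result set, three links per page before filtering.
my @pages = dedupe_per_page(3, qw(a b a  c a d));
print join(' ', @$_), "\n" for @pages;
# page 1: a b     (only two links left)
# page 2: c a d   (three links, and "a" shows up again)
```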

Anyway, just throwing a few ideas into your head :)

Andy (mod)
andy@ultranerds.co.uk
Re: [shunsoft] Duplicate Listing Removal from Search Results Suggestion
Well, I ended up getting this done with a global sub. The more I figure out LinksSQL, the more I like it. For anyone trying to make sure a given URL only shows up once in search results, here is what I did:

I added a global sub called remove_dupes:

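--------CODE---------
sub {
    my $vals = shift;    # template variables for the current page
    my (@unique, %seen);

    # Drain the results loop, keeping only the first link seen for each URL.
    while (my $linfo = shift @{$vals->{link_results_loop}}) {
        if (!defined $seen{$linfo->{'URL'}}) {
            push(@unique, $linfo);
            $seen{$linfo->{'URL'}} = 1;
        } else {
            $seen{$linfo->{'URL'}} += 1;
            $vals->{link_hits}--;    # keep the displayed hit count in step
        }
    }

    # Put the surviving links back into the same array the template loops over.
    push(@{$vals->{link_results_loop}}, @unique);
    return '';    # the tag itself prints nothing
}
--------END_CODE---------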

The $seen{$linfo->{'URL'}} += 1; line just keeps track of the number of dupes for each URL and isn't used for anything else. It can be removed.

Then, in my search_results.html, I replaced <%link_results%> with:


<%remove_dupes%>
<%body_font%>
Your search for <b><%query%></b> returned <b><%link_hits%></b> Links.</font>
<%loop link_results_loop%>
<%include link.html%>
<%endloop%>

I also removed the link count from the top of the page and moved it down here so it would show the new link count. You could just as easily put <%remove_dupes%> at the start of search_results.html, and then the count in the first line would already be updated.

This worked great for me.

Drawbacks:
1. The nice category-separated format of <%link_results%> is lost. You can probably re-create it with some fancy loop work.
2. It just uses the first occurrence of the URL, so you don't really get to pick which results you remove.
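Both drawbacks could probably be worked around in plain Perl before the links reach the template. The sketch below is only an illustration: the dedupe_preferring and group_by_category names are made up, and the Category and Hits keys are assumptions about what each link hash might carry, not something the template is known to provide.

```perl
use strict;
use warnings;

# Drawback 2: keep a preferred copy of each URL instead of the first.
# $better is a caller-supplied sub that returns true when its first
# argument should replace its second.
sub dedupe_preferring {
    my ($better, @links) = @_;
    my (%best, @order);
    for my $link (@links) {
        my $url = $link->{URL};
        if (!exists $best{$url}) {
            push @order, $url;    # remember first-seen order
            $best{$url} = $link;
        }
        elsif ($better->($link, $best{$url})) {
            $best{$url} = $link;
        }
    }
    return map { $best{$_} } @order;
}

# Drawback 1: regroup links under their category, in first-seen order.
sub group_by_category {
    my @links = @_;
    my (%group, @names);
    for my $link (@links) {
        my $cat = $link->{Category};    # hypothetical key
        push @names, $cat unless $group{$cat};
        push @{ $group{$cat} }, $link;
    }
    return map { +{ name => $_, links => $group{$_} } } @names;
}

# Made-up sample data.
my @links = (
    { URL => 'http://example.com/', Category => 'News',  Hits => 5 },
    { URL => 'http://example.org/', Category => 'News',  Hits => 1 },
    { URL => 'http://example.com/', Category => 'Blogs', Hits => 9 },
);

# Prefer the copy with more hits, then print the survivors by category.
my @unique = dedupe_preferring(sub { $_[0]{Hits} > $_[1]{Hits} }, @links);
for my $g (group_by_category(@unique)) {
    print "$g->{name}: ", join(', ', map { $_->{URL} } @{ $g->{links} }), "\n";
}
```

The grouped structure is a list of hash refs, each with a name and a links array, which is the general shape a nested template loop would need.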

All I wanted to do was make sure that a given URL only showed up once on my search results and this did the trick without a plugin. Does this look like an OK solution?

Thanks,
Jason