Gossamer Forum
Home : Products : Gossamer Links : Development, Plugins and Globals :

Re: [pugdog] Using INNER JOIN to duplicate CatLinks

Quote Reply
Re: [pugdog] Using INNER JOIN to duplicate CatLinks In reply to
Thanks pugdog,

I did get the "At category and below" working.
I ended up using IN just like you have suggested, but for categories only.
When I tried to using a list of zips as "IN" while calculating distances, it was too slow.

So what ended up working for me was eliminating some of the INNER JOINs,
and having the distance calculation do ONLY the distance calc, and return just
the ID's of the links (I was returning the entire link record).

On a business listing directory with 150,000 links, I get the following results.
Searching within 20m of 60611 (chicago) returns 8000 matches, sorted by distance
in 8-10 seconds (no mod-perl). I am fairly happy with the results.

Here's my current code that seems to work very well. I do think I am getting better at this :-)
Code:
# Start with: Distance
my $sort_order = $args->{so}; # Add ASC or DESC: Distance ASC
$sort_order .= ','; # Add comma : Distance ASC,
$sort_order .= $args->{sb}; # Add other Columns: Distance ASC, Title DESC, Add_Date

my $earth_radius = "3957"; # in Miles
if ($unitcode eq "K") {
$earth_radius = $earth_radius * 1.609344; # in Kilometers (6378)
} elsif ($unitcode eq "N") {
$earth_radius = $earth_radius * 0.86897; # in Nautical Miles (3439)
}

my $code;

if ($catid) {

# This is just the short category name for display on template results
$catname = $categories_db->select( ['Name'], { ID => $catid } )->fetchrow;

# We want all links in catid and below. Found at GT forums under <%load_lower_links($ID)%>
my $catname_full = $categories_db->select( ['Full_Name'], { ID => $catid } )->fetchrow;
my $cond = GT::SQL::Condition->new(
Full_Name => '=' => "$catname_full",
Full_Name => 'LIKE' => "$catname_full/%"
);
$cond->bool('OR');
my $sth = $categories_db->select('ID', $cond ) || die $GT::SQL::error;

# Create a list of CategoryID (s) to use as IN below
my $results = $sth->fetchall_hashref('ID');
my @cats = map { $_->{ID} } @$results;
my $cat_list = join " ", @cats;
$cat_list =~ s/ /,/g;

$code = "
SELECT ".$DB->prefix."Links.ID,
(".$earth_radius." * (2 * ASIN(SQRT(
POWER(SIN(((z.ZCS_Latitude-o.ZCS_Latitude)*0.017453293)/2),2) +
COS(z.ZCS_Latitude*0.017453293) *
COS(o.ZCS_Latitude*0.017453293) *
POWER(SIN(((z.ZCS_Longitude-o.ZCS_Longitude)*0.017453293)/2),2)
)))) AS ZCS_Distance

FROM ".$DB->prefix."ZCS_Zips z,
".$DB->prefix."ZCS_Zips o

INNER JOIN ".$DB->prefix."Links ON o.ZCS_ZipCode = ".$DB->prefix."Links.".$opts->{'links_zipcode_col'}."
INNER JOIN ".$DB->prefix."CatLinks ON ".$DB->prefix."CatLinks.LinkID = ".$DB->prefix."Links.ID

WHERE z.ZCS_ZipCode = ".$query." AND ".$DB->prefix."CatLinks.CategoryID IN (".$cat_list.")".$and."
GROUP BY ZCS_Distance SORT_ORDER, ZCS_Distance having ZCS_Distance <= ".$dist
;
}
# If we are not limiting to categories, we can drop one of the INNER JOINs
else {

$code = "
SELECT ".$DB->prefix."Links.ID,
(".$earth_radius." * (2 * ASIN(SQRT(
POWER(SIN(((z.ZCS_Latitude-o.ZCS_Latitude)*0.017453293)/2),2) +
COS(z.ZCS_Latitude*0.017453293) *
COS(o.ZCS_Latitude*0.017453293) *
POWER(SIN(((z.ZCS_Longitude-o.ZCS_Longitude)*0.017453293)/2),2)
)))) AS ZCS_Distance

FROM ".$DB->prefix."ZCS_Zips z,
".$DB->prefix."ZCS_Zips o

INNER JOIN ".$DB->prefix."Links ON o.ZCS_ZipCode = ".$DB->prefix."Links.".$opts->{'links_zipcode_col'}."

WHERE z.ZCS_ZipCode = ".$query. $and."
GROUP BY ZCS_Distance SORT_ORDER, ZCS_Distance having ZCS_Distance <= ".$dist
;
}

$code =~ s/SORT_ORDER/$sort_order/;

my $offset = ($args->{nh} - 1) * $args->{mh};
my $sth = $links_db->prepare($code) || return { error => "Unable to PREPARE query - ".$GT::SQL::error };
$sth->execute () || return { error => "Unable to EXECUTE query - ".$GT::SQL::error };

while (my $row = $sth->fetchrow_hashref)
{
if($link_count >= $offset and ( $link_count < ($offset + $args->{mh})) ) {

# Grab the entire link and zipcode records. All I got from SELECT above is link id.
my $link = $links_db->select ('*', { ID => $row->{ID} })->fetchrow_hashref;
my $zip = $zipcode_db->select ('*', { ZCS_ZipCode => $link->{$opts->{'links_zipcode_col'}} })->fetchrow_hashref;

$link = Links::SiteHTML::tags('link', $link);

# ZCS_Location tag - Just a combination of City, State, CC and Zip
$link->{ZCS_Location} = $zip->{ZCS_City} . ", " . $zip->{ZCS_State} . " (" . $zip->{ZCS_CC} . ") " . $zip->{ZCS_ZipCode};

# Grab the full country name from it's code to use as ZCS_Country tag.
# Do this INSTEAD of the 2nd or 3rd INNER JOIN in the SELECT statement above.
# This is waaay faster than a 3 table join. This method is also used by GetLocationTags.
$link->{ZCS_Country} = $DB->table('ZCS_Countries')->select('ZCS_Country', { ZCS_CC => $zip->{ZCS_CC} })->fetchrow;

if ($row->{ZCS_Distance} == 0) {
$link->{ZCS_Distance} = 0;
}
else {
$link->{ZCS_Distance} = sprintf("%.2f", $row->{ZCS_Distance}); # Round distance to 2 decimal places
}
push (@link_results_loop, $link);
}
$link_count++;
}

I was able to keep my zipcodes/lat/long and all other city/state/country info in separate tables.
When cycling through the results, I then grab the entire link record, and city/state/country country
info, but only on the results that are being displayed.

If I knew how to cache results, it could get even faster.

What do you think of my solution/code?
It even provides for additional "and" conditions by simply adding
a Links field to the ZipCode search form.

Thanks,
Chris
RGB World, Inc. - Software &amp; Web Development.
rgbworld.com

Last edited by:

rgbworld: Jul 21, 2006, 11:20 AM
Subject Author Views Date
Thread Using INNER JOIN to duplicate CatLinks rgbworld 3135 Jul 6, 2006, 11:16 AM
Thread Re: [rgbworld] Using INNER JOIN to duplicate CatLinks
pugdog 3005 Jul 20, 2006, 9:44 PM
Post Re: [pugdog] Using INNER JOIN to duplicate CatLinks
rgbworld 2971 Jul 21, 2006, 11:18 AM
Post Re: [pugdog] Using INNER JOIN to duplicate CatLinks
rgbworld 2968 Jul 21, 2006, 11:39 AM