
MOD: Search Only This Category

Search Only This Category Mod
--------------------------------------
This mod was written with the guidance, input and help of the following
people: AnthroRules, pugdog, and especially Alex404.

The mod is only useful if you have a search field/box on your category
pages and you make the necessary changes to the category template.

Note: I have modified my own files, so some of the code may look
slightly different in your installation of Links, but you should
still be able to make the changes outlined here.

Backup
--------------------
Make a backup copy of the following files:

/search.cgi
/admin/Links/Search.pm
/admin/Links/DB_Utils.pm
/admin/templates/category.html


search.cgi
--------------------
Add the new variables (used below: $catid, $sub_list and $cat_name) to the 'my' statement at the top of sub search:
Code:
sub search {
# ---------------------------------------------------
# Performs the actual search.
#
my ($in, $dynamic) = @_;
my ($mh, $bool, $nh, $ww, $order, $query, $ignored, %seen, $next, $catid, $sub_list, $cat_name,
$catdb, $cat_hits, $category_results, $cat_count, $cat_errors, $linkdb,
$link_hits, $link_results, $link_count, $link_errors, $banner_html);
my %in = %{&cgi_to_hash($in)};

After ...

# Get/Set the search options.
($in->param('mh') =~ /^(10|25|50|100)$/) ? ($mh = $1) : ($mh = 25);
($in->param('bool') =~ /^(and|or)$/i) ? ($bool = uc $1) : ($bool = 'OR');
($in->param('nh') =~ /^(\d+)$/) ? ($nh = $1) : ($nh = 1);
($in->param('substring')) ? ($ww = 1) : ($ww = 0);
($in->param('order') =~ /^(score|category)$/i) ? ($order = uc $1) : ($order = 'CATEGORY');

... add the following ...

Code:

$catid = $in->param('catid');

# Get a list of category IDs for an 'only this category' search

if ($catid eq ""){$cat_name=""} else {

# Get Category Name
$cat_name = &get_category_name ($catid);

# Get a list of category IDs where the name starts with $cat_name
$sub_list = &get_category_id_list ($cat_name);
chop($sub_list);
$in->param ( 'CategoryID' => $sub_list );
}

At the top of '# Search the category listings' add the following:

# Search the category listings
my %catfilter = ();
$catfilter{'ID'} = $sub_list;


... and change $cat_hits ... to ...

$cat_hits = $catdb->query ( { query => $query, mh => $mh, nh => $nh, ww => $ww, filter => \%catfilter } );


Search.pm
--------------------
Go to sub query { and look for # ... setup the filter if it was asked for

Change it to ...
Code:

# ... setup the filter if it was asked for
my $filterref = $self->{filter};
my $filter = undef;
my $where ;

if ( ($filterref) and (@query_results)) {
$filter = " and (";
foreach my $flt(keys %$filterref){
if ($filterref->{$flt} !~ m/,/){
$filter.= "(" . $flt . " like '%" . $filterref->{$flt} . "%') and ";
} else {
$filter.= "(" .$flt . " in (" . $filterref->{$flt} . ")) and ";
}
}
chop($filter);chop($filter);chop($filter);chop($filter);
$filter .= ")";
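
The filter-building step can be tried outside Links with a small stand-alone sketch. The helper name build_filter is hypothetical (it is not part of Links SQL), and it is a variation rather than a copy of the posted code: in the posted code a value without a comma takes the LIKE branch, so a single ID such as 5 becomes ID like '%5%', which in SQL also matches 15 or 50. The sketch below sends any purely numeric list, including a single ID, through IN, and uses join instead of four chops.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Links SQL): builds the same kind of
# " and (...)" fragment as the Search.pm change above.
sub build_filter {
    my ($filterref) = @_;
    my @clauses;
    foreach my $col (sort keys %$filterref) {
        my $val = $filterref->{$col};
        next unless defined $val and length $val;   # nothing to filter on
        if ($val =~ /^\d+(?:,\d+)*$/) {
            # One or more numeric IDs: exact match, so ID 5 never matches 15 or 50.
            push @clauses, "($col in ($val))";
        }
        else {
            # Anything else: substring match, with single quotes doubled.
            (my $safe = $val) =~ s/'/''/g;
            push @clauses, "($col like '%$safe%')";
        }
    }
    # join puts " and " only between clauses, so nothing has to be chopped off.
    return @clauses ? ' and (' . join(' and ', @clauses) . ')' : '';
}

print build_filter({ ID => '3,7,12' }), "\n";
print build_filter({ ID => '5' }), "\n";
```

An empty or undefined value produces no clause at all in this sketch, so a search without a category would add nothing to the query.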

Now go a little further down to # ... since there was a change in the number of results AND there ...
and change the following section from ...

Code:
my %needed = map { $$_[0] => 1 } @query_results;
foreach $row ( $sth->fetchrow_hashref () ) {
delete $needed { $$row { $id_col } };
}

... to ...

Code:
my %needed = map { $$_[0] => 1 } @query_results;
while ( $row = $sth->fetchrow_hashref () ) {
delete $needed { $$row { $id_col } };
}
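
Why the switch from foreach to while matters can be shown without a database. The sketch below uses a made-up closure (make_fetcher, an assumption for illustration) in place of $sth->fetchrow_hashref: foreach evaluates the call once and loops over that single result, while the while loop calls it again on every pass until it returns undef.

```perl
use strict;
use warnings;

# Stand-in for a DBI statement handle: each call returns the next row
# as a hash reference, then undef when the rows run out.
sub make_fetcher {
    my @rows = @_;
    return sub { return shift @rows };
}

my @data = ({ ID => 1 }, { ID => 2 }, { ID => 3 });

# foreach evaluates the call once, so the loop body runs for one row only.
my $fetch = make_fetcher(@data);
my $foreach_count = 0;
foreach my $row ($fetch->()) {
    $foreach_count++;
}

# while calls the fetcher again on every pass until it returns undef.
$fetch = make_fetcher(@data);
my $while_count = 0;
while (my $row = $fetch->()) {
    $while_count++;
}

print "foreach saw $foreach_count row(s), while saw $while_count row(s)\n";
```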


DB_Utils.pm
--------------------
At the top of the file a few lines after '@EXPORT = ' find this line ...

&build_category_row &build_category_row_val &build_categoryd_row

... and change it to this ...

&build_category_row &build_category_row_val &build_categoryd_row &get_category_id_list


Next we add the new routine after ...

sub get_category_list {
# --------------------------------------------------------
# Builds a <select> list of category name to category id.
#
...
$sth->finish;
$CATEGORY_LIST{$value,$fname} = $output;
}
return $output;
}


... add this new routine ...

Code:

sub get_category_id_list {
# --------------------------------------------------------
# Get a list of category IDs based on match with start of category name.
#
my $value = shift;
my ($query, $sth, $id, $name, $output);

$output = $CATEGORY_LIST{$value};

if (!$output) {
if (! $CATDB) {
$CATDB = new Links::DBSQL $LINKS{admin_root_path} . "/defs/Category.def";
}
$query = qq!
SELECT ID
FROM Category
WHERE Name LIKE '$value%'
!;
$sth = $CATDB->prepare ($query);
$sth->execute() or die "Can't Execute: $DBI::errstr";
while (($id) = $sth->fetchrow_array) {
($output .= "$id,");
}
$sth->finish;
$CATEGORY_LIST{$value} = $output;
}
return $output;
}

That's all the routines and variables we need to adjust, so now
we have to put the category ID into the category page (as the
'catid' form field) so we can tell Links which category the
search has come from.


Template category.html
--------------------
In your category template file you will have the 'FORM' details
for the search.

We now need to add the radio buttons to allow our visitor to
search through 'All categories' or 'Only this category'.

Look for the HTML that displays the SEARCH button ...


<input type=submit value="Search">


Directly after this add the following ...


<INPUT NAME="catid" TYPE="radio" VALUE="" CHECKED>All categories
<INPUT NAME="catid" TYPE="radio" VALUE="<%category_id%>">Only this category


This will put two radio buttons on your category page next to the
search box, with 'All categories' selected by default.


Re-build
--------------------
Upload the modified files and make sure all the file permissions are
correct, then re-build your directory to effect the changes.

Go to one of your category pages and make sure the search is
working correctly.

If you have any problems use your backup files to put your site back
into working order then double-check the changes you made to
trouble-shoot the problem or problems.

That's it - hope you enjoy :)

All the best
Shaun

Re: MOD: Search Only This Category In reply to
Thanks, Shaun...and good job...I will see if it works...I am still pretty green when it comes to the inner workings of the DBSQL and Search modules...but I hope it works...because I attempted to use the codes provided by Alex404 to make the boolean search work and it doesn't.

Regards,

Eliot Lee

Re: MOD: Search Only This Category In reply to
Pugdog,

If this works OK for a few people could it replace the link of Oct-99 in your FAQs?

http://postcards.com/...n/FAQ/jump.cgi?ID=22

All the best
Shaun

Re: MOD: Search Only This Category In reply to
Hi Qango,

Looks good! A couple of comments. Your get_category_id_list will not work under mod_perl, as you will obliterate the select lists hash. Also, it could be optimized by removing the ORDER BY, which isn't necessary. Try:

Code:
sub get_category_id_list {
# --------------------------------------------------------
# Get a list of category IDs based on match with start of category name.
#
my $value = shift;
my ($query, $sth, $id, $output);
if (! $CATDB) {
$CATDB = new Links::DBSQL $LINKS{admin_root_path} . "/defs/Category.def";
}
$query = qq!
SELECT ID
FROM Category
WHERE Name Like '$value%'
!;
$sth = $CATDB->prepare ($query);
$sth->execute() or die "Can't Execute: $DBI::errstr";
while (($id) = $sth->fetchrow_array) {
$output .= $id . ",";
}
chop $output;
return $output;
}
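
One detail to watch when combining this version with the search.cgi code from the first post: this routine already chops the trailing comma, and search.cgi chops $sub_list again after the call, so the last ID would lose its final character. The sketch below uses made-up IDs in place of the fetched rows to show the effect, and a join-based alternative that leaves no trailing comma to remove in the first place.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the rows fetched from the Category table.
my @ids = (4, 17, 23);

# Appending "$id," and chopping afterwards, as in the routines above.
my $chopped = '';
$chopped .= "$_," for @ids;
chop $chopped;                 # "4,17,23"

# A second chop (for example the one already in search.cgi) damages the last ID.
my $double = $chopped;
chop $double;                  # "4,17,2"

# join never leaves a trailing comma, so no chop is needed anywhere.
my $joined = join ',', @ids;   # "4,17,23"

print "$chopped\n$double\n$joined\n";
```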

Other than that, looks good! We will be including this in 1.13 (with a slightly different approach, but same results) tomorrow.

Cheers,

Alex

--
Gossamer Threads Inc.
Re: MOD: Search Only This Category In reply to
How about adding the link to the FAQ site, and letting me see if it actually works to let people add things <G>

As for the inclusion in 1.13, this is why I've left some code segments alone.... I know that once the noise level rises, and enough ideas are tossed out, a fix will be forthcoming. In the next version, the logic changes so these fixes won't carry through.

Eliot -- as for mucking around in DBSQL.pm, it's just a perl module, like any other. Each subroutine is a separate method for the DBSQL object (an interface to the DBI/DBD module). You can usually safely make changes to segments of the file and lines of code if you are changing local variables within a subroutine and not trying to change any default behaviour (input/output) of the modules (just fix, expand, or alter what's going on). No one's done anything to bust the module yet <G>

The risk of mucking around in DBSQL.pm, is like 'undocumented features' of a program or OS, only the published I/O interface will remain stable from version to version, so upgrades will most likely be incompatible with those changes. Some could be re-coded, others might not be able to be.

So, while making debugged changes to the DBSQL.pm won't do any harm, it won't be portable up through the versions. When doing your own modifications, you need to make sure that you don't do anything non mod_perlish (or don't use mod_perl) and DOCUMENT!! any changes so you can remember why any 'upgrade' version doesn't do what you expect it to <G>

One good idea is to mark any mod you make to a file with a 'code' you can grep.

' ## ' is the string I use to mark my changes to a file -- bug fixes, code tweaks, things that won't carry over between major versions, and might not carry over in minor versions - or might even be fixed in minor versions.

' #_## ' is a string I use for real logic changes, where I've changed the default behaviour and will have to make code changes in any future versions.

When I grep my DBSQL.pm I find I've made 6 changes to areas of the file, mostly benign <G>. But it beats trying to page through the file to find any changes.
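
For anyone without grep to hand, a small Perl sketch can do the same scan. The routine name find_marked_changes and the sample lines are made up for illustration; only the two marker strings come from the description above.

```perl
use strict;
use warnings;

# Returns [line_number, kind, text] for each line carrying one of the markers.
sub find_marked_changes {
    my @lines = @_;
    my @found;
    my $n = 0;
    foreach my $line (@lines) {
        $n++;
        if    ($line =~ / #_## /) { push @found, [$n, 'logic', $line] }
        elsif ($line =~ / ## /)   { push @found, [$n, 'tweak', $line] }
    }
    return @found;
}

# Made-up sample lines, not taken from the real DBSQL.pm.
my @sample = (
    'my $limit = 25;',
    'my $limit = 50; ## raised default page size',
    'return undef; #_## changed return value on error',
);

printf "%d: [%s] %s\n", @$_ for find_marked_changes(@sample);
```

To scan a real file, the lines could be read from a filehandle and passed to the same routine.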



http://www.postcards.com
FAQ: http://www.postcards.com/FAQ/LinkSQL/

Re: MOD: Search Only This Category In reply to
In Reply To:
So, while making debugged changes to the DBSQL.pm won't do any harm, it won't be portable up through the versions.

That is my biggest fear and that is why I have not dug too deep into the modules (with the exception of DB_UTILS.pl, Admin_HTML.pl, and HTML_Templates.pm, and of course, Links.pm).

In Reply To:
One good idea is any mod you make to a file use a 'code' you can grep. I use

' ## ' string to mark my changes to a file -- bug fixes, code tweaks, things that won't carry over between major versions, and might not carry over in minor versions - or might even be fixed in minor versions.

' #_## ' is a string I use for real logic changes, where I've changed the default behaviour and will have to make code changes in any future versions.

Thanks for the reminder...this is ingrained in my mod behavior...been doing it for a couple years now. :) But, I appreciate the helpful reminder as other newbie programmers may find it helpful.

Regards,

Eliot Lee

Re: MOD: Search Only This Category In reply to
Not everything is directed at you :) A lot of what I post is to remind people of things that they should do, but might not think of. Some things are second nature to seasoned programmers (usually from the school of hard knocks) <G>

http://www.postcards.com
FAQ: http://www.postcards.com/FAQ/LinkSQL/

Re: MOD: Search Only This Category In reply to
Sorry...I assumed since you included my name in your previous reply...it was directed at me.

Regards,

Eliot Lee

Re: MOD: Search Only This Category In reply to
Hi Eliot,

In Reply To:
...because I attempted to use the codes provided by Alex404 to make the boolean search work and it doesn't.

Mhmm..

I assume that http://vlib.anthrotech.com/ is your site ??

I did a search for "anthro egypt" with "or" and substring
link results: 461 (all containing either anthro or egypt)

and

I did a search for "anthro egypt" with "and" and substring
link results: 2 (all containing anthro and egypt)

???

regards, alexander

Re: MOD: Search Only This Category In reply to
I guess it depends on the keywords you type in...the results I received using the same keywords you did via the advanced search form was:

anthro egypt with and = 141
anthro egypt with or = No Search Results

with the normal search box, I get 141 search results.

However, if I search for medical anthropology, I get 35 search results no matter what form I use or boolean search option...

Again...I guess I will have to play around with it a bit or download the Links 1.3 version and see if the search works.

But I appreciate your continued support and assistance.

Regards,

Eliot Lee

Re: MOD: Search Only This Category In reply to
After following your Mod, I still can't seem to get Alternate and Regular Search to work on the Subcategory pages.
My new site is located at http://www.ohiobiz.com.

Please advise.

Thanks a lot.


Mark G.

Re: MOD: Search Only This Category In reply to
i happened to make this mod awhile back..

it is posted in this forum as a links sql 1.0 mod.. but it works with 1.1

its pretty much the same as your mod since it uses WHERE CategoryID IN (#|#|#|#|#|#|#|#)..

Jerry Su
widgetz sucks
Re: MOD: Search Only This Category In reply to
Great Mod, qango...with one exception...I've noticed that my search times have increased tenfold...meaning that without the category search, it was taking between 3 and 6 wallseconds, and now it takes between 33 and 40 wallseconds, and of course, the CPU usage has increased from 15% to 65% on average.

I did install v.1.13 that included the "search category" codes in the DBSQL.pm and Search.pm modules, and I have attempted to de-bug the script as much as possible.

I am wondering whether you or other users using this modification have experienced degradation in search times and CPU usage. Also, if Alex or you could provide suggestions for what I should be looking for in the modules or search.cgi to address this problem, I would greatly appreciate it.

Thanks in advance.

Regards,

Eliot Lee
Re: MOD: Search Only This Category In reply to
Eliot,

Is the lag apparent for EVERY search, or just the 'this category only' ones?

If it's just 'this category only' then it's one of the things that's inherent in searching a large group of sub-cats. The mod basically takes the current category and creates a list of ALL its sub-cat IDs (this is probably part of what creates the lag). This is the only real way of ensuring that the returned results originate from 'this category only'.

It takes the current category name and does a WHERE ... LIKE ... on the full name, thereby getting all the sub-cats that belong to it (the children), e.g.:

/Business --->

/Business/News
/Business/News/Financial
/Business/Consultants
/Business/Resources
... etc.
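
The matching can be sketched in plain Perl with a made-up category list (the names and IDs below are assumptions for illustration). It also shows one side effect of a bare prefix match: a sibling whose name merely starts with the same letters is picked up as well. A stricter test, roughly Name = 'X' OR Name LIKE 'X/%' in SQL terms, keeps only the category and its real children.

```perl
use strict;
use warnings;

# Made-up categories (ID => full name), for illustration only.
my %category = (
    1 => 'Business',
    2 => 'Business/News',
    3 => 'Business/News/Financial',
    4 => 'Business/Consultants',
    5 => 'Businesses_For_Sale',
    6 => 'Computers',
);

# Rough equivalent of: WHERE Name LIKE '$prefix%'
sub ids_with_prefix {
    my ($prefix) = @_;
    return sort { $a <=> $b }
           grep { index($category{$_}, $prefix) == 0 } keys %category;
}

# Stricter variant: the category itself plus names that continue with "/".
sub ids_in_subtree {
    my ($name) = @_;
    return sort { $a <=> $b }
           grep { $category{$_} eq $name or index($category{$_}, "$name/") == 0 }
           keys %category;
}

print join(',', ids_with_prefix('Business')), "\n";   # includes ID 5
print join(',', ids_in_subtree('Business')), "\n";    # leaves ID 5 out
```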

It probably also takes longer for top-level cats than it does for lower-level ones, since there'll be a lot more sub-cats under a top-level category (each level further down should lead to a slightly faster search).

I'm not really sure it can be improved upon since you need to have a complete list of the category's sub-cats (children) to compare the matching results against.

If it's doing this for every search, then there's something wrong. I did put a 'check' in the top of search.cgi to only get the sub-cat's list if it was required:

if ($catid eq ""){$cat_name=""} else {

... so it shouldn't do it EVERY time!

I know this probably doesn't help much, but at least it explains why there is a lag in the 'only this category' search, and if you do find a more efficient way of doing it, please let me know. :)


All the best
Shaun
Re: MOD: Search Only This Category In reply to
Hey qango...

Thanks for the explanation. The most degradation occurs with the CATEGORY search, however, even doing a "regular" search has increased the search time as well...

What I will play around with is creating a "SECONDARY INDEX" of the categories/subcategories: basically a secondary table called CatSubCat that contains the PARENT IDs of the TOP level Categories and the CHILD IDs of their subcategories, then indexing both of those fields and joining that table to the Categories table to get the other indexed information....
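
As a rough illustration of the idea, here is an in-memory version of such a parent-to-children lookup, using a Perl hash rather than a real CatSubCat table. The routine name build_subtree_index and the sample categories are assumptions for the sketch; a real version would write the pairs to an indexed table at re-build time.

```perl
use strict;
use warnings;

# Made-up categories (ID => full name); the real data lives in the Category table.
my %category = (
    1 => 'Business',
    2 => 'Business/News',
    3 => 'Business/News/Financial',
    4 => 'Computers',
);

# Build the lookup once (for example at re-build time):
# ancestor ID => list of IDs for that category and everything below it.
sub build_subtree_index {
    my ($cats) = @_;
    my %id_for = reverse %$cats;          # full name => ID
    my %index;
    foreach my $id (sort { $a <=> $b } keys %$cats) {
        my @parts = split m{/}, $cats->{$id};
        # Every leading part of the path is an ancestor (or the category itself).
        for my $depth (0 .. $#parts) {
            my $ancestor = join '/', @parts[0 .. $depth];
            next unless exists $id_for{$ancestor};
            push @{ $index{ $id_for{$ancestor} } }, $id;
        }
    }
    return \%index;
}

my $index = build_subtree_index(\%category);

# At search time the ID list is a single hash lookup.
print join(',', @{ $index->{1} }), "\n";
print join(',', @{ $index->{2} }), "\n";
```

The trade-off is that the lookup has to be rebuilt whenever categories are added, renamed or moved.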

Thanks for the pointer...

Regards,

Eliot Lee
Re: MOD: Search Only This Category In reply to
UPDATE: Welp, I narrowed one of the problems to the &get_category_id_list sub you are using. I replaced it with the sub that Alex wrote later in the Thread.

The CPU usage now ranges between 20.0% (normal searches) and 40.0% (category searches). This is less than before, but still about 5-6 times more than before I installed the category search codes.

I will play around with my INDEX weights and also work on creating a SECONDARY INDEX for the category and link attributes.

Regards,

Eliot Lee
Re: MOD: Search Only This Category In reply to
What about indexing all the category and sub-category IDs in a database, and then just pulling the IDs directly from a table? That way some lag will be cut by skipping the creation of the ID list.

Just a thought,

joe