Gossamer Forum

Boolean MOD

Hello!

Thanks to Alex, I have a possible MOD for DBMan 2.04 that will allow boolean AND/OR searching based on a flag bool=and or bool=or. If bool is not defined, then it performs a normal keyword search. But, I could use some help debugging it.

This code goes in db.cgi, in the sub query routine. Replace the code after:

# Now let's build up all the regexpressions we will use. This saves the program
# from having to recompile the same regular expression every time.

with:

Quote:
if ($in{'bool'}) {
    ($in{'bool'} eq 'and') ? ($bool = '&&') : ($bool = '||');

    foreach $field (@search_fields) {
        $finalreg = '';

        foreach my $word (split /\s/, $in{$db_cols[$field]}) {
            next if ($word =~ /^\s*$/);
            $tmpreg = $word;
            (!$in{'re'}) and ($tmpreg = "\Q$tmpreg\E");
            ($in{'ww'}) and ($tmpreg = "\\b$tmpreg\\b");
            (!$in{'cs'}) and ($tmpreg = "(?i)$tmpreg");
            $finalreg .= " m/$tmpreg/o $bool";
        }
        if ($in{$db_cols[$field]} eq "*") {
            $finalreg = "m/.*/";
        }
        else {
            chop $finalreg; chop $finalreg;   # strip the trailing && or ||
        }
        $regexp_func[$field] = eval "sub { $finalreg }";
    }
}
else {
    foreach $field (@search_fields) {
        my $tmpreg = "$in{$db_cols[$field]}";
        (!$in{'re'}) and ($tmpreg = "\Q$tmpreg\E");
        ($in{'ww'}) and ($tmpreg = "^$tmpreg\$");
        (!$in{'cs'}) and ($tmpreg = "(?i)$tmpreg");
        ($in{$db_cols[$field]} eq "*") and ($tmpreg = ".*"); # A "*" matches anything.

        $regexp_func[$field] = eval "sub { m/$tmpreg/o }";
        $regexp_bold[$field] = $tmpreg;
    }
}

Any help getting this working would be seriously appreciated!!
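
For comparison, here is a minimal standalone sketch of the same idea that builds the per-field test as a closure over precompiled patterns instead of eval'ing a string. The name build_matcher and its option keys are hypothetical (they only mirror the bool, re, ww and cs form flags) and are not part of DBMan.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of DBMan: turn one field's search string
# into a code ref that tests $_ against every word (and) or any word (or).
sub build_matcher {
    my ($query, %opt) = @_;
    return sub { 1 } if $query eq '*';          # a lone "*" matches anything

    my @patterns;
    foreach my $word (split ' ', $query) {
        my $pat = $opt{re} ? $word : quotemeta($word);
        $pat = "\\b$pat\\b" if $opt{ww};         # whole-word match
        push @patterns, $opt{cs} ? qr/$pat/ : qr/$pat/i;
    }
    return sub { 0 } unless @patterns;

    if (lc($opt{bool} || 'or') eq 'and') {
        return sub {
            foreach my $re (@patterns) { return 0 unless /$re/; }
            return 1;
        };
    }
    return sub {
        foreach my $re (@patterns) { return 1 if /$re/; }
        return 0;
    };
}

my $match = build_matcher('red apple', bool => 'and');
for ('Red delicious apple', 'green apple') {
    print "$_: ", ($match->() ? 'match' : 'no match'), "\n";
}
```

Like the eval'd subs above, the returned code ref works on $_, so in principle it could be stored in @regexp_func in the same way, though I have not tried that inside db.cgi.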



----------------------------------------
Lee Leisure
HP Customer Relations Manager
----------------------------------------
* NOTE: This message is intended
for personal purposes only, and does
not imply the position or opinion of
the Hewlett Packard Company.









[This message has been edited by leisurelee (edited April 07, 2000).]




Re: Boolean MOD
Lee, I wish I could help. The code you posted is the only part of DBMan I don't understand as yet. (Which is why I don't call myself a Perl programmer.) :)

I tried doing something that would achieve the same purpose, but it didn't really work either.

Let's hope some *real* Perl programmer can work this out.


------------------
JPD





Re: Boolean MOD
[very large message removed. contact me if you need the "alternate boolean script" code!]

[This message has been edited by leisurelee (edited April 25, 2000).]
Re: Boolean MOD
Haven't played with your code yet, but I'd think you'd have to also edit this part...
Code:
# Normal searches.
$key_match = 0;
foreach $field (@search_fields) {
$_ = $values[$field]; # Reg function works on $_.
$in{'ma'} ?
($key_match = ($key_match or &{$regexp_func[$field]})) :
(&{$regexp_func[$field]} or next LINE);
}
Seems that $key_match is going to be set on the *first* successful match, and even if the second (or a subsequent) match fails, the record will still pass because $key_match is not null. I would approach this by setting a temporary $key_match for bool=and, not unlike what you have done with $tmpreg vs. $finalreg... I may have a go at this later this week if you don't get a solution sooner.
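
To make the two modes of that loop concrete: with ma set, one matching field is enough; without it, a single failing field rejects the record. Below is a rough standalone sketch of that behaviour. The sub record_matches and its arguments are hypothetical stand-ins for the loop in sub query, not DBMan code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the per-field loop in sub query.
# $values: array ref of a record's field values.
# $tests:  hash ref of field index => code ref that tests $_.
# $match_any: true behaves like ma set, false like the default.
sub record_matches {
    my ($values, $tests, $match_any) = @_;
    my $key_match = 0;
    foreach my $field (sort { $a <=> $b } keys %$tests) {
        local $_ = $values->[$field];            # the test subs work on $_
        my $hit = $tests->{$field}->() ? 1 : 0;
        if ($match_any) {
            $key_match ||= $hit;                 # one hit is enough
        }
        else {
            return 0 unless $hit;                # every searched field must hit
        }
    }
    return $match_any ? $key_match : 1;
}

my @values = ('Perl Cookbook', 'OReilly');
my %tests  = (0 => sub { /perl/i }, 1 => sub { /wiley/i });
print 'any: ', (record_matches(\@values, \%tests, 1) ? 'match' : 'no match'), "\n";
print 'all: ', (record_matches(\@values, \%tests, 0) ? 'match' : 'no match'), "\n";
```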

------------------
The Immutable Order of Modding
-=-=-=-=-=-=-=-
1. Read the FAQ, 2. Search the board, 2a. Search the board again, 3. ask the question, 4. back-up, 5. experiment, 6. rephrase question (or better yet, post solution to original question)

Re: Boolean MOD
oldmoney,

It's been a while since I played with this, so I thought I'd bring this guy back out and see if I can't get it to work.

I think you're right, in that $key_match would in fact be set on the first match and would not care if any other words matched or not. Clearly it needs some comparator to make sure we are matching all the words in the expression.

However, if that bit of code you noted is edited, wouldn't it mess up any other searches (OR, keyword, reg exp, field search, etc)?

--Lee
Re: Boolean MOD
Since you set the variable $bool earlier, quick and dirty solution would be...

if ($bool eq "&&") {
some modified version of the $key_match code
}else {
$key_match code as is...
}



Re: Boolean MOD
Hmmm... actually, I kind of like the bool AND behavior you describe. It makes more sense to me (finding values across fields for an AND search seems counter-intuitive), but that's just me. Anyway, I really haven't done much but lob ideas from afar and let you do all the *hard* work... so here's another one...

For the bool AND, is it possible to take the entire record in as a string and search the whole string (as opposed to by each field)? If you ask me how, I'll really have to roll up my sleeves and catch up with what you've already done... ;)


Re: Boolean MOD
oldmoney,

Thanks so much for all your advice. It's really helped me think things through.

[snip]

Can you think of why the and search is so restrictive, and how that can be fixed so that as long as all the words are anywhere in the record, it is considered a match?

Thanks again!!

--Lee




[This message has been edited by leisurelee (edited April 25, 2000).]
Re: Boolean MOD
Yes well, if you hadn't lobbed ideas out, I would have had a much more difficult time trying to keep my line of thought going in the right direction. <G>

And I certainly don't mind doing the "hard" work, but if you wouldn't mind rolling up your sleeves.... :)

[snip]

I think your idea is certainly a good one. What if instead of checking each field for a match, it clumped all the fields into a single variable, and searched through that instead? So any words associated at all with this record would be searched. BUT, how would that be accomplished??

Thanks a bunch!

--Lee


[This message has been edited by leisurelee (edited April 25, 2000).]
Re: Boolean MOD
Okay. I've conceived a way to plop all fields of a record into a single variable:

Code:
if($in{'bool'}) {
foreach $field (@search_fields) {
$demo_field .= $values[$field];
}
if ($demo_field =~ m/$finalreg/i) {$key_match = 1;}
}

However, I'm totally on the wrong track with the comparator. It comes back no match found.

I could easily make it an "any" or "or" search by using if($demo_field =~ m/word|word|word/i) {whatever} (in theory, not certain about that code), but as it is now, it's literally performing if($demo_field =~ m/word && word && word/i) {$key_match = 1;}. In the reg. exp. comparator, I don't think that'll work. Perhaps we could split the $finalreg? But then how would the actual comparison be made?
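
Inside a pattern, && is just two literal ampersands, so m/word && word/ looks for that exact text. One way to ask for "all of these words, in any order" in a single pattern is a chain of lookahead assertions. This is only a sketch; all_words_regex is a hypothetical helper name and nothing here comes from db.cgi.

```perl
use strict;
use warnings;

# Hypothetical helper: build one regex that succeeds only if every word
# occurs somewhere in the string, in any order.
sub all_words_regex {
    my @words = @_;
    # Each (?=.*word) looks ahead from the start without consuming text.
    my $lookaheads = join '', map { '(?=.*' . quotemeta($_) . ')' } @words;
    return qr/^$lookaheads/is;   # i: ignore case, s: let . cross newlines
}

my $re = all_words_regex('widget', 'blue');
foreach my $text ('Large BLUE widget, in stock', 'Large red widget') {
    print "$text: ", ($text =~ $re ? 'match' : 'no match'), "\n";
}
```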

My head hurts. LOL. I swore when I started this it would be easy. ;)

Any ideas? Thanks for your help!!!

--Lee
Re: Boolean MOD
The way you've started makes sense to me, though it's a little cumbersome (i.e. a real Perl wiz would laugh at the solution I'm about to provide).
Code:
if($in{'bool'}) {
foreach $field (@search_fields) {
$demo_field .= $values[$field];
}
@the_terms = split(/&&/,$finalreg); #you may need to back escape the &'s
foreach $the_term (@the_terms) {
next if ($key_match = 0);
($demo_field =~ m/$the_term/i) ?
$key_match = 1 :
$key_match = 0;
}
}

After each search term is evaluated in the regex match against $demo_field, you should end up with $key_match = 0 or $key_match = 1. Once $key_match is set to zero on a failed match, the rest of the search terms in the foreach should be skipped by the next if, but I haven't checked this.
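
Two things to watch in the loop as posted. In next if ($key_match = 0) the single = is an assignment, so the condition is always false and no term is ever skipped; a comparison needs ==. Also, ?: binds tighter than =, so cond ? $key_match = 1 : $key_match = 0 without parentheses around each assignment ends up assigning 0 in both cases. A minimal sketch of the intended "stop at the first miss" logic follows. It works from the plain search words rather than from $finalreg, and matches_all_terms is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if every term occurs in the combined text.
sub matches_all_terms {
    my ($text, @terms) = @_;
    return 0 unless @terms;                       # nothing to search for
    foreach my $term (@terms) {
        return 0 unless $text =~ /\Q$term\E/i;    # the first miss decides
    }
    return 1;
}

my $demo_field = join ' ', 'Blue widget', 'Acme Corp', 'in stock';
print matches_all_terms($demo_field, 'acme', 'widget') ? "match\n" : "no match\n";
print matches_all_terms($demo_field, 'acme', 'gadget') ? "match\n" : "no match\n";
```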

P.S. What is the significance of Q4? :)


[This message has been edited by oldmoney (edited April 10, 2000).]
Re: Boolean MOD
<grin> Q and 4 were my sponsors. :)

Your code looks sound; I don't care if it looks funny. <G> Let me give it a try.......

Oops! It matched every record. :) Back to the drawing board... I have to get to a staff meeting right now, but I'll tear it apart in a bit...

...OK back from staff. One thought: When a match is made normally, is the value of $key_match changed to 1? I've never been able to get the value of $key_match to print out for me. But I have an idea....

I put $key_match in sub html_record, and then I did a regular search, and it appears that setting $key_match to 1 isn't what should be happening.

Instead, when I performed a search, I pulled up record #39, and key_match was blank. All matches show key_match as blank. Not as 0, not as 1... This is odd. I am now lost as to how $key_match is actually set. Any thoughts?

--Lee

[This message has been edited by leisurelee (edited April 11, 2000).]
Re: Boolean MOD
[snip]

[This message has been edited by leisurelee (edited April 25, 2000).]
Re: Boolean MOD
Lee:

I'm following the dev of this MOD with great interest....

Currently, the limitation is that the match only occurs within each field. Now I was thinking that a possible workaround would be to generate an additional field with all the various $rec in it, an all-in-one field if you will. The MOD will then hit regardless of what field the keywords are in, since it will hit the catch-all field.

I'm trying to figure out a way to hard code db.cgi to generate the catch-all field on the fly.

This should work well for a small db; in a big db, you are doubling the db size. Whether that is a problem, I don't know.

The Library of Congress does something similar, with a separate field of keywords to search against. So this is not a new idea....

Anyways, thanks for the MOD once again, great work!

Dave
Re: Boolean MOD
Thanks. :) Good idea, and it runs along the lines oldmoney and I were on.

But you gave me a new idea. Suppose we create a new, blank database field, say "boolsearch", and don't put anything into it. Then, when sub query is processed, we run a quick check:

Code:
if($in{'bool'}) {
foreach $line (@lines) {
$rec{'boolsearch'} .= $line;
}
}

This would take some serious work, as you'd have to pull the database data for each record... You wouldn't increase the database size, but you would theoretically double the length of the search time.

I don't think duplicating all the information into a separate field would work well; it would indeed double the size of the database, and that would be overkill. :) But if we can give a value to a record before sending the search pattern, it might just work.
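
As a rough illustration of building that catch-all value at search time instead of storing it, the sketch below joins the fields of one record line into a single string and tests it. It assumes one record per line with '|' as the field delimiter and ignores escaped delimiters inside values; catch_all_text and record_has_all_words are hypothetical names, not DBMan routines.

```perl
use strict;
use warnings;

# Assumption: one record per line, fields separated by $delim.
# Escaped delimiters inside field values are not handled here.
sub catch_all_text {
    my ($line, $delim) = @_;
    chomp $line;
    my @fields = split /\Q$delim\E/, $line, -1;
    return join ' ', @fields;    # the space keeps adjacent fields' words apart
}

# True if every word appears somewhere in the record, in any field.
sub record_has_all_words {
    my ($line, $delim, @words) = @_;
    my $text    = catch_all_text($line, $delim);
    my $missing = grep { $text !~ /\Q$_\E/i } @words;   # count of absent words
    return $missing == 0;
}

my $line = "42|Blue widget|Acme Corp|in stock\n";
print record_has_all_words($line, '|', 'acme', 'widget') ? "match\n" : "no match\n";
```

Nothing is written back to the database here, so the file size stays the same; the cost is the extra join per record on every boolean search.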

Alas, my talents aren't sufficient to work this out myself... or at least, not very quickly. :) Any assistance would be appreciated.

--Lee
Re: Boolean MOD
I have implemented this on one database, and I am impressed with the functionality. I have previously emailed the author with some observations and what I hoped were perceived as constructive ideas.

One other thing I noticed is that when using the Phrase option, results are bolded (if this option is set in .cfg), but when using AND or OR, results are not bolded. I know this is being really, really nitpicky, but I want you to know that I did try to get this to work, have not been successful, but will keep trying.

In the meantime, nice work!

------------------
Alan Pollenz
Re: Boolean MOD
alpollenz,

Yes, I concede that the results are not bolded. In the boolean routine, you'll note $regexp_func[$field] = eval "sub { $finalreg }"; while if you do a plain keyword search, then you also define $regexp_bold[$field] = $tmpreg; which, I believe, defines what text is bolded. I have zero clue if this will work at all with boolean, as it is likely the words won't be right next to each other. You can give it a try, by adding $regexp_bold[$field] = $tmpreg; at the bottom of the boolean routine, but no guarantees. :)

As to your email, I emailed ya back. Keep your eyes on this thread. Any changes or updates will undoubtedly be posted here. And if you have any ideas at all, even if it's just concept ideas on how to do something, by all means POST! :) This is some pretty hairy stuff!

Thanks for your message!

--Lee
Re: Boolean MOD
Am I the only person that is able to work with this code? I feel like I'm wasting my time asking for help when I rarely or never get replies. sub query is a major part of this script; I find it hard to believe no one knows how it works.

I am at the limit of my knowledge. I can't figure out just how this routine works, so I cannot formulate the code necessary to search all fields of a record as if they were all one record. I would appreciate anyone's help on this. Without assistance, there is nothing else I can do with this.


--Lee


Re: Boolean MOD
Lee, I have never been able to wade through much of sub query. (This is why I don't consider myself a Perl programmer. I don't understand what's going on.) I can't follow what you've written so far, so I don't think I can help you go any further.

I'm sorry.


------------------
JPD






Re: Boolean MOD
I have cobbled together a way to get the results bolded, although it is far from the optimum way to accomplish this.

First, in sub query, I replaced:

Code:
if($in{'bool'}) {
($in{'bool'} eq 'and') ? ($bool = '&&') : ($bool = '||');

foreach $field (@search_fields) {
$finalreg = '';

foreach my $word (split /\s/, $in{$db_cols[$field]}) {
next if ($word =~ /^\s*$/);
$tmpreg = $word;
(!$in{'re'}) and ($tmpreg = "\Q$tmpreg\E");
($in{'ww'}) and ($tmpreg = "\\b$tmpreg\\b");
(!$in{'cs'}) and ($tmpreg = "(?i)$tmpreg");
$finalreg .= " m/$tmpreg/o $bool";
}
if ($in{$db_cols[$field]} eq "*") {
$finalreg = "m/.*/";
}
else {
chop $finalreg; chop $finalreg;
}
$regexp_func[$field] = eval "sub { $finalreg }";
}
}

with:

Code:
if($in{'bool'}) {
($in{'bool'} eq 'and') ? ($bool = '&&') : ($bool = '||');

$k = 0;
foreach $field (@search_fields) {
$finalreg = '';

foreach my $word (split /\s/, $in{$db_cols[$field]}) {
next if ($word =~ /^\s*$/);
$tmpreg = $word;
(!$in{'re'}) and ($tmpreg = "\Q$tmpreg\E");
($in{'ww'}) and ($tmpreg = "\\b$tmpreg\\b");
(!$in{'cs'}) and ($tmpreg = "(?i)$tmpreg");
$finalreg .= " m/$tmpreg/o $bool";
$k++;
$regexp_bold[$k] = $tmpreg;
}
if ($in{$db_cols[$field]} eq "*") {
$finalreg = "m/.*/";
}
else {
chop $finalreg; chop $finalreg;
}
$regexp_func[$field] = eval "sub { $finalreg }";
}
}

Note the references to $k.

Further down, I replaced:

Code:
# Bold the results
if ($db_bold and $in{'view_records'}) {
for $i (0 .. (($#hits+1) / ($#db_cols+1)) - 1) {
$offset = $i * ($#db_cols+1);
foreach $field (@search_fields) {

$hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[$field]),defined($1) ? $1 : "<B>$2</B>",ge;
# $hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[$in{'sb'}]),defined($1) ? $1 : "<B>$2</B>",ge;
}
}
}
return ("ok", @hits);
}

with:

Code:
# Bold the results
if ($db_bold and $in{'view_records'}) {
for $i (0 .. (($#hits+1) / ($#db_cols+1)) - 1) {
$offset = $i * ($#db_cols+1);
foreach $field (@search_fields) {

if($in{'bool'}) {
$hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[1]),defined($1) ? $1 : "<B>$2</B>",ge;
$hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[2]),defined($1) ? $1 : "<B>$2</B>",ge;
$hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[3]),defined($1) ? $1 : "<B>$2</B>",ge;
$hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[4]),defined($1) ? $1 : "<B>$2</B>",ge;
$hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[5]),defined($1) ? $1 : "<B>$2</B>",ge;
$hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[6]),defined($1) ? $1 : "<B>$2</B>",ge;
$hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[7]),defined($1) ? $1 : "<B>$2</B>",ge;
$hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[8]),defined($1) ? $1 : "<B>$2</B>",ge;
$hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[9]),defined($1) ? $1 : "<B>$2</B>",ge;
$hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[10]),defined($1) ? $1 : "<B>$2</B>",ge;
}
else {
$hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[$field]),defined($1) ? $1 : "<B>$2</B>",ge;
}
}
}
}

return ("ok", @hits);
}

The part where it is really kludgy is where it has the 10 references to:

Code:
$hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[1]),defined($1) ? $1 : "<B>$2</B>",ge;

This will accommodate bolding 10 search terms. Of course, you could add more lines here, but the optimum way to do this would be to have a counter that increments, remove 9 of the 10 lines similar to the one above, and change the line above to something like:

Code:
$hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[$k]),defined($1) ? $1 : "<B>$2</B>",ge;

Just my $.02 worth.

------------------
Alan Pollenz
Re: Boolean MOD
Success with <b>BOLD</b> (I think).

In the example I posted just above, replace:

Code:
# Bold the results
if ($db_bold and $in{'view_records'}) {
    for $i (0 .. (($#hits+1) / ($#db_cols+1)) - 1) {
        $offset = $i * ($#db_cols+1);
        foreach $field (@search_fields) {
            if($in{'bool'}) {
                $hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[1]),defined($1) ? $1 : "<B>$2</B>",ge;
                $hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[2]),defined($1) ? $1 : "<B>$2</B>",ge;
                $hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[3]),defined($1) ? $1 : "<B>$2</B>",ge;
                $hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[4]),defined($1) ? $1 : "<B>$2</B>",ge;
                $hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[5]),defined($1) ? $1 : "<B>$2</B>",ge;
                $hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[6]),defined($1) ? $1 : "<B>$2</B>",ge;
                $hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[7]),defined($1) ? $1 : "<B>$2</B>",ge;
                $hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[8]),defined($1) ? $1 : "<B>$2</B>",ge;
                $hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[9]),defined($1) ? $1 : "<B>$2</B>",ge;
                $hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[10]),defined($1) ? $1 : "<B>$2</B>",ge;
            }
            else {
                $hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[$field]),defined($1) ? $1 : "<B>$2</B>",ge;
            }
        }
    }
}
return ("ok", @hits);
}

With:

Code:
# Bold the results
if ($db_bold and $in{'view_records'}) {
    for $i (0 .. (($#hits+1) / ($#db_cols+1)) - 1) {
        $offset = $i * ($#db_cols+1);
        foreach $field (@search_fields) {

            if($in{'bool'}) {
                # $k is the number of search terms counted in the boolean routine
                for $m (1 .. $k) {
                    $hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[$m]),defined($1) ? $1 : "<B>$2</B>",ge;
                }
            }
            else {
                $hits[$field + $offset] =~ s,(<[^>]+> )|($regexp_bold[$field]),defined($1) ? $1 : "<B>$2</B>",ge;
            }
        }
    }
}

return ("ok", @hits);
}

I think this will work.
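
An alternative to looping over the terms, offered only as a sketch, is to join them into one alternation and bold everything in a single substitution pass, still skipping over existing HTML tags. The helper name bold_terms is hypothetical, the terms are treated as literal text rather than regular expressions, and this has not been tried inside db.cgi.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap every occurrence of any search term in <B> tags,
# leaving existing HTML tags untouched.
sub bold_terms {
    my ($html, @terms) = @_;
    return $html unless @terms;
    # Longest first, so "widgets" wins over "widget" when both are terms.
    my $alt = join '|',
              map  { quotemeta($_) }
              sort { length($b) <=> length($a) } @terms;
    # Group 1 matches a whole tag and is put back unchanged;
    # group 2 matches a search term and gets wrapped.
    $html =~ s{(<[^>]+>)|($alt)}{ defined($1) ? $1 : "<B>$2</B>" }gei;
    return $html;
}

print bold_terms('<a href="widget.html">Blue widget</a>', 'blue', 'widget'), "\n";
```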

------------------
Alan Pollenz

[This message has been edited by alpollenz (edited April 26, 2000).]
Re: Boolean MOD
Brilliant. Very good work. :) I will add that to the boolean MOD file right now. Thank you!

And it's OK, JPD. I understand this isn't easy to do (well, duh!) :) It just gets frustrating when much of what I ask goes unanswered. This would be such a great tool for everyone if we could get it all put together; it's just not an easy thing to put together!

--Lee
Re: Boolean MOD
I've had some comments that this thread isn't so easy to follow. I have deleted several of my previous messages (they were lengthy, and didn't go anywhere...) and it isn't clear where the "working" code is located.

So I thought I'd put a message in here, clearing this up a bit. :)

So far, the first message has been updated to contain the most accurate code. However, that's not usual around these parts. <G> So, I've made a code web site instead, that will give full instructions on how to get the current working boolean MOD code to work.

http://leisurelee.net/boolean.html

The first record of this thread won't be updated further. Instead, I'll keep the webpage accurate, like it were any other MOD. ;)

I hope this helps clear a few things up!

--Lee
Re: Boolean MOD
Al,

I've updated the boolean code from your email. I'm glad you caught it; I haven't had a chance to try it in action yet. I'm shortly going to be working on an issue with one of my earlier MODs, which makes it impossible to use bolding. I'll give it a try after I get that solved. :)

--Lee
Re: Boolean MOD
Ok, all. Somewhat of a Novice here so I'll ask a quick question for you all. If I wanted to implement the Boolean MOD in my html.pl file, would the MOD apply the Boolean conditions to every field in the sub html_record_form? Or would it only apply it to one field named "Keyword" or... what? :) I really want to implement this into my regular search function.

My hope is to have a few checkbox fields on the general search page and have the bool modifier entered through a hidden field (or) so that they can check off a few options in a few fields and have the database return all the records that match any of the fields they've checked. Let me know and thousands of thanks in advance!

(By the way, this is a GREAT MOD... Been using DBMan for about a year now and having no way to do Boolean searches is undoubtedly the place I've had the most trouble... So, way to go!)

--Wretchedhive