Gossamer Forum

Find most common words in database for stop words?

I have searched the forums for this but haven't found anything like it, unless I overlooked it. Has anyone built something that can run as part of the admin process to report the top 100 most common words in the database?

The reason I ask is that I'm putting together a sports directory, and if someone searched on, say, "football" or "nfl", they would get such a huge list of categories and links that it's probably unusable. What I was hoping to do was scan the database, say monthly, for the 100 most common words (in title, description, etc.) and then selectively, and manually, add some of them to my stop words. I'd probably add some extra code to the <%if ignored%> tag explaining what I was doing.
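To make the word count idea concrete, here is a minimal sketch in Python. It is only an illustration, not anything built into Gossamer Links: the function name `top_words`, the field names `title` and `description`, and the sample records are all assumptions standing in for rows exported from the links database.

```python
import re
from collections import Counter

def top_words(records, fields=("title", "description"), n=100, min_length=2):
    """Return the n most common words across the given fields.

    `records` is assumed to be a list of dicts, one per link
    (hypothetical layout, not the real database schema).
    """
    counts = Counter()
    for record in records:
        for field in fields:
            text = record.get(field) or ""
            # Lowercase and split on anything that is not a letter or digit.
            words = re.findall(r"[a-z0-9]+", text.lower())
            counts.update(w for w in words if len(w) >= min_length)
    return counts.most_common(n)

# Made-up sample data for illustration only.
sample_links = [
    {"title": "NFL News Daily", "description": "Football news and NFL scores"},
    {"title": "Curling News", "description": "News about curling"},
]

for word, count in top_words(sample_links, n=5):
    print(word, count)
```

The output is a candidate list only. Picking which of those words actually become stop words would still be the manual step described above.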

To make it more complicated, it would be great if the search could:
- ignore the stop word if it's searched by itself
- if the stop word appears with other words in the search, put a + sign in front of each word to force an AND search, because that's probably what the visitor wanted anyway. For example, if they searched on "nfl news", I doubt they would want sites about "curling news" just because "news" was in the search.

It would also be cool if the search flagged that a stop word was used, so I could output an explanation of what I did to the search and not totally confuse the visitor.
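The two rules above could be sketched roughly like this in Python. This is a hypothetical helper, not part of the existing search code: the name `rewrite_query`, the returned keys, and the reading of "by itself" as "the whole search is a single stop word" are all assumptions.

```python
def rewrite_query(query, stop_words):
    """Apply the stop word rules to a raw search string.

    Returns a dict with the rewritten query, the stop words that were
    found, and whether the whole search was ignored, so a template
    could explain to the visitor what happened.
    """
    # Normalise: lowercase, drop any + the visitor already typed.
    terms = [t.lstrip("+") for t in query.lower().split()]
    terms = [t for t in terms if t]
    used = [t for t in terms if t in stop_words]

    if not used:
        # No stop words involved: leave the search alone.
        return {"query": query.strip(), "stop_words_used": [], "ignored": False}

    if len(terms) == 1:
        # A stop word searched by itself is ignored.
        return {"query": "", "stop_words_used": used, "ignored": True}

    # Stop word mixed with other words: force AND on every term.
    forced = " ".join("+" + t for t in terms)
    return {"query": forced, "stop_words_used": used, "ignored": False}

stops = {"nfl", "news", "football"}
print(rewrite_query("nfl news", stops))
print(rewrite_query("football", stops))
```

Returning the list of matched stop words alongside the rewritten query is what would let the results page say which words triggered the change.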

Jerry



Subject                                                  Author   Views   Date
Find most common words in database for stop words?       JerryP   1961    Jun 12, 2000, 2:38 PM
Re: Find most common words in database for stop words?   pugdog   1873    Jun 12, 2000, 6:46 PM
Re: Find most common words in database for stop words?   JerryP   1861    Jun 12, 2000, 10:22 PM