Gossamer Forum

I need a customized SUPER fast search function...

I would like to have someone create enhanced search function capabilities for Links SQL.

1) Probably written in C++.

2) Probably would index the Links SQL database into its own database. It would utilize whatever indexing schemes are necessary to achieve virtually instantaneous display of results. (Hardware is not an issue. Will go with any hardware recommendation...within reason... ;-) )

3) Search results would be returned in less than 1 second, preferably around .1 seconds or less, even for thousands of results returned for the keyword from a database of 3-5 million URLs.

4) Advanced search functions would include, but not be limited to: phrase searching, title searching, description searching, limit to URLs, etc. Basically all the advanced search functions that major search engines would have, i.e. title: EnterTitleHere, author: EnterAuthorHere, SearchPhrase site:www.AnySite.com etc.
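
To make #4 a bit more concrete, here is a rough Python sketch of how a query string with field prefixes and quoted phrases might be split apart before it reaches the index. This is not Links SQL code. The field set (title, author, site) comes from the examples above, the names parse_query, FIELDS and TOKEN are made up, and it assumes no space after the colon (title:foo, not title: foo).

```python
import re

# Field prefixes taken from the examples in the post; treating exactly
# these three as valid is an assumption for this sketch.
FIELDS = {"title", "author", "site"}

# A token is field:"quoted value", field:value, "quoted phrase", or a bare word.
TOKEN = re.compile(r'(?:(\w+):)?(?:"([^"]*)"|(\S+))')

def parse_query(query):
    """Split a query into per-field filters, exact phrases, and plain terms."""
    parsed = {"fields": {}, "phrases": [], "terms": []}
    for match in TOKEN.finditer(query):
        field, quoted, bare = match.groups()
        value = quoted if quoted is not None else bare
        if field and field.lower() in FIELDS:
            parsed["fields"].setdefault(field.lower(), []).append(value)
        elif field:
            # Unknown prefix (for example "http:"): keep the raw token as a term.
            parsed["terms"].append(match.group(0))
        elif quoted is not None:
            parsed["phrases"].append(quoted)
        else:
            parsed["terms"].append(bare)
    return parsed

example = parse_query('title:"fast search" cheap hosting site:www.AnySite.com "free internet"')
```

Each of the three buckets would then be handled differently by the search back end: field filters restrict which column is matched, phrases need word positions, and plain terms are ordinary keyword lookups.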

Items #1 - #3 are the most important for me. I'm flexible on #4, but there would have to be some changes to what is currently offered off the shelf with Links SQL.

So, any takers? I have the funds and I am ready to send you payment via PayPal.

Re: I need a customized SUPER fast search function...
Are you using mod_perl or SpeedyCGI? If not, you might want to pursue using either of these Perl packages to enhance your search speed...

C++ does provide a lot of flexibility, but using it would mean a total re-write of search.cgi and would not be portable to future versions...

Try using either mod_perl or SpeedyCGI.

Want more info?

Search for mod_perl.

Regards,

Eliot Lee

Re: I need a customized SUPER fast search function...
Thanks! I may need to explore that as a first step.

I'm still interested in pursuing the above, however.

Also, to clarify, the above search program does not really need to be a module of Links SQL. It could be its own program, independent of Links SQL. It would simply index the Links SQL database and return results in fractions of a second.

I know this is possible as I used a program to index 1.5 million URLs. On a PII 233, 128 meg RAM, results for "free" or "internet" returned several thousand listings (off the DMOZ) in .05 seconds.

I would like someone to develop a search program that would search the Links SQL database and return results (formatted per my instructions) with the speed of the above mentioned program. The above mentioned program doesn't index databases, hence I'm looking to someone here to create a program that will.
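
As a rough illustration of the "index the database into its own structure" idea, here is a minimal inverted-index sketch in Python. The rows, the (id, title, description) layout and the function names are made up for the example and are not the Links SQL schema. A real engine at 3-5 million URLs would presumably keep the postings on disk in a compact form rather than in Python sets.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9]+")

def build_index(rows):
    """Map each word to the set of record IDs whose title or description contains it."""
    index = defaultdict(set)
    for link_id, title, description in rows:
        for word in WORD.findall(f"{title} {description}".lower()):
            index[word].add(link_id)
    return index

def search(index, query):
    """Return the sorted IDs of records that contain every word in the query."""
    words = WORD.findall(query.lower())
    if not words:
        return []
    # Intersect the smallest posting sets first to keep intermediate results small.
    postings = sorted((index.get(w, set()) for w in words), key=len)
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting
    return sorted(result)

# Hypothetical rows standing in for what would be read out of the links table.
rows = [
    (1, "Free Internet Access", "List of free dial-up providers"),
    (2, "Internet Search Tips", "How to search the web faster"),
    (3, "Free Clip Art", "Images free for personal use"),
]
index = build_index(rows)
```

The speed comes from doing the expensive work (tokenizing every record) once at index time, so a query is only a few dictionary lookups and set intersections.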

Re: I need a customized SUPER fast search function...
In Reply To:
....results for "free" or "internet" returned several thousand listings (off the DMOZ) in .05 seconds.
Yeah, but I bet they aren't using a single server with 128MB RAM ;-)

Installs:http://wiredon.net/gt
FAQ:http://www.perlmad.com

Re: I need a customized SUPER fast search function...
I wasn't clear enough. Those results were returned on my box, in my home, not their server. The MySQL database and their program were on my box.

In all fairness, some results take .9 seconds to display, but for the most part everything is displayed in less than .1 seconds, i.e. .09 and under.

Re: I need a customized SUPER fast search function...
What is the other script you are using?

Installs:http://wiredon.net/gt
FAQ:http://www.perlmad.com

Re: I need a customized SUPER fast search function...
Go to aspseek.org. Download the RPM for aspseek.

Grab the old PC in your closet that you no longer use. Install RHT and the above RPM. You'll also need MySQL.

Index 1.5 million URLs from the DMOZ. This will only take a weekend, maybe longer depending on your number of threads and bandwidth/CPU/RAM. I believe I did it in less than 4 days, i.e. a weekend Fri - Mon AM.

After it is done indexing, perform a search on your data. I would say that 80 - 90 % of the time searches are returned in .09 seconds or less.

Finally, as this is getting a little more detailed re: their program, I am now saying 80 - 90 % of the time as I do recall some searches that did take 2 - 3 seconds. I'll leave those in the 10 - 20 % of the time column.

Bottom line...you know the famous engine that starts with a "G"? This is the closest that a normal person will ever get to achieving that level of performance on a normal person's budget. Relevance etc. is another issue (from what I can see relevance is good, but easily manipulated by spammers). We are just talking performance here, i.e. displaying results in hundredths of a second.

In any event, you will run into the same problem I have. It won't index databases.

Re: I need a customized SUPER fast search function...
You do know that Google, among other search engines, runs clusters of multiple web servers... To achieve a faster index and compete with well-known search engines like Google, one thing you should consider is having multiple servers: some dedicated to data storage, a few dedicated to transactions, and a few that store files and web pages.

And testing scripts on a stand-alone server with only one user is not really load testing the capabilities of your program... ;-)
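
To show the difference between a single-user timing and a load test, here is a small Python sketch that fires queries from several worker threads at once and reports latency percentiles. The names load_test, percentile and fake_search are made up for the example, and fake_search just sleeps for 10 ms. In a real test it would be replaced by an actual request to the search page.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor

def timed_call(search_fn, query):
    """Return how long one search call takes, in seconds."""
    start = time.perf_counter()
    search_fn(query)
    return time.perf_counter() - start

def load_test(search_fn, queries, concurrency):
    """Run the queries on `concurrency` worker threads; return sorted latencies."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda q: timed_call(search_fn, q), queries))
    return sorted(latencies)

def percentile(sorted_values, fraction):
    """Nearest-rank percentile, e.g. fraction=0.9 for the 90th percentile."""
    rank = max(1, math.ceil(fraction * len(sorted_values)))
    return sorted_values[rank - 1]

def fake_search(query):
    # Placeholder for a real search call; it only simulates a 10 ms response.
    time.sleep(0.01)
    return []

latencies = load_test(fake_search, ["free", "internet"] * 10, concurrency=5)
print("median:", percentile(latencies, 0.5))
print("90th percentile:", percentile(latencies, 0.9))
```

Looking at the 90th percentile rather than the average is what separates "most searches come back fast" from "every user sees fast searches", and raising concurrency shows how those numbers move under load.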

Regards,

Eliot Lee

Re: I need a customized SUPER fast search function...
Yes, of course.

The point of the old PC in the closet is that if a piece of junk like my circa '97 PC can do the above, just think what your enterprise-class dedicated server with 1 - 2 gigs of RAM, or a cluster, will do.

Yep, I've read the Google white paper. In any event, I have a dedicated server now. In approximately 3 months depending on traffic I'm thinking of rolling out a 5 node cluster.

As far as users go, I get about 300 - 600 unique IPs per hour performing searches, so that will definitely be an issue.

Re: I need a customized SUPER fast search function...
Hi,

If hardware is not an issue, I would recommend:

1. Fast SCSI drives
2. Latest MySQL
3. Enough memory to fit entire database into memory (1 Gig plus)
4. mod_perl

Try Links SQL under those conditions and you'll find almost all searches come back in under 0.1 seconds. Of course, if you are looking for a custom solution, you can always contact us through the site.

Cheers,

Alex

--
Gossamer Threads Inc.

Re: I need a customized SUPER fast search function...
How do you go about putting the entire database in memory (RAM)? Is this a function of mod_perl?

Re: I need a customized SUPER fast search function...
No, but MySQL and the o/s will cache frequently requested data and put it into memory. This means you have to access the disk less and less as time goes by.
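
The effect can be pictured with a toy Python example. This is only an analogy at the application level, not how MySQL or the operating system implement their caches, and read_from_disk and fetch are made-up names. The first request for a key takes the slow path, and repeats are served from memory.

```python
from functools import lru_cache

disk_reads = 0  # counts how often the slow path is taken

def read_from_disk(key):
    """Stand-in for a slow disk read."""
    global disk_reads
    disk_reads += 1
    return f"row for {key}"

@lru_cache(maxsize=1024)
def fetch(key):
    """Serve repeated keys from memory; only the first request reaches the 'disk'."""
    return read_from_disk(key)

for key in ["free", "internet", "free", "free", "internet"]:
    fetch(key)

print(disk_reads, fetch.cache_info())
```

With enough memory to hold everything that gets requested, the slow path is hit once per item and never again, which is the idea behind sizing RAM to fit the whole database.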

Cheers,

Alex

--
Gossamer Threads Inc.

Re: [takacsj] I need a customized SUPER fast search function...
takacsj, are you still around? I'd like to discuss ASPseek with you for a second.