Gossamer Forum

Fast search!

What can be done to make Links SQL search as fast as Yahoo? Why does Links take so long to show the results of a search that finds only 800 links? It takes 3 seconds on average, which is very slow; Yahoo does the same search in thousandths of a second. What is the solution for this?

Alex, could this be the database?
I have 500,000 links here, and a search for a word that returns 20,000 links takes 12 to 15 seconds on average, whereas on Yahoo, AltaVista or Google it does not take even 1 second. What can we do to make Links SQL faster?

To get an idea of what I mean, run a search in the LinksNG demo for a word that returns more than 1,000 links: it takes 3 seconds on average, even using mod_perl! You should also put more work into the documentation on how to use FastCGI and mod_perl together in Links SQL.

-=-=-=-=-=-=-=-=-
Janio Le
-=-=-=-=-=-=-=-=
Re: Faster search!
Hi Janio,

Links SQL will definitely scale up to meet your needs. We have the full import of DMoz (2 million links and 300,000 categories) and most searches come back in less than 2 seconds. Also, MySQL and the O/S do a lot of caching, so while your first search may be slow, future ones will be a lot quicker.

I'd be happy to email you a URL to our DMoz demo (it's behind a DSL line so the internet connection is not great, not suitable for general public).

In our experience, for the best results you should have:

1. mod_perl/FastCGI/SpeedyCGI installed. This is where you will see your largest performance increase.
2. MySQL 3.23 installed. We have found 3.23 to be significantly faster than 3.22 or Oracle or MS SQL 7.
3. Memory. You should have as much memory as the size of your database. You never want to access the disk if you can avoid it.
4. A fast disk drive. A quality disk drive really helps (U160 SCSI).

Again, you'll find that as the application runs, the O/S and MySQL do a really good job of caching results, so that while a search may take a few seconds on the first hit, over time the most popular searches come back right away. You can see this by hitting Next in the toolbar. It's the exact same search but is significantly faster.
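The "slow the first time, fast afterwards" effect can be illustrated with a small sketch. This is only an analogy in Python using application-level memoisation; it is not how MySQL or the operating system cache data, and the names `slow_backend_search`, `cached_search` and the sample link list are made up for the example.

```python
from functools import lru_cache

# Counts how often the (pretend) expensive backend is really consulted.
CALLS = {"backend": 0}

# Hypothetical sample data standing in for the links table.
SAMPLE_LINKS = ("perl.org", "mysql.com", "apache.org")


def slow_backend_search(term):
    """Stand-in for a search that has to go to disk."""
    CALLS["backend"] += 1
    return tuple(link for link in SAMPLE_LINKS if term in link)


@lru_cache(maxsize=1024)
def cached_search(term):
    """Repeat searches for the same term are answered from memory."""
    return slow_backend_search(term)
```

Calling `cached_search("org")` twice reaches the backend only once; the second call is served from the in-memory cache, which is the same pattern as a popular search coming back right away.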

Hope this helps,

Alex

--
Gossamer Threads Inc.
Re: Faster search!
So how much memory does it take to run DMoz?



PUGDOG®
PUGDOG® Enterprises, Inc.
FAQ: http://pugdog.com/FAQ


Re: Faster search!
Our test database server is a PIII 800 with 1 GB of memory and an 18 GB U160 SCSI drive.

Cheers,

Alex

--
Gossamer Threads Inc.
Re: Faster search!
That is pretty impressive. What database program is dmoz.com using? Is it MySQL, or have they gone for Oracle/Sybase/etc.? Out of interest, NASA just switched from Oracle to MySQL.

http://www.ASciFi.com/ - The Science Fiction Portal
Re: Faster search!
OK Alex, you always talk about caching in MySQL; does it not work in Oracle?

So if I have 3,000,000 links, how much memory am I going to need?


-=-=-=-=-=-=-=-=-
Janio Le
-=-=-=-=-=-=-=-=
Re: Faster search!
3,000,000 links! Wow. What are you building?

http://www.ASciFi.com/ - The Science Fiction Portal
Re: Faster search!
I built a spider to test a really large database. I am still testing my software, and it has some feeds, but I can get up to 100,000,000 links depending on the equipment I use. However, it gets very expensive to have a really good Internet link for a search engine that uses a spider system like AltaVista's.

But what I want at the moment is for Links SQL to be as fast as the software used by the big Internet search engines, such as Yahoo, Google or AltaVista.

I just don't want the user to wait 7 seconds for a reply to a search, understood?

Alex, I think you should add some options to the installer so that if the system has FastCGI or SpeedyCGI installed, or even mod_perl, there is no need to make manual changes to each script in order to use FastCGI or SpeedyCGI.



-=-=-=-=-=-=-=-=-
Janio Le
-=-=-=-=-=-=-=-=
Re: Faster search!
Unfortunately I do not know enough about Oracle to begin to be able to tune it or make recommendations. Sorry!

Cheers,

Alex

--
Gossamer Threads Inc.
Re: Faster search!
In Reply To:
Alex, I think you should add some options to the installer so that if the system has FastCGI or SpeedyCGI installed, or even mod_perl, there is no need to make manual changes to each script in order to use FastCGI or SpeedyCGI.
SpeedyCGI support is now built in, and to enable it you just set the path to speedy during setup.

FastCGI is a little trickier as it involves putting the script in a while loop and altering the code.
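As a rough illustration of what "putting the script in a while loop" means, here is a conceptual sketch in Python. It does not use any real FastCGI library and is not Links SQL code; `load_application`, `handle_request`, `serve` and the sample data are hypothetical names chosen for the example. The point is only that the expensive startup work moves outside the loop and runs once, while each request is handled by one pass through the loop.

```python
# Counts how many times the expensive startup work has run.
STARTUPS = {"count": 0}


def load_application():
    """Stand-in for costly startup: compiling code, opening DB handles."""
    STARTUPS["count"] += 1
    return {"links": ["perl.org", "mysql.com", "apache.org"]}


def handle_request(app, query):
    """Answer one request using state that is already loaded."""
    return [link for link in app["links"] if query in link]


def serve(queries):
    """Persistent-process model: initialise once, then loop over requests."""
    app = load_application()        # runs once, before the loop
    responses = []
    for query in queries:           # the request loop
        responses.append(handle_request(app, query))
    return responses
```

Under plain CGI the equivalent of `load_application` would run again for every request; in the looped version it runs once no matter how many queries are served. The code changes Alex mentions come from the fact that anything the script assumed was fresh on each run now persists between requests.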

Cheers,

Alex

--
Gossamer Threads Inc.
Re: Faster search!
In Reply To:
Unfortunately I do not know enough about Oracle to begin to be able to tune it or make recommendations. Sorry!
Wouldn't the same logic used in MySQL apply? And does it work in MS SQL?

-=-=-=-=-=-=-=-=-
Janio Le
-=-=-=-=-=-=-=-=
Re: Faster search!
No, Oracle is orders of magnitude more complex to manage than MySQL. It requires a lot of experience and knowledge to understand Oracle and to be able to tune it properly.

Unfortunately I know just enough to get it installed and use it, but not enough to be able to get the best performance out of it, whereas I am very comfortable with MySQL and know what works well with it and what doesn't.

Cheers,

Alex

--
Gossamer Threads Inc.
Re: Faster search!
Alex,

Are you saying (see above) that you are using FastCGI and SpeedyCGI in combo with mod_perl? If that is the case a little documentation on how you configured Apache this way would be MOST appreciated.

Wil

Re: Faster search!
An additional problem with Oracle, if you are not using all the extra features it offers, is that MySQL can be backed up with the database in operation. Oracle _cannot_. It has to be taken off-line for the backups.

I think this has to do with the fact it supports transaction processing, so the "slice" has to be the same across the entire system for that instant (all files have to be on the same transaction, or rollbacks and problems occur).

Once MySQL supports transactions, that might change in MySQL, although because it was built from the ground up to be different, and well thought out, they may have a way to get around that too.

Unless you really need the features of Oracle (and I do guess Oracle works better in Windows environments than MySQL), MySQL is really a good choice! You can always upgrade later if you need to, but you don't kill yourself trying to deal with Oracle _AND_ setting up a site.

PUGDOG®
PUGDOG® Enterprises, Inc.
FAQ: http://pugdog.com/FAQ


Re: Faster search! DATABASE
Could somebody explain to me why nobody likes Oracle?

Then tell me, which is the fastest database for a search site?

Which database does Yahoo use?
Which database does Google use?
Which database does AltaVista use?
Which database does Excite use?

I need to know which software to use to get excellent performance on my site.




-=-=-=-=-=-=-=-=-
Janio Le
-=-=-=-=-=-=-=-=
Re: Faster search! DATABASE
Hello,
The company I work for uses AltaVista.
AltaVista is a re-index engine and NOT a database engine.
The largest search engines work with the re-index system, although you can use databases if you wish.

Now I need MySQL for Ridesworld, but they have heard about Oracle.
Realise one thing:
MySQL is free and can run on several platforms.
Oracle is commercial software: it can also run on several platforms, but you have to pay for it.

Most database-driven websites use MySQL because it is fast, light and stable. (I think DMOZ does too, but I'm not sure.)
The only thing MySQL can't do is subqueries, but this is already handled in the Links SQL engine.

I highly recommend that you use MySQL, and forget the large search engines, because they work on a completely different engine system.

Re: Faster search!
In Reply To:
Are you saying (see above) that you are using FastCGI and SpeedyCGI in combo with mod_perl? If that is the case a little documentation on how you configured Apache this way would be MOST appreciated.
No, SpeedyCGI and FastCGI are alternatives to mod_perl, and you only use one at a time.

SpeedyCGI is by far the easiest to use. In its simplest form you just install it, and change the path to perl to /usr/bin/speedy. Now when you run a script, it runs speedy, which passes the request to speedy_backend, which has your script already loaded and ready to go.

You will see a better performance boost using mod_perl, or building mod_speedycgi into Apache; however, that requires altering your webserver, which is a little more complex.

Cheers,

Alex

--
Gossamer Threads Inc.
Re: Faster search! DATABASE
In Reply To:
work on a completely different engine system.
What is this re-index system?
Is it not a relational database?
What technology do they use? Can this be done for Links SQL?


-=-=-=-=-=-=-=-=-
Janio Le
-=-=-=-=-=-=-=-=
Re: Fast search!
Hi Alex!

In this new version, LinksNG, has the database search become faster?

A suggestion: the cache system should be installed by default, not as a plugin that the site administrator has to generate and install.

Cache systems make searches faster, so you should put more work into improving the cache system.

What I do not understand is why LinksNG is so slow when searching large databases, given that I have a search engine application that works with an index, like Google, runs on a MySQL database and is super fast. That search software is written in C++, but I think the language the software is written in does not matter much.

What can be done to make LinksNG faster?
(without having to use other software such as mod_perl, FastCGI or SpeedyCGI).

LinksNG is a powerful search tool, but it is slow. It has a very good admin area, but why is it so slow?

Wouldn't it make sense to rewrite the Links SQL search engine?

For example, to rewrite search.cgi in C++?

I hope you do not misunderstand me, but I am dissatisfied with Links because the search is slow, unlike the other software on the market.


Best wishes

For a better world.

-=-=-=-=-=-=-=-=-
Janio Le
-=-=-=-=-=-=-=-=
Re: Fast search!
Part of the benefit of Links SQL is that it's pure Perl. It will run on any system. Once you start to add C code, you need to compile the program on each platform, and you start hitting all sorts of problems tangential to the inner workings of Links.

Perhaps 3rd parties will come forward with better indexing/searches that can be compiled. But they will have to provide support above and beyond the standard. It's not easy to come up with C code that will compile on all the different Unix flavors, with all the different compilers, and then under Windows.

C code is great for the libraries and routines of a program like MySQL or Perl itself. Once you start getting into applications themselves, the more C you start to use, the fewer advantages of Perl (maintainability, portability, etc.) you keep.

I remember when the biggest rallying cry for "global acceptance" of C was "It's portable!". We all know how that has played out in the past 15 years. :)



PUGDOG®
PUGDOG® Enterprises, Inc.
FAQ: http://pugdog.com/FAQ


Re: Fast search!
OK Robert!

But then what can be done to make LinksNG search faster?

-=-=-=-=-=-=-=-=-=-
For a better world.
Janio Le
-=-=-=-=-=-=-=-=-=-
Re: Fast search!
I gather you are using the NON indexed search. The only "good" way to speed that up, is to index it. But, that has to be language specific. If you don't index the searches, then you are doing full-text searches from the start of the database to the end of the database on each search. That is "brute force" and not very efficient, fast or pretty.
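To make the difference concrete, here is a minimal sketch in Python that contrasts a brute-force scan with a lookup in an inverted index. It is not how Links SQL implements either search mode; the function names and sample records are invented for the example, and splitting on whitespace is a deliberate simplification (real word splitting is where the language-specific part comes in).

```python
from collections import defaultdict


def scan_search(links, word):
    """Brute force: examine every record on every search."""
    word = word.lower()
    return [link_id for link_id, text in links.items()
            if word in text.lower().split()]


def build_index(links):
    """Build the inverted index once: word -> set of link ids."""
    index = defaultdict(set)
    for link_id, text in links.items():
        for token in text.lower().split():
            index[token].add(link_id)
    return index


def indexed_search(index, word):
    """Indexed: a single dictionary lookup per search term."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical sample records standing in for the links table.
SAMPLE = {
    1: "Perl CGI scripts",
    2: "MySQL tuning guide",
    3: "Fast Perl search",
}
```

Both functions return the same ids for a given word, but `scan_search` does work proportional to the whole table on every query, while `indexed_search` pays that cost once in `build_index` and then answers each query with one lookup. The trade-off is that the index has to be kept up to date as links are added or changed.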

Again, this would be an area for 3rd parties to explore in targeted areas, or perhaps even an external search engine program. There have been many, many efforts to create efficient, fast and flexible searches by several companies. You might want to search the web.

Indexed searches should be pretty fast. If they are not, then you should consider increasing your RAM, adding faster hard drives, getting a faster processor (CPU) or perhaps splitting the load between two machines.

If you are not on a dedicated server, then all this is moot, and the solution is to get a dedicated server first.



PUGDOG®
PUGDOG® Enterprises, Inc.
FAQ: http://pugdog.com/FAQ


Re: Fast search!
OK, I have a dedicated server. Now, how do I split the system across two machines?


-=-=-=-=-=-=-=-=-=-
For a better world.
Janio Le
-=-=-=-=-=-=-=-=-=-
Re: Fast search!
Hi Janio,

In Reply To:
What can be done to make LinksNG faster?
(without having to use other software such as mod_perl, FastCGI or SpeedyCGI).
To get better performance, you need to use mod_perl, FastCGI or SpeedyCGI. The CGI protocol is a slow protocol, and is often the largest bottleneck in terms of performance.

Moving from regular CGI to mod_perl, you will easily see a 5 to 10x increase in performance for most applications.

If you are serious about increasing performance, this is what should be done first.

Cheers,

Alex

--
Gossamer Threads Inc.
Re: Fast search!
First, as I said, make sure you have enough RAM. On a Windows machine, 1 GB is not unreasonable. On Unix, 256 meg is good for Sparc, 512 is much, much better, especially for Intel. You'll probably see performance increases on the Intel platform up through 1 GB if your motherboard/BIOS handles it properly. 512 on a Sparc was almost unheard of a few years ago <G>... but if you think life was good with 128 meg..... :)

Then, as Alex said, move to mod_perl first (or SpeedyCGI). That will give a performance increase with the currently available resources.

The caveat is that apache_mod_perl is _BIG_: it's easily 3-4x as big as Apache without mod_perl. That means you need more RAM, or you start to lose your benefits in using disk cache memory. Figure at least 4 meg of RAM in Unix for Apache/mod_perl per process. 25 processes is 100 meg of RAM, otherwise you are swapping. Add a 16 meg MySQL process and the Unix kernel, and 128 meg of RAM starts to look pretty small. Of course, it _will_ work. It will work pretty well. It will just work better with more RAM.
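The arithmetic in the paragraph above can be written out as a small calculation. This is a back-of-the-envelope sketch in Python: the 4 meg per process, 25 processes and 16 meg for MySQL come from the post, while the `other_mb` parameter (kernel and everything else) is an assumption with no figure given in the thread, so it defaults to zero and any value passed in is a guess.

```python
def estimate_ram_mb(processes, mb_per_process=4, mysql_mb=16, other_mb=0):
    """Rough RAM needed: Apache/mod_perl processes + MySQL + anything else.

    other_mb covers the kernel and other daemons; the thread gives no
    number for it, so callers must supply their own estimate.
    """
    return processes * mb_per_process + mysql_mb + other_mb


def fits_in_ram(required_mb, installed_mb):
    """True if the estimate fits without swapping (ignores disk cache)."""
    return required_mb <= installed_mb
```

With the figures from the post, `estimate_ram_mb(25)` gives 116 meg before the kernel is counted, which already leaves only 12 meg of a 128 meg machine for the kernel and disk cache. That is why 128 meg "starts to look pretty small".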

Once you do that, you need to see where your real bottlenecks are. You might need some system management tools or even an analyst to look at your traffic. What you need to do at that point is "load balance" between two or more servers, such that each server is doing a fair share of work, without duplicating stuff. For instance, putting the database on one machine, optimizing that for database serving, having a separate network connection to that machine, and using the front-end machine for web serving might be good. Then again, maybe running fat and thin Apache versions (with and without mod_perl) may be better, depending on what your site is doing.

There is no one answer, and load-balancing and networking is an art, more than a science.

But, first check your RAM and CPU, then move to mod_perl.

The point I'm making is make sure your hardware is up to the job, before you start to try to squeeze every drop of performance out of it. Sometimes, that is a losing battle. After that you'll need to figure out what is taking the most resources, and how best to break up the workload.

PUGDOG®
PUGDOG® Enterprises, Inc.
FAQ: http://pugdog.com/FAQ

