Gossamer Forum
Index Speed
I am running the spider on a dedicated server with 256 MB of RAM and a low load. I am just wondering how fast the spider "usually" works. Right now it is averaging about 15 seconds per page - does that sound about right? I guess it isn't too bad, but I hoped it would be faster with over 4000 pages to index. =) At this rate it probably won't be done until sometime tomorrow afternoon.

I'd hate to see how long it would take for a bigger job. ;)



Rob Bartlett
AAA Internet Publishing, Inc.
http://www.AAAInternet.com
Re: Index Speed
This is very strange - can you send Aki a message so he can look into it? We were averaging around 2 to 3 pages per second quite easily (not much load on the server).

Cheers,

Alex

--
Gossamer Threads Inc.
Re: Index Speed
OK, I will do that. It actually seems to be getting slower - I just checked the status again, and only about 25 links have been indexed in the last hour. =P

The spider itself doesn't have much of a load on the server (we still have over 90% idle), but when I run the "spider status" option in admin, it takes up ALL of the CPU usage for some reason...



Rob Bartlett
AAA Internet Publishing, Inc.
http://www.AAAInternet.com
Re: Index Speed
I have been watching my server, and it appears to be automatically killing any script that runs for more than a few minutes. The daemon goes on a bit longer, and it is much less CPU intensive, but it too gradually keeps getting slower.

I contacted my web host about this, but is there anything that can be done on the script side of things to help? For example, I can't index my spider database because the process stops after about 3.5 minutes, at around 25k records. It wouldn't be so bad if I could just resume the job, but it forces me to start from scratch.
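For what it's worth, the "forced to start from scratch" part is usually solved with a checkpoint file: the indexer records the last record it finished, and a restarted run skips everything up to that point. This is only an illustrative sketch of that idea (Gossamer's spider is Perl and may work differently); `index_records`, `do_index`, and the checkpoint path are all hypothetical names.

```python
import os

def index_records(records, do_index, checkpoint_path):
    """Index (id, record) pairs, resuming past any IDs already
    recorded in the checkpoint file from an interrupted run."""
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            start = int(f.read().strip() or 0)
    for rec_id, rec in records:
        if rec_id <= start:
            continue  # already indexed before the script was killed
        do_index(rec)  # stand-in for the real indexing work
        with open(checkpoint_path, "w") as f:
            f.write(str(rec_id))  # persist progress after each record
```

If the host kills the script mid-run, the next invocation picks up roughly where the last one stopped instead of redoing all 25k records.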

Any ideas?

Thx



Rob Bartlett
AAA Internet Publishing, Inc.
http://www.AAAInternet.com
Re: Index Speed
I have discovered that if I reindex the spider database from telnet, there aren't any problems - it takes a really long time, but it does eventually finish. The problem appears to be the browser timing out on larger jobs: since the script sends nothing back for minutes at a time, the browser (or the web server) assumes nothing is happening, drops the connection, and the script gets stopped.
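The usual workaround for exactly this is to have the long-running job flush a short progress line every so often, so the connection is never silent long enough to time out. A minimal sketch of the idea follows; the function name, the record source, and the flush interval are illustrative, not Gossamer's actual interface.

```python
import sys

def reindex_with_progress(records, out=sys.stdout, every=100):
    """Run a long job while emitting periodic progress lines,
    flushing each one so the client sees activity immediately."""
    done = 0
    for rec in records:
        # ... real indexing work for `rec` would happen here ...
        done += 1
        if done % every == 0:
            out.write(f"indexed {done} records\n")
            out.flush()  # push bytes out now; don't let them buffer
    out.write(f"done: {done} records\n")
    out.flush()
```

The key detail is the explicit flush after each progress line: by default output is buffered, so without it the browser still sees nothing until the job ends.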

Isn't there any way to prevent this from happening? I mean, if you are going to have a web interface, we should at least be able to use it instead of having to go into telnet.

Thx



Rob Bartlett
AAA Internet Publishing, Inc.
http://www.AAAInternet.com