
Mailing List Archive: Interchange: users

Maximum size for a session?

 

 



DB at M-and-D

Nov 15, 2009, 7:36 AM

Post #1 of 2 (727 views)
Maximum size for a session?

Recently I saw Yahoo's shopping bot hitting my site pretty hard. Apache
access log has lines like this every 20 seconds or so:

...HTTP/1.0" 200 19391 "-" "YahooSeeker/1.2 (compatible; Mozilla 4.0;
MSIE 5.5; yahooseeker at yahoo-inc dot com ;
http://help.yahoo.com/help/us/shop/merchant/)"

I saw no entry for this in my system's robots.cfg and I suspect (can't
prove) that this robot was obtaining a session which grew *very* large.
So I have two questions:

What exactly should I add to my robots.cfg?

and

Is there a way to set a maximum size for sessions so that the next time
a robot that's not in my robots.cfg file comes along this problem won't
repeat?

DB

_______________________________________________
interchange-users mailing list
interchange-users [at] icdevgroup
http://www.icdevgroup.org/mailman/listinfo/interchange-users


jon at endpoint

Nov 16, 2009, 4:17 PM

Post #2 of 2 (652 views)
Re: Maximum size for a session?

On Sun, 15 Nov 2009, DB wrote:

> Recently I saw Yahoo's shopping bot hitting my site pretty hard. Apache
> access log has lines like this every 20 seconds or so:
>
> ...HTTP/1.0" 200 19391 "-" "YahooSeeker/1.2 (compatible; Mozilla 4.0;
> MSIE 5.5; yahooseeker at yahoo-inc dot com ;
> http://help.yahoo.com/help/us/shop/merchant/)"
>
> I saw no entry for this in my system's robots.cfg and I suspect (can't
> prove) that this robot was obtaining a session which grew *very* large.
> So I have two questions:
>
> What exactly should I add to my robots.cfg

Are you sure it was not being flagged as a robot? I'm pretty sure the
"Yahoo" entry in the default robots.cfg will catch "YahooSeeker" as
well. If you enable debugging and look at your interchange.structure
file, you can see the regex built from the RobotUA directive. The
"Yahoo" pattern isn't anchored, so it should match "YahooSeeker" too.

(In this case that's good, but in other cases you may find a RobotUA
setting matches too loosely, such as "Google" matching "GoogleToolbar" or
similar.)
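
To illustrate the matching behavior described above, here is a minimal
sketch in Python. It is not Interchange's actual code or the regex it
generates: the fragment list, the helper name looks_like_robot, and the
toolbar and browser user-agent strings are made up for illustration, and
it assumes case-sensitive matching. It only shows how an unanchored
pattern matches anywhere in the user-agent string.

```python
import re

# Hypothetical, shortened list of robot name fragments. The real list
# lives in robots.cfg and is much longer.
ROBOT_FRAGMENTS = ["Yahoo", "Google"]

# Join the fragments into one unanchored alternation: no ^, $ or \b,
# so a fragment matches wherever it appears in the string.
robot_re = re.compile("|".join(re.escape(f) for f in ROBOT_FRAGMENTS))


def looks_like_robot(user_agent: str) -> bool:
    """Return True if any fragment occurs anywhere in the user agent."""
    return robot_re.search(user_agent) is not None


# Shortened form of the user agent from the access log in this thread.
yahoo_seeker = "YahooSeeker/1.2 (compatible; Mozilla 4.0; MSIE 5.5)"
# Made-up examples: a browser with a toolbar, and a plain browser.
toolbar = "Mozilla/4.0 (compatible; GoogleToolbar 4.0; Windows XP)"
browser = "Mozilla/5.0 (X11; Linux x86_64) Firefox/3.5"

for ua in (yahoo_seeker, toolbar, browser):
    print(looks_like_robot(ua), ua)
```

In this sketch "Yahoo" catches "YahooSeeker", which is the wanted
result, while "Google" also catches "GoogleToolbar", which is the
too-loose case mentioned above.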

> Is there a way to set a maximum size for sessions so that the next time
> a robot that's not in my robots.cfg file comes along this problem won't
> repeat?

I don't know of a way to limit the size of a session proactively.

Jon

--
Jon Jensen
End Point Corporation
http://www.endpoint.com/

