
Mailing List Archive: ModPerl: ModPerl

GoogleBot

 

 



jmcaricand at greta-besancon

Apr 7, 2009, 10:58 PM

Post #1 of 2 (750 views)

Hi,

I use a PerlTransHandler on my server to fetch a file through the proxy:

sub handler {
    ...

    # Match exactly /sitemap.xml.gz (dots escaped so they match literally)
    if ( $uri =~ m{^/sitemap\.xml\.gz$} ) {
        my $real_url = $r->unparsed_uri;

        # Prefix the requested path with /static
        $real_url = '/static' . $real_url;

        # Hand the request over to mod_proxy
        $r->proxyreq(1);
        $r->uri($real_url);
        $r->filename(sprintf "proxy:http://xxx.xxx.xxx.xxx%s", $real_url);
        $r->handler('proxy-server');

        return Apache2::Const::OK;
    }

    ...

    return Apache2::Const::DECLINED;
}
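
A side note on the URI test in the handler: in a Perl regex a bare dot matches any single character, so a pattern written with unescaped dots accepts more paths than just /sitemap.xml.gz. The sketch below only illustrates that difference; the variable names and the second sample path are made up for the example and are not part of the original handler.

```perl
use strict;
use warnings;

# Loose pattern: each bare dot matches any single character.
my $loose = qr{^/sitemap.xml.gz$};

# Strict pattern: escaped dots match only a literal '.',
# and \z anchors at the very end of the string.
my $strict = qr{^/sitemap\.xml\.gz\z};

# Hypothetical request paths, for illustration only.
for my $path ('/sitemap.xml.gz', '/sitemapXxmlYgz') {
    printf "%-18s loose=%s strict=%s\n", $path,
        ( $path =~ $loose  ? 'yes' : 'no' ),
        ( $path =~ $strict ? 'yes' : 'no' );
}
```

This is unlikely to be related to the 302, but it keeps the handler from proxying paths it was never meant to handle.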

When I use Firefox to fetch sitemap.xml.gz, everything works fine:

xxx.xxx.xxx.xxx - - [07/Apr/2009:14:59:35 +0200] "GET /sitemap.xml.gz
HTTP/1.1" 200 2924 "-" "Mozilla/5.0 (X11; U; Linux i686; fr; rv:1.9.0.6)
Gecko/2009020409 Iceweasel/3.0.6 (Debian-3.0.6-1)"

But when Googlebot downloads the file, I see these log entries:

66.249.65.228 - - [07/Apr/2009:15:01:41 +0200] "HEAD /sitemap.xml.gz
HTTP/1.1" 302 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1;
+http://www.google.com/bot.html)"
66.249.65.228 - - [07/Apr/2009:15:01:44 +0200] "GET /sitemap.xml.gz
HTTP/1.1" 302 451 "-" "Mozilla/5.0 (compatible; Googlebot/2.1;
+http://www.google.com/bot.html)"

Why are there two requests, and why does my server return status 302 instead of 200?

Thanks.


mpeters at plusthree

Apr 8, 2009, 5:52 AM

Post #2 of 2 (685 views)
Re: GoogleBot

jmcaricand [at] greta-besancon wrote:

> 66.249.65.228 - - [07/Apr/2009:15:01:41 +0200] "HEAD /sitemap.xml.gz
> HTTP/1.1" 302 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1;
> +http://www.google.com/bot.html)"
> 66.249.65.228 - - [07/Apr/2009:15:01:44 +0200] "GET /sitemap.xml.gz
> HTTP/1.1" 302 451 "-" "Mozilla/5.0 (compatible; Googlebot/2.1;
> +http://www.google.com/bot.html)"
>
> Why are there two requests

The first is a HEAD request, which asks for the response headers only
(http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Request_methods).
The second is the GET that actually fetches the file.

> and why does my server return status 302 instead of 200?

It seems you have some sort of auth handler in front that issues a redirect
(which is what a 302 is). To find out why, try requesting that resource with
your browser pretending to be Googlebot. If you're using Firefox, have a look
at the User Agent Switcher plugin.
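
If you would rather test from the command line than from a browser, something along these lines might do it. This is only a rough sketch: it assumes HTTP::Tiny is available (it ships with Perl 5.14 and later), the helper name describe_response is made up for the example, and the URL is whatever you pass as the first argument. It sends a HEAD and then a GET with the Googlebot User-Agent taken from your log, and does not follow redirects, so you can see where the 302 points.

```perl
use strict;
use warnings;
use HTTP::Tiny;

# User-Agent string copied from the access log lines in this thread.
my $GOOGLEBOT_UA =
    'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';

# Summarise one HTTP::Tiny response hash as "METHOD status [-> Location]".
# (Hypothetical helper, not part of any library.)
sub describe_response {
    my ( $method, $res ) = @_;
    my $line     = "$method $res->{status}";
    my $location = $res->{headers}{location};
    $line .= " -> $location" if defined $location;
    return $line;
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $url  = $ARGV[0];
    my $http = HTTP::Tiny->new(
        agent        => $GOOGLEBOT_UA,
        max_redirect => 0,    # do not follow the 302; report it instead
    );

    # Same sequence Googlebot used: HEAD first, then GET.
    for my $method (qw(HEAD GET)) {
        my $res = $http->request( $method, $url );
        print describe_response( $method, $res ), "\n";
    }
}
```

Run it with the sitemap URL as the only argument. If the Location header in the output points at a login page or a different hostname, that may help identify which handler or rule is issuing the redirect.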

--
Michael Peters
Plus Three, LP
