Gossamer Forum
Quote Reply
SPIDER robots.txt
Most of the links my spider picks up have http://domain.com/robots.txt as their URL.

Can I hide all these robots.txt URLs?

Quote Reply
Re: SPIDER robots.txt In reply to
I was wondering about the same thing.

You're supposed to put any of your 'private' directories in there to stop them being spidered, but then anyone can just bring the robots.txt file up in their browser and see what you're trying to hide.

Would also like to hear if anyone has an answer that they use?

Cheers,

R.


Quote Reply
Re: SPIDER robots.txt In reply to
WHOAH

I thought it was a long shot but it worked...

Here's Yahoo's robots.txt file...

Code:
User-agent: *
Disallow: /gnn
Disallow: /msn
Disallow: /pacbell
Disallow: /pb

# Rover is a bad dog <http://www.roverbot.com>
User-agent: Roverbot
Disallow: /
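For reference, here's how a well-behaved crawler is meant to use rules like those. This is a minimal sketch using Python's standard urllib.robotparser; the example.com URLs and the cut-down rule set are placeholders, not Yahoo's actual file:

```python
# Sketch of a polite crawler consulting robots.txt before fetching a page,
# using Python's standard urllib.robotparser. URLs are illustrative.
from urllib.robotparser import RobotFileParser

rules = RobotFileParser()
# A real crawler would call rules.set_url(...) and rules.read();
# parsing the rules inline keeps this example self-contained.
rules.parse("""
User-agent: *
Disallow: /gnn
Disallow: /pb

User-agent: Roverbot
Disallow: /
""".splitlines())

print(rules.can_fetch("*", "http://example.com/gnn/index.html"))   # False: /gnn is disallowed
print(rules.can_fetch("*", "http://example.com/news.html"))        # True: not disallowed
print(rules.can_fetch("Roverbot", "http://example.com/news.html")) # False: Roverbot is banned entirely
```

The point is that robots.txt is advisory: a spider checks it before fetching, but nothing stops a human from reading the file itself in a browser.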

Installations: http://www.wiredon.net/gt/

Quote Reply
Re: SPIDER robots.txt In reply to
I know about robots.txt files, but I don't know why these files were spidered into my search engine.

I thought robots.txt says which documents are allowed to be spidered. Why would I have to validate a robots.txt file?

Quote Reply
Re: SPIDER robots.txt In reply to
Ahh sorry, I thought you were talking generally about robots.txt files. I didn't read carefully enough to see that you were having problems with the SPIDER plugin - I don't know the answer to that one.

Cheers,
R.


Quote Reply
Re: SPIDER robots.txt In reply to
Hi Hemen,

The easiest way of dealing with this is to go into Plugins -> Database -> Bulk Operations, choose Links, and click Go.

Look for any URL containing robots.txt.

When you have a collection of all such links, delete them from the database.
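The filter that search performs can be sketched in Python. The list and field names below are illustrative stand-ins for the Links table, not Gossamer Links' actual schema:

```python
# Sketch of the cleanup the Bulk Operations step performs: find every link
# whose URL points at a robots.txt file and drop it. The in-memory list
# stands in for the Links table; field names are illustrative only.
links = [
    {"ID": 1, "URL": "http://example.com/"},
    {"ID": 2, "URL": "http://example.com/robots.txt"},
    {"ID": 3, "URL": "http://example.com/docs/page.html"},
    {"ID": 4, "URL": "http://other.example/robots.txt"},
]

# Keep only links whose URL does not end in robots.txt.
cleaned = [link for link in links if not link["URL"].endswith("robots.txt")]

for link in cleaned:
    print(link["ID"], link["URL"])
```

Matching on the URL's ending (rather than a bare substring) avoids accidentally deleting a legitimate page that merely mentions robots.txt in its path.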

Hope this helps



Quote Reply
Re: SPIDER robots.txt In reply to
I always use MySQLMan to clear out those fields, because the spider picks up a lot of waste.

-------------------
Heiko Mentzel
http://findgay.net/
-------------------