Gossamer Forum

[New Plugin] Spider!

Re: [Teambldr] [New Plugin] Spider! In reply to
OK, the feature is added :) You can now choose between LWP::Simple (which doesn't obey robots.txt) and LWP::RobotUA (which does). You can also set a 2 second delay between requests if you want. It slows the process down, but reduces the chance of your server getting banned.
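
For anyone wondering what "obeying robots.txt" plus a delay amounts to, here is a rough sketch in Python. It is not the plugin's Perl code and not LWP::RobotUA itself; it just uses Python's standard urllib.robotparser to show the idea. The rules, the "SpiderSketch" agent name and the function names are all made up for illustration.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, parsed locally so no network access is needed.
EXAMPLE_RULES = [
    "User-agent: *",
    "Disallow: /private/",
]

def allowed_urls(urls, rules=EXAMPLE_RULES, agent="SpiderSketch"):
    """Return only the URLs that the given robots.txt rules permit."""
    parser = RobotFileParser()
    parser.parse(rules)
    return [u for u in urls if parser.can_fetch(agent, u)]

def crawl(urls, fetch, delay_seconds=2.0, rules=EXAMPLE_RULES):
    """Call fetch(url) for each permitted URL, pausing between requests."""
    results = []
    for i, url in enumerate(allowed_urls(urls, rules)):
        if i > 0:
            time.sleep(delay_seconds)  # the optional delay between requests
        results.append(fetch(url))
    return results
```

Here fetch is whatever function actually downloads a page; disallowed URLs are never passed to it.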

Version 1.2

I'm uploading it to the accounts area now, so anyone wishing to upgrade, please log in at http://new.linkssql.net/page.php?page=account

Cheers

Andy (mod)
andy@ultranerds.co.uk
Want to give me something back for my help? Please see my Amazon Wish List
GLinks ULTRA Package | GLinks ULTRA Package PRO
Links SQL Plugins | Website Design and SEO | UltraNerds | ULTRAGLobals Plugin | Pre-Made Template Sets | FREE GLinks Plugins!
Re: [Andy] [New Plugin] Spider! In reply to
You've modified the plugin and tested it thoroughly in under an hour? ;)
Re: [Paul] [New Plugin] Spider! In reply to
What is there to thoroughly test? LWP::RobotUA has all the robots.txt rules built in. All you have to do is make an HTTP::Request and then use LWP::RobotUA to grab it. I tried it on a few sites, and it seems to be working fine. Just cos I don't hang around ;)

Cheers

Andy (mod)
andy@ultranerds.co.uk
Re: [Andy] [New Plugin] Spider! In reply to
Andy:

If you want, test the robots.txt version on my site main page: http://www.bcdb.com/

If it obeys robots.txt, you should get some pages. If it has a problem, you will get banned. You can PM me and I will (of course!) un-ban you.

Thank you for adding that, BTW- I think it is a good option!
dave

Big Cartoon DataBase
Big Comic Book DataBase
Re: [carfac] [New Plugin] Spider! In reply to
Cheers. It seems to be obeying the rules :) At least I'm not banned yet :P

Cheers

Andy (mod)
andy@ultranerds.co.uk
Re: [Andy] [New Plugin] Spider! In reply to
Andy:

See PM!
dave

Re: [carfac] [Update] Spider! In reply to
Version 1.3.

Now uses Net::Whois::Raw to grab the email address if none was found on the page. The way it gets emails now is:

+ Checks for an email address on the page.
+ If not found, it tries to get the email address via a WhoIs query...
+ If it's still not found, it will create the email address from the domain, e.g. http://www.ace-installer.com/site/index.php?page=test would become the email address webmaster@ace-installer.com
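
The three steps above can be sketched roughly like this in Python. This is only an illustration of the fallback order, not the plugin's Perl code and not Net::Whois::Raw. The regex, the function names and the whois_lookup callable (something that returns raw WhoIs text for a hostname) are all assumptions.

```python
import re
from urllib.parse import urlparse

# Deliberately simple email pattern; real-world addresses can be messier.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def email_from_domain(url):
    """Build webmaster@<domain> from a URL, dropping a leading 'www.'."""
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return "webmaster@" + host

def find_email(page_text, url, whois_lookup=None):
    """Try the page, then an optional WhoIs lookup, then the domain fallback."""
    match = EMAIL_RE.search(page_text)
    if match:
        return match.group(0)
    if whois_lookup is not None:
        # whois_lookup is a hypothetical callable: hostname -> raw WhoIs text
        whois_text = whois_lookup(urlparse(url).hostname or "")
        match = EMAIL_RE.search(whois_text or "")
        if match:
            return match.group(0)
    return email_from_domain(url)
```

Passing the WhoIs lookup in as a function keeps the sketch runnable without a network connection.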

Anyone who already owns this plugin can download the update from http://new.linkssql.net/page.php?page=account

If you are interested in this Plugin, and want to find out more details, please visit: http://new.linkssql.net/...=fee&page=Spider

Cheers

Andy (mod)
andy@ultranerds.co.uk
Re: [Andy] [Update] Spider! In reply to
Version 1.7 is now out (can be downloaded from: http://new.linkssql.net/page.php?page=account).

This version fixes a bug where Net::Whois::Raw was not being included correctly on some servers.

Cheers

Andy (mod)
andy@ultranerds.co.uk
Re: [Andy] [Update] Spider! In reply to
It seems some servers did not like Net::Whois::Raw::Data being included from inside Net::Whois::Raw. I've fixed this by editing Raw.pm to hold the code from Data.pm, and it seems to be working fine.

1.7.1 can be downloaded from your admin folder (if you didn't get any errors with the upgrade, then there is no need to update to this version).

Cheers

Andy (mod)
andy@ultranerds.co.uk
Re: [Andy] [Update] Spider! In reply to
Another fix.

WHOIS.ISI.EDU has gone offline, so I have changed USA whois queries to look up on WHOIS.PUBLICINTERESTREGISTRY.NET.

This is now version 1.7.4.

You can download the latest version from here: http://new.linkssql.net/page.php?page=accounts

Cheers

Andy (mod)
andy@ultranerds.co.uk
Re: [Andy] [Update] Spider v1.6 In reply to
Version 1.6 is now out. This version has a couple of small bug fixes, including one for titles not being found when they are split across separate lines.
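
To show the kind of problem the title fix is about, here is a small Python sketch (not the plugin's actual code, and the function name is made up). A pattern that only matches within one line misses a title that is split over several lines, so this one lets the match span line breaks and then tidies the whitespace.

```python
import re

# re.DOTALL lets "." match newlines, so the title text may span several lines.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)

def extract_title(html):
    """Return the page title with whitespace collapsed, or None if absent."""
    match = TITLE_RE.search(html)
    if not match:
        return None
    return " ".join(match.group(1).split())
```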

A new feature has also been added. When starting the Spider, you can define the level of duplicate checking that you want to run. If left as the default, it will just check for EXACT matches that already exist in your database. If you set it to use Advanced Checking, it works something like this:

Input URL: http://www.ace-installer.com/foo/bar.html

URL Grabbed: www.ace-installer.com

Checks: for any URL in your database that contains the grabbed URL. The following examples would match:

http://www.ace-installer.com/foo/bar.html
http://www.ace-installer.com/foo/foo.html
http://www.ace-installer.com/cgi-bin/test.php
http://www.ace-installer.com/directory/add.cgi

This doesn't stop you from adding the link at all. It simply gives you a warning (in green), letting you know that other links with the same or a similar URL have been found in your database :)
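
The two checking levels described above can be sketched like this in Python. It is only an illustration, not the plugin's Perl code or its database query. A plain list stands in for the database, and the function names are made up.

```python
from urllib.parse import urlparse

def exact_duplicates(input_url, existing_urls):
    """Default check: only URLs identical to the input count as duplicates."""
    return [u for u in existing_urls if u == input_url]

def similar_urls(input_url, existing_urls):
    """Advanced check: any stored URL containing the input URL's hostname."""
    host = urlparse(input_url).netloc  # e.g. "www.ace-installer.com"
    if not host:
        return []
    return [u for u in existing_urls if host in u]
```

With the example above, similar_urls would return all four ace-installer.com links, while exact_duplicates would return only the first one.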

The latest version can be downloaded from the members area (in my sig.).

For details on this plugin, please see:

http://www.ultranerds.com/cgi-bin/details/20.html

Cheers

Andy (mod)
andy@ultranerds.co.uk
Re: [Andy] [New Plugin] Spider! In reply to
Purchased and installed on 3.2.0 (per Andy), but it is still not spidering the URLs I provide.

Any suggestions would be appreciated.

These are the steps I take:

I entered http://dir.yahoo.com/...les/Web_Directories/, Someone and webmaster@somesite.com, selected my Category, chose No and No, and clicked on "Spider it". This is what I get:

Spider

Please tick which sites you want to spider, and get the details for....



SELECT ALL?

