Gossamer Forum

Spider Mod

Spider Mod
How about this as a Links Mod:

Some people have been asking for Links to spider sites as well. It is not a modification that Alex wants to add (for good reason), because Links is modeled after Yahoo, and manually adding new listings ensures quality control and reduces spamming. Plus, spidering can create a database that very quickly explodes in size to MBs and even GBs.

My solution is a modified version of the xav script (ftp://ftp.xav.com/search.txt). It works as follows:

1. Upload the spider.cgi mod script.

2. Modify the site_html.cgi script to include a link (text or icon) to spider.cgi?URL=someurl, where someurl is the listing URL, such as http://www.myhome.com.

3. The script spiders the page and displays the results in a new window as linked URLs. However, the spidered pages are not tracked or logged.
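
The flow in the steps above could be sketched roughly like this. It is only an illustration in Python, not the actual spider.cgi (that script is a modified xav Perl script and is not shown in this thread). The names build_spider_link, LinkCollector and extract_links, the sample HTML, and the choice to percent-encode the URL parameter are all assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import quote, urljoin


def build_spider_link(listing_url, script="spider.cgi"):
    """Return the link to place next to a listing, with the URL percent-encoded."""
    return f"{script}?URL={quote(listing_url, safe='')}"


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Parse HTML text and return the absolute URLs it links to, in page order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


if __name__ == "__main__":
    # Hypothetical page content; a real spider would download it first.
    page = '<a href="/models.html">Models</a> <a href="http://www.example.com/">Other</a>'
    print(build_spider_link("http://www.myhome.com"))
    for url in extract_links(page, "http://www.myhome.com"):
        print(url)  # links are only displayed, never saved or logged
```

Because the links are printed and then discarded, nothing accumulates on disk, which matches the "not tracked or logged" behaviour described in step 3.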

I tested it and it works. I have to approach the XAV guys for permission. If I get the okay, the mod of course will be offered free - details to be released later.

Dan :)
Re: Spider Mod In reply to
Hi Dan,
So what I'm guessing is that what you're saying, in its simplest terms for those of us that are less than..... Well, let's just leave it as less than.....
You would have another link on the outputted pages. For example, you have a category called Automotive. On the Automotive page there is a link to Honda, and somewhere in the description or close to that link would be another link that says something like "spider site", which would invoke the spider.cgi program to open a new window and show the results of its spidering of the Honda site.
Is this correct?
What happens if a site doesn't let it spider? Will it return a blank page, return an error message, or do nothing?
I could see the possibilities here, with some kind of tag like, [More from this site] linking to the spider.cgi file.

The new way that we spider to get URLs for our database is to run a metacrawler service. We download the cached results of what people surf for, reformat the results in Excel, and upload them to our data directory as validate.db. Then we go in as usual and pick a category if we're going to keep the site in our listing. To make it simple, since we often enter hundreds of links at a time, we changed the links.def file so that the validate box is pre-checked, the contact name is set to us as internal, and the email is a null aliased email at our domain. This makes entering these things much simpler and quicker; we can often validate about 300 to 400 sites a night before getting really bored and stopping for the night.
Just my two cents for those people trying to figure out how to get links quickly. We signed up with all the big submittal services to receive their clients' listings and have gotten no responses from them, which I guess I'm kinda glad of, since I don't want to have to hand-enter hundreds of sites per day from email into a form.
Anyway, I think spidering the sites for a [More from this site] link is pretty kewl. Let me know if you get it worked out.
Visionary
Re: Spider Mod In reply to
The script follows the robots.txt and meta tag exclusion standards. If spidering is prohibited, the result page will say so in the form of an error message.
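
As a rough illustration of the robots.txt half of that check, here is a minimal sketch using Python's standard urllib.robotparser. It is not the code in spider.cgi; the function name may_spider, the agent name "LinksSpider", the sample rules and the wording of the error message are all assumptions.

```python
from urllib.robotparser import RobotFileParser


def may_spider(robots_txt, url, agent="LinksSpider"):
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical rules; a real spider would fetch /robots.txt from the site.
    rules = "User-agent: *\nDisallow: /private/\n"
    for url in ("http://www.myhome.com/index.html",
                "http://www.myhome.com/private/notes.html"):
        if may_spider(rules, url):
            print("OK to spider:", url)
        else:
            print("Error: spidering of", url, "is prohibited by robots.txt")
```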

Dan :)
Re: Spider Mod In reply to
How can I get the Spider mod so that webmasters can submit their sites automatically, and I will validate them later?

Is there a mod for it?

Thanks

------------------
WebKing
WebKing@trisoft.net
http://www.trisoft.net
My ICQ # 25356171
Re: Spider Mod In reply to
The spider mod does NOT augment the links database. Spidered URLs are not saved. They are simply presented to the surfer when he clicks on a link like "[More...]" that accompanies a Links listing.

It is very simple and avoids the disk space consumption that is common when you spider sites into a database. Of course, spidered URLs are not searchable, as they are NOT compiled. It just gives the surfer a bit more variety and removes the need for a listing submitter to submit more than one page for a given site.

It is NOT an auto-submit mod and does NOT submit URLs.

If and when I get permission, I'll post a URL for the mod, which will include full setup instructions. As I say, it is very simple: it only requires uploading the spider.cgi file and making a quick one-line modification to the site_html.cgi file.

Dan :)
Re: Spider Mod In reply to
Hi,

Can you give me spider.cgi?

[]'s
Jozeph
Re: Spider Mod In reply to
WebKing - The original script at ftp://ftp.xav.com/search.txt is what you need.

jozeph - Sorry, but until I receive express permission from XAV, I cannot make the mod available. To do so would constitute copyright infringement. I have sent off an email and I'm awaiting a reply.

Dan :)
Re: Spider Mod In reply to
I would like to suggest the creation of HTML template pages to make it easier to modify the site, as in the Links 2.0 script.

I sincerely love that script... but it does not do one thing that I want, which Xav II has: the robot...

And Xav II does not have templates... :(( Would there be a way to adapt Xav II for Links 2.0? Or to add templates to Xav II?

Why not rework Xav II and make it easier?

Do you know Yahoo???

I am trying to make it like Yahoo... I am trying to put the Xav II robot inside Links 2.0.
A person would do the following to register a URL:

They choose the category and enter the full name, email, and site address, for example:

Dan Dan dan@dan.com http://dan.com Category = the category is automatic, using HARD CATEGORY.

With this, the robot fetches the rest... the site description, title, etc.
Re: Spider Mod In reply to
I would like to have spider.cgi, and also step-by-step installation instructions.

Is it a mod for auto-submitting web pages to Links?

Thanks!

------------------
WebKing
WebKing@trisoft.net
http://www.trisoft.net
My ICQ # 25356171
Re: Spider Mod In reply to
Hi Guys!

Dan, did you create the spider mod?

I need this mod... ;) Can you send it to my e-mail?

lsaud@manaus.br



------------------
[]'s

Lucas Saud - #19815087
Re: Spider Mod In reply to
Hi:

I have received permission for the mod from XAV. My ISP (@home) has been offline for the past couple days so I'm a tad behind in things but I should have the mod up this weekend - the URL 'will' be www.monster-submit.com/mods02.html .

Dan :)


[This message has been edited by dan (edited March 19, 1999).]
Re: Spider Mod In reply to
LINKS V.2.0 *NO TEMPLATE*
----------------------------

It would be HELPFUL if SPIDER were available for LINKS v2.0. Do you have it?



------------------
WebKing
WebKing@trisoft.net
http://www.trisoft.net
My ICQ # 25356171
Re: Spider Mod In reply to
I know this is the wrong forum for this question. I noticed a comment about robots.txt and could not help but ask a question I have had for a while.

=========
script follows the robots.txt and meta tag exclusion standards.
========
I have submitted my site to Lycos about 50 times now and it has not been indexed. In my log I see "robots.txt not found" entries for Lycos and some other search engines whenever they crawl. The thing is, I do not want to exclude anything from indexing. I have this meta tag in my HTML pages: <META NAME="ROBOTS" CONTENT="ALL">

Now, do I still need this robots.txt file? Does this have anything to do with my site not being indexed, particularly by Lycos? Also, another thing I have noticed is that with most other search engines, when I resubmit I will have, say, 10 or 20 pages indexed. But if I search again in a week or two, I only find one or two pages.

Does anyone know more?


[This message has been edited by socrates (edited March 19, 1999).]
Re: Spider Mod In reply to
As far as I know, if you do not have a robots.txt file, everything is allowed.
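
One way to see this for yourself is with Python's standard urllib.robotparser, which treats an empty rule set as "nothing excluded". This is only a sketch of how one parser behaves, not a statement about how Lycos or any other engine is implemented; the helper name allowed, the agent name "ExampleBot" and the example URL are assumptions.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, url, agent="ExampleBot"):
    """Check `url` against robots.txt text; empty text stands in for a missing file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


NO_FILE = ""                                # site has no robots.txt at all
ALLOW_ALL = "User-agent: *\nDisallow:\n"    # explicit "exclude nothing" file
BLOCK_ALL = "User-agent: *\nDisallow: /\n"  # for contrast: exclude everything

if __name__ == "__main__":
    url = "http://www.example.com/page.html"
    for name, text in (("no file", NO_FILE),
                       ("allow all", ALLOW_ALL),
                       ("block all", BLOCK_ALL)):
        print(name, "->", allowed(text, url))
```

If the "robots.txt not found" entries in the log are a nuisance, uploading the two-line ALLOW_ALL file shown above should make them go away without excluding anything.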

------------------
services.curryguide.com
Re: Spider Mod In reply to
Hi dan!

I need your help to modify the Xavatoria script.

Currently I use this script in a search for files (mp3), but I need help to modify the search results (to print only the file size, title, and address) and to put all admin functions in another file (admin.cgi) and all AddSite functions in add.cgi.

Can you help me?

------------------
[]'s

Lucas Saud - #19815087
Re: Spider Mod In reply to
Hi Lucas:

That is a big job. Sorry but I'm swamped with projects and just don't have the time to take on a new project of this size. Maybe someone else here?

Dan :)
Re: Spider Mod In reply to
Hi dan!

Thanks for the reply.

You don't have time to help me... OK! ;)

In my search I have 1000 sites, and the script searches the file very slowly...

Do you have a mod or tip to make my search run faster? :)

------------------
[]'s

Lucas Saud - #19815087
Re: Spider Mod In reply to
Hi Dan,

Can you give me this modification now? :>

Thanks,
Jozeph
Re: Spider Mod In reply to
Hi,

Dan, What's your email address?

Thanks,
Jozeph
Re: Spider Mod In reply to
Dan,

First off.. Great idea and Thanx.. Second, are you going to be updating this for v2 of Links also?

And lastly, I read on your site.. "Links Spider is absolutely and unconditionally FREE - Freeware script!" Personally (just my opinion, and as you know.. "They are like @sses, everyone has one") I think you should really consider this change...

"Links Spider is absolutely FREE for Non commercial use. "Use" does not constitute legal rights for resale. - Freeware script!"

But hey.. that's just my opinion :P
Re: Spider Mod In reply to
Dan,

Please give me spider.cgi... :)
Re: Spider Mod In reply to
Hi Dan:

Can you send me an e-mail when you upload spider.cgi?

Thanks.


To jozeph:

Why don't you wait for Dan to put the script online? He already said he is having problems with his server, man.. calm down a little, OK? :)

PS: Your site, GlobalMedia, is very cool, but the current design is making the site very slow.... ;)

Try to use fewer images, and don't put many tables inside other tables.

If you like Xavatoria, get in touch with me (lsaud@manaus.br); I made some cool modifications to it.

------------------
[]'s

Lucas Saud - #19815087
Re: Spider Mod In reply to
Jozeph - sorry but the mod script will be available only from www.monster-submit.com/mods02.html . It will be available tomorrow.

Lucas - given how busy I am, I cannot promise an email but it will be available for download tomorrow so return to the URL above tomorrow.

WebKing + Rick - I'll see about modifying the script for Links v2. Shouldn't be too hard. And regarding the freeware note, the script is free for both personal and commercial use, like the XAV script. But I will add "Use does not constitute legal rights for resale". Thanks!

Dan :)
Re: Spider Mod In reply to
Links Spider is now available for download - for both Links 1.0+ and 2.0+.

Consider this a 'beta version', as there are still some bugs to work out. The script works fine but for some reason bypasses the robots.txt exclusion file. Further, when spidering from a 'legal' page (i.e., a page that either omits the meta tag or has it set to allow indexing), the script will also spider pages that use the <meta name="Robots" content="noindex"> meta tag.

I'll try to correct these bugs this week. However, if the page specified in the links database is set to disallow indexing and/or spidering, the script will not spider the page.
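
For anyone curious what the meta tag check involves, here is a minimal sketch in Python using the standard html.parser module. It is not taken from Links Spider; the names RobotsMetaParser and may_index are made up for the example, and treating "none" the same as "noindex" follows the usual meaning of that value rather than anything stated in this thread.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Record the directives of any <meta name="robots" content="..."> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values may be None if omitted.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for item in attr.get("content", "").split(","):
                if item.strip():
                    self.directives.add(item.strip().lower())


def may_index(html):
    """Return False if the page asks not to be indexed, True otherwise."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return not ({"noindex", "none"} & parser.directives)


if __name__ == "__main__":
    print(may_index('<meta name="Robots" content="noindex">'))
    print(may_index('<META NAME="ROBOTS" CONTENT="ALL">'))
    print(may_index("<html><head><title>No tag</title></head></html>"))
```

Running a check like this on every page the spider reaches, not only on the page listed in the links database, is what would close the gap described above.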

Dan :)
Re: Spider Mod In reply to
Hi Dan:

I need your help to add a search results bar to my Xavatoria search.... (like AltaVista's)


Can you help me? (If needed, I will pay you :) )

------------------
[]'s

Lucas Saud - #19815087