Gossamer Forum
Home : Products : Links 2.0 : Customization

how to gather URLS for database

how to gather URLS for database
Does anyone have a good idea how to gather all the URLs from the links db easily? Hopefully in this format:

http://www.foo.bar

http://www.foo2.bar

So, one address per page.

I know there is an email address collector, but is there also a URL collector somewhere?
Re: [jansku] how to gather URLS for database In reply to
What do you mean? Something like http://www.linkssuite.com/linkssuite.htm ????

Andy (mod)
andy@ultranerds.co.uk
Want to give me something back for my help? Please see my Amazon Wish List
GLinks ULTRA Package | GLinks ULTRA Package PRO
Links SQL Plugins | Website Design and SEO | UltraNerds | ULTRAGLobals Plugin | Pre-Made Template Sets | FREE GLinks Plugins!
Re: [AndyNewby] how to gather URLS for database In reply to
Not really; there was an error in my previous post. I meant one URL per row, not one per page.

I have approx. 700 URLs in my Links database, and I would just like to collect them in a clean format so I can use them in search engine software which I have already installed (mnoGoSearch).

Of course I can copy/paste them from the db to a text file manually, but if there is any other way I would appreciate hearing it.
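For what it's worth, if links.db is the usual pipe-delimited flat file, a command-line sketch like this could pull the URL column straight out. This is untested against a live install: the field number depends on the column order in your links.def, and the sample records below are made up.

```shell
# Two made-up records in the assumed default Links 2.0 layout:
# ID|Title|URL|...  -- so the URL is the third pipe-delimited field.
printf '1|Foo|http://www.foo.bar|rest\n2|Foo2|http://www.foo2.bar|rest\n' > links.db

# Extract just the URL column, one address per line.
cut -d'|' -f3 links.db > urls.txt
cat urls.txt
```

Adjust -f3 if your links.def puts the URL in a different column.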
Re: [jansku] how to gather URLS for database In reply to
So you want to copy the links.db over to a new format for this other bit of software? Sounds a bit cheeky if you ask me ;) Post the new format and the links.db format, and I'll see what I can come up with to create a new database for the other script (GT won't be happy :P).

Andy (mod)
Re: [AndyNewby] how to gather URLS for database In reply to
Why is it cheeky?
Re: [RedRum] how to gather URLS for database In reply to
I dunno :P Just thought it sounded a bit cheeky...

Andy (mod)
Re: [AndyNewby] how to gather URLS for database In reply to
There are a lot of portals around the world which have both a link index and a web index.

Since Links will not spider those sites, it's better to use other software for that purpose.
Re: [jansku] how to gather URLS for database In reply to
As I said:

Quote:
Post the new format and the links.db format and I'll see what I can come up with to create a new database for the other script

;)

Andy (mod)
Re: [AndyNewby] how to gather URLS for database In reply to
I actually just need the URLs, nothing else.

One URL per row; then I will copy them into the other software's config file and it will crawl all the addresses into an SQL database. There is no need to convert anything 8)
Re: [jansku] how to gather URLS for database In reply to
In url.db you will get all the URLs listed on your site, one URL per line.
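Assuming url.db really does hold one ID|URL pair per line (worth checking against your own file before trusting this), stripping the IDs is a one-liner. A sketch with made-up sample data:

```shell
# Made-up sample lines in the assumed ID|URL layout:
printf '26|http://www.foo.bar\n27|http://www.foo2.bar\n' > url.db

# Print everything after the first pipe, one URL per line.
awk -F'|' '{print $2}' url.db > urls.txt
cat urls.txt
```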
Megrisoft
Web Hosting Company
India Software Company
SEO Company


Re: [megri] how to gather URLS for database In reply to
*blush* You are right. The only thing to do is to remove the ID numbers.

Thanks a lot. I never even thought there was such a database available. Shame on me.
Re: [jansku] how to gather URLS for database In reply to
If you want a simple script to remove the IDs for you, try:

Code:
#!/usr/bin/perl

open(IN, "links.db") || &error("Unable to read database...reason: $!");
open (NEW, ">>links.new") || &error("Unable to create new file. Reason: $!");
while (<IN>) {

my ($1, $2) = split(/|/, $_);
print NEW "$2\n";

}

print "Content-type: text/html \n\n";
print "Done updating link.new file";

sub error {

my $error= shift;
print "Content-type: text/html \n\n";
print $error;
exit;

}

This is completely untested... so don't expect it to work perfectly. Could just save you quite a while over removing all of the link ID numbers manually :P

Andy (mod)
Re: [AndyNewby] how to gather URLS for database In reply to
That won't work unfortunately.

Try:

Code:
#!/usr/bin/perl

open(IN, "<links.db") || &error("links.db: $!");
open (NEW, ">links.new") || &error("links.new: $!");
while (<IN>) {
chomp;
/^(\d+)\|(.*?)$/ and print NEW "$2\n";
}

print "Content-type: text/html \n\n";
print "Done updating link.new file";

sub error {
print "Content-type: text/html \n\n";
print shift;
exit;
}

Last edited by RedRum: Feb 5, 2002, 3:02 PM
Re: [RedRum] how to gather URLS for database In reply to
Quote:
That won't work unfortunately.

Why not?

Andy (mod)
Re: [AndyNewby] how to gather URLS for database In reply to
Code:
open (NEW, ">>links.new")

>> requires that the file exists, and appends to it.

You're also adding an extra \n after every line for no reason. Paul chomps the line and then appends a \n, which isn't much better, because there is no reason to remove the \n in the first place.

--Philip
Links 2.0 moderator
Re: [King Junko II] how to gather URLS for database In reply to
Oops... that's one of the downsides of switching between PHP and Perl :P

Andy (mod)
Re: [King Junko II] how to gather URLS for database In reply to
Well, I'm afraid you are both wrong.

Using >> will work fine, as it creates and/or appends.

Secondly, I chomped on purpose for consistency (knowing that I didn't need to); it makes sure that nothing gets messed up when I reprint.

Thirdly, Andy's split(/|/, $_) line was wrong.

So there! 8)

Last edited by RedRum: Feb 6, 2002, 2:33 AM
Re: [RedRum] how to gather URLS for database In reply to
Quote:
Thirdly Andy's split(/|/, $_) line was wrong.
Why won't that work?

Andy (mod)
Re: [AndyNewby] how to gather URLS for database In reply to
Because / / is the same as a regex (m//), and | is used in regexes to separate alternatives, it will be interpreted that way; if you want to use it literally, it must be escaped, e.g. split /\|/;. Also, as you can see, $_ can be omitted.

m/^(a|b|c)$/;

split /|/; # Bad

split /\|/; # Good

Last edited by RedRum: Feb 6, 2002, 3:32 AM