Gossamer Forum

wget - good way to grab a "list" of files?

Hi,

I'm trying to do something for one of my sites, which "grabs" a large number of images (around 100,000). At the moment, it does something like:

Code:
while (<>) {
    # backticks start a separate wget process for every input line
    `wget -OID.gif 'http://www.domain.com/somewhere/image1234.gif'`;
}

Basically, I'm wondering if there is a better way to grab them (all at once), e.g. from a delimited list, or even a .sh file:

Quote:
wget -OID.gif 'http://www.domain.com/somewhere/image12347.gif'
wget -OID1.gif 'http://www.domain.com/somewhere/image12345.gif'
wget -OID2.gif 'http://www.domain.com/somewhere/image12344.gif'
wget -OID3.gif 'http://www.domain.com/somewhere/image12343.gif'
wget -OID4.gif 'http://www.domain.com/somewhere/image12342.gif'
wget -OID5.gif 'http://www.domain.com/somewhere/image12341.gif'
...etc

Currently, the wget method is taking too long .. especially with the number of images I'm trying to grab.

TIA

Andy (mod)
andy@ultranerds.co.uk
Want to give me something back for my help? Please see my Amazon Wish List
GLinks ULTRA Package | GLinks ULTRA Package PRO
Links SQL Plugins | Website Design and SEO | UltraNerds | ULTRAGLobals Plugin | Pre-Made Template Sets | FREE GLinks Plugins!
Re: [Andy] wget - good way to grab a "list" of files?
The backtick operator spawns a new process for every image.

Why not use a pure Perl solution, which doesn't start another process?

For example, the LWP module.

Best regards,
Webmaster33


Paid Support
from Webmaster33. Expert in Perl programming & Gossamer Threads applications. (click here for prices)
Webmaster33's products (upd.2004.09.26) | Private message | Contact me | Was my post helpful? Donate my help...
Re: [webmaster33] wget - good way to grab a "list" of files?
Hi,

Yeah, I tried using LWP::Simple::get(), but it's almost as slow as wget :/

Surely there must be something where you can enter a list of URLs in a text file, and then have something go through and grab them?

Cheers

Andy (mod)
Re: [Andy] wget - good way to grab a "list" of files?
LWP::Parallel might be your friend:
http://search.cpan.org/.../lib/LWP/Parallel.pm

Best regards,
Webmaster33


Re: [Andy] wget - good way to grab a "list" of files?
Do you have a solution yet? If not, did you ever consider 'man 1 wget'?

Specifically
wget -i some/file

where 'some/file' contains a list of URLs.
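
As a rough illustration of the `wget -i` approach, here is a minimal sketch that writes a URL list to a file and passes it to a single wget run. The image IDs and the `RUN_DOWNLOAD` switch are made up for the example, and the URL pattern is just the placeholder from the first post; by default the script only prints the list and does not touch the network.

```shell
#!/bin/sh
# Sketch: build a URL list file, then hand the whole list to a single wget run.
# The image IDs and the RUN_DOWNLOAD switch are hypothetical; the URL pattern
# is the placeholder one used earlier in the thread.
list=$(mktemp)

for id in 12347 12345 12344; do
    printf 'http://www.domain.com/somewhere/image%s.gif\n' "$id"
done > "$list"

cat "$list"    # show what wget would be asked to fetch

# One wget process reads every URL from the file.
if [ "${RUN_DOWNLOAD:-0}" = 1 ]; then
    wget -i "$list"
fi
```

Running it with `RUN_DOWNLOAD=1` set would start the actual download. One thing to be aware of: `-O` does not combine well with a list, since wget then writes every document into that one output file, so with `-i` the files keep their remote names.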
Re: [mkp] wget - good way to grab a "list" of files?
Hi,

That worked a charm - thanks!

I needed to grab 600,000 images, so running 600k separate queries was impractical (not that this way makes it much better, but still <G>).

I'm separating them into:

/images/a
/images/b
/images/c
/images/d
/images/e
..etc

..based on the first character of the image name :)
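
One possible way to do that kind of split, sketched in shell: move each downloaded .gif into a subdirectory named after the first character of its file name. The function name `bucket_images`, the directory names, and the demo files are all hypothetical; the post does not say how the split was actually done.

```shell
#!/bin/sh
# Sketch: move each .gif from a flat directory into a subdirectory named
# after the first character of its file name. All names are hypothetical.
bucket_images() {
    src=$1
    dest=$2
    for path in "$src"/*.gif; do
        [ -e "$path" ] || continue          # no matches: the glob stays literal
        name=${path##*/}                    # strip the directory part
        first=$(printf '%s' "$name" | cut -c1)
        mkdir -p "$dest/$first"
        mv "$path" "$dest/$first/$name"
    done
}

# Demo on a throwaway directory with three empty files.
work=$(mktemp -d)
mkdir "$work/incoming"
touch "$work/incoming/apple.gif" "$work/incoming/avocado.gif" "$work/incoming/banana.gif"
bucket_images "$work/incoming" "$work/images"
ls "$work/images"
```

Keeping each directory to a fraction of the 600k files is the point of the split; very large flat directories tend to be slow to list and awkward to work with.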


Thanks again.

Cheers

Andy (mod)