Gossamer Forum: General: Perl Programming: copy and past script

Jun 1, 1999, 3:07 AM

Diemo

Novice (29 posts)

Post #1 of 15

copy and past script

Hi,

this is something totally different, but I need the following:

Let's assume there is an HTML page (with no frames, tables, etc.) on the web that updates one paragraph every 5 minutes; the rest stays the same.
I want to copy this paragraph, and only this paragraph, and paste it into a new HTML file.

I looked around but didn't find a script that can do this or can be modified to do so.
Any hints?

Bye Diemo

Jun 1, 1999, 1:29 PM

Alex

Administrator (9387 posts)

Post #2 of 15

Re: copy and past script

Hi,

Something like this should work:

Code:
my $begin = '<p>' 
my $end   = '</p>'; 

use LWP::Simple; 
my $html = get ('http://www.yahoo.com/'); 
if ($html =~ /$begin(.+?)$end/o) { 
  print "Content is: $1"; 
} 
else { 
  print "Couldn't find the info!"; 
}

What this does is go to www.yahoo.com, retrieve the whole html page and store it in $html. You then look for the begin and end markers of the section you want and the info will be in $1 if found.
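
A self-contained sketch of the same idea, run against a hard-coded string instead of a live page so it can be tried offline. The helper name extract_between and the sample HTML are made up for illustration; in a real script the string would come from LWP::Simple's get. Two details are assumptions about what is wanted: \Q...\E quotes the markers so characters such as ? or ( are matched literally, and /s lets . match newlines so the section may span several lines.

```perl
use strict;
use warnings;

# Return the text between $begin and $end, or undef if a marker is missing.
sub extract_between {
    my ($html, $begin, $end) = @_;
    # \Q...\E quotes regex metacharacters in the markers;
    # /s lets "." match newlines, so the section can span several lines.
    return $html =~ /\Q$begin\E(.+?)\Q$end\E/s ? $1 : undef;
}

# Invented stand-in for the downloaded page. In a real script this would be:
#   use LWP::Simple;  my $html = get($url);
my $html = "<html><body>\n<h1>News</h1>\n<p>Updated at\n12:05</p>\n</body></html>";

my $section = extract_between($html, '<p>', '</p>');
print defined $section
    ? "Content is: $section\n"
    : "Couldn't find the info!\n";
```

Returning undef for "not found" keeps an empty-looking result distinct from a missing marker, which makes the else branch easier to trust.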

Hope this helps,

Alex

Jun 1, 1999, 4:13 PM

Jimz

User (207 posts)

Post #3 of 15

Re: copy and past script

Wow, LWP is much easier than using sockets. I should look into using it :)

------------------

Jun 1, 1999, 9:00 PM

mellinger

User (77 posts)

Post #4 of 15

Re: copy and past script

Alex,
Would there be a way to modify this so that it takes the links from a Yahoo category, puts them in the Links database, and formats them the way Links needs them?
Michael

Jun 1, 1999, 10:21 PM

Diemo

Novice (29 posts)

Post #5 of 15

Re: copy and past script

Hi Alex,

thanks for your help. Please don't call me an idiot, but I am just very new to this scripting language. I tried to use your sample, but it won't work: internal server error. Here is what I have done:

#!/usr/bin/perl
my $begin = '<html>'
my $end = '</html>';
use LWP::Simple;
my $html = get ('http://www.yahoo.com/');
if ($html =~ /$begin(.+?)$end/o) {
print "Content is: $1";
}
else {
print "Couldn't find the info!";
}

To keep it simple, I once again use Yahoo as the example.
This should normally capture the whole page, right? It would be very helpful if you could write down the whole script.

Thanks so much ...

Diemo

Jun 1, 1999, 10:31 PM

dan

Enthusiast (760 posts)

Post #6 of 15

Re: copy and past script

Quick fix:

1. Add the following line before the if...else control block:

print "Content-type: text/html\n\n";

2. Change:

my $begin = '<html>'

to

my $begin = '<html>';

Also make sure you saved and uploaded the script in ASCII mode and set its file permissions (chmod) to 0755, i.e. rwxr-xr-x.

Dan :)

Jun 1, 1999, 10:48 PM

Diemo

Novice (29 posts)

Post #7 of 15

Re: copy and past script

Hi Dan and Alex,

OK, now the server error is gone. Nevertheless, the script doesn't seem to find the begin/end markers and prints "Couldn't find the info!". Any ideas?

Here it is:
#!/usr/bin/perl
my $begin = '<html>';
my $end = '</html>';
use LWP::Simple;
my $html = get ('http://www.yahoo.com/index.html');
print "Content-type: text/html\n\n";
if ($html =~ /$begin(.+?)$end/o) {
print "Content is: $1";
}
else {
print "Couldn't find the info!";
}

Bye Diemo

Jun 2, 1999, 5:01 AM

Diemo

Novice (29 posts)

Post #8 of 15

Re: copy and past script

Me again,

I saw another solution like:

#!/usr/local/bin/perl
use LWP::Simple;
#retrieve the page which should be modified
my $page = get("add URL");
$page =~ s/\n//g;
#everything above this will be cut off
$page =~ s/^.*<\/head>//is;
#everything below this will be cut off
$page =~ s/<\/body>.*$//is;
print "Content-type: text/html\n\n";
#print result
print "$page";

But since I don't know anything about the syntax, I can only get this script to work with HTML tags such as <head> and <body> as the limits.
I would like to use any phrase in the HTML as the upper and lower limit. I know the answer lies in the part of the pattern that looks like <\/head>//

but ...

PLEASE HELP ...

Bye Diemo
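
One possible way to use arbitrary phrases as the upper and lower limits, sketched under a few assumptions: the helper name trim_to_section, the two marker phrases, and the sample page are all invented for illustration. The markers are wrapped in \Q...\E so that any punctuation in them is matched literally, and the non-greedy .*? cuts at the first occurrence of the upper phrase.

```perl
use strict;
use warnings;

# Keep only what lies between two arbitrary phrases; the phrases are removed.
# If a phrase is not found, that substitution simply leaves the text as is.
sub trim_to_section {
    my ($page, $upper, $lower) = @_;
    $page =~ s/\A.*?\Q$upper\E//s;   # cut everything up to and including $upper
    $page =~ s/\Q$lower\E.*\z//s;    # cut from the next $lower to the very end
    return $page;
}

# Invented sample page; a real script would fetch it with LWP::Simple's get().
my $page = "<html><head><title>t</title></head>\n"
         . "<body>Intro text\nSTART OF NEWS\nThe changing paragraph.\n"
         . "END OF NEWS\nFooter</body></html>";

print trim_to_section($page, 'START OF NEWS', 'END OF NEWS');
```

Unlike the script above, this does not strip newlines first; the /s modifier lets . match across line breaks instead.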

Jun 2, 1999, 6:44 AM

Alex

Administrator (9387 posts)

Post #9 of 15

Re: copy and past script

[pet peeve]You should learn Perl before learning CGI. Regarding the 500 server error question: a 500 server error does not mean the script didn't work; it means the script didn't output the proper headers to the web server. Not all Perl scripts are CGI scripts.[/pet peeve]

As for the problem: to help with debugging, change the "Couldn't find" line to:

print "Couldn't find info. Found this instead: $html";

then you can see what's happening. I suspect it's a case sensitivity issue, where they use <HTML> and you use <html>. You can either change your tags, or add a /i to the regular expression.
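
To see which modifier a page needs, it may help to try the pattern against a small test string first. The sketch below assumes a page with upper-case tags and content spread over several lines; the sample string is invented and is not Yahoo's actual HTML. Besides case, note that . does not match a newline unless /s is given, so the pattern can fail on a multi-line page even when the tags are spelled correctly.

```perl
use strict;
use warnings;

# Invented sample page: upper-case tags, content spread over several lines.
my $html = "<HTML>\n<BODY>\nHello\n</BODY>\n</HTML>\n";

my $begin = '<html>';
my $end   = '</html>';

# Try the same pattern with each combination of modifiers.
my %matched = (
    'no modifiers' => ($html =~ /$begin(.+?)$end/   ? 1 : 0),
    '/i'           => ($html =~ /$begin(.+?)$end/i  ? 1 : 0),
    '/s'           => ($html =~ /$begin(.+?)$end/s  ? 1 : 0),
    '/is'          => ($html =~ /$begin(.+?)$end/is ? 1 : 0),
);

for my $mods ('no modifiers', '/i', '/s', '/is') {
    printf "%-12s %s\n", $mods, $matched{$mods} ? 'match' : 'no match';
}
```

With this sample, only the /is variant matches: /i alone still stops at the first newline, and /s alone still trips over the case of the tags.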

Hope this helps,

Alex

Jun 2, 1999, 7:50 AM

Diemo

Novice (29 posts)

Post #10 of 15

Re: copy and past script

To ALEX,

Hi,

the server error was caused by one little mistake you made, which Dan has already corrected.
Nevertheless, the $html variable contains exactly the same as the original page. Therefore I thought that the problem with your script was caused by the if-else block?! I checked for case sensitivity, but that was OK.

Bye Diemo

Jun 6, 1999, 3:01 PM

Pasha

User (214 posts)

Post #11 of 15

Re: copy and past script

mellinger,

That would be "stealing".
I don't think Yahoo will let anyone do it without taking legal action.

Regards,

Pasha

------------------
webmaster@find.virtualave.net
http://find.virtualave.net

Jun 7, 1999, 12:16 PM

mellinger

User (77 posts)

Post #12 of 15

Re: copy and past script

Pasha,
Sorry, I seriously thought you could do this. Sorry again.
Michael

Jun 8, 1999, 9:37 AM

Diemo

Novice (29 posts)

Post #13 of 15

Re: copy and past script

Hi mellinger,

I don't know Links, but I think what you want to do is here:
http://www.gossamer-threads.com/scripts/misc/altavista.cgi

Bye Diemo

Jun 9, 1999, 9:23 PM

Cade

Novice (10 posts)

Post #14 of 15

Re: copy and past script

Mellinger, another option would be to pull links from the Open Directory Project. Granted, it's not Yahoo!, but it seems to be growing faster than Yahoo! thanks to its volunteer editors.
You can find a couple of scripts to do it at cgi-resources.com. Lycos and, I believe, Excite currently pull links from it to supplement their spidered indexes.

Jun 23, 1999, 12:51 AM

selgreg

New User (1 post)

Post #15 of 15

Re: copy and past script

Hi, All.

As far as I understand, I should copy Simple.pm to my Perl directory. Am I right? Or can I just copy it to cgi-bin?

And instead of using LWP/Simple.pm, how would I use sockets?

Sincerely yours,

------------------
=========================
Seleznev Gregory
http://come.to/sgdesign
devil@quake.ru
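
On the sockets question, a rough sketch using the core IO::Socket::INET module might look like the following. The sub names build_request, split_response and fetch_body are invented for illustration, and the sketch assumes plain HTTP on port 80 with no redirects, no chunked encoding and no HTTPS, which is much of what LWP otherwise handles for you.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Build a minimal HTTP/1.0 GET request. HTTP/1.0 is used so the server
# closes the connection when done and we can simply read to end of file.
sub build_request {
    my ($host, $path) = @_;
    return "GET $path HTTP/1.0\r\nHost: $host\r\n\r\n";
}

# Split a raw response into header block and body at the first blank line.
sub split_response {
    my ($raw) = @_;
    my ($headers, $body) = split /\r?\n\r?\n/, $raw, 2;
    return ($headers, defined $body ? $body : '');
}

# Fetch a page over a plain TCP socket; returns the body, or nothing on failure.
sub fetch_body {
    my ($host, $path) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => 80,
        Proto    => 'tcp',
        Timeout  => 10,
    ) or return;
    print {$sock} build_request($host, $path);
    my $raw = do { local $/; <$sock> };   # slurp until the server closes
    close $sock;
    return unless defined $raw;
    my (undef, $body) = split_response($raw);
    return $body;
}

# Example call (the host name is a placeholder):
#   my $html = fetch_body('www.example.com', '/');
```

The returned body can then be searched with the same regular expressions used with LWP::Simple earlier in the thread.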