Gossamer Forum

How To Get The Page Title

I know I can pick up a web page's address using $ENV{'HTTP_REFERER'}, but how do I get the title of a page? For example, I have a script that I want to link to from various pages that have various titles. Is there any way to pass the title of the page to the script?

To elaborate, I need to pick up what is between the <title> and </title> tags. Is there a way?

[This message has been edited by Bobsie (edited June 04, 1999).]
Re: How To Get The Page Title
Nothing that gets passed into the script automatically would help. You would have to somehow pass the page title into the script, perhaps through the URL.

If you describe a little bit more how the script will work and what sort of control you have on the pages that are calling it, that might help.

Cheers,

Alex
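One way to read "pass the page title through the URL" is to put the title in a query parameter on the link and decode it inside the script. A minimal sketch, assuming a parameter named title (a hypothetical name, not something birdcast.cgi defines) and a hand-rolled decoder rather than any particular CGI module:

```perl
use strict;
use warnings;

# Hypothetical link on the calling page:
#   <a href="/cgi-bin/birdcast.cgi?title=My%20Links%20Page">Recommend This Page</a>

# Return the decoded value of one query-string parameter, or undef if absent.
sub query_param {
    my ($query, $name) = @_;
    for my $pair (split /[&;]/, defined $query ? $query : '') {
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && $key eq $name;
        $value = '' unless defined $value;
        $value =~ tr/+/ /;                              # '+' encodes a space
        $value =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # undo %XX escapes
        return $value;
    }
    return;
}

my $title = query_param($ENV{QUERY_STRING}, 'title');
$title = 'No title found' unless defined $title && length $title;
print "$title\n";
```

The catch is that every calling page has to carry its own title in the link, so this only helps when you control those pages.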
Re: How To Get The Page Title
Bobsie,

You could try downloading the remote HTML file into memory, then searching the first five lines for the title tags.

Regards,

Pasha

------------------
webmaster@find.virtualave.net
http://find.virtualave.net
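The download-and-search idea could look roughly like the sketch below. It assumes the HTTP::Tiny module (bundled with Perl 5.14 and later) for the fetch, and it searches the whole document rather than only the first five lines; the sub names extract_title and remote_title are made up for the example.

```perl
use strict;
use warnings;
use HTTP::Tiny;   # ships with Perl 5.14 and later

# Pull the text between <title> and </title> out of an HTML string.
sub extract_title {
    my ($html) = @_;
    return unless defined $html;
    # /i: tags may be upper case; /s: the title may span several lines
    return unless $html =~ m{<title[^>]*>(.*?)</title>}is;
    my $title = $1;
    $title =~ s/\s+/ /g;      # collapse line breaks and runs of spaces
    $title =~ s/^ | $//g;     # trim the ends
    return $title;
}

# Fetch a page over HTTP and return its title, or undef on any failure.
sub remote_title {
    my ($url) = @_;
    my $response = HTTP::Tiny->new(timeout => 10)->get($url);
    return unless $response->{success};
    return extract_title($response->{content});
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $title = remote_title($ARGV[0]);
    print defined $title ? "$title\n" : "No title found\n";
}
```

Fetching over HTTP costs a request per lookup, but unlike reading the file from disk it would also work for pages on another server.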
Re: How To Get The Page Title
Alex and Pasha,

These are my own pages and a script running on my own server. The script is birdcast.cgi, which I have set up as a link in the site menu on each of the Links category pages. It is just a plain link to the script. The script gets the address of the page from $ENV{'HTTP_REFERER'} but not the title of the page. I haven't been able to figure out how to send the title to the script or have the script ascertain the title once it is running.

So, when I display the HTML page that contains the "Recommend This Page" form, all I can display for the page being recommended is the URL. I would rather display a Hyperlink anchor with the Title being displayed and the URL used as the link.

Any ideas?
Re: How To Get The Page Title
Since they are all on your server, you could open the file up and grab the title. Something like:

Code:
use URI::URL;

# Filesystem directory that the web server serves pages from
my $doc_root = '/path/to/webserver/docroot';

# Reduce the referring URL to its path part
my $url = new URI::URL $ENV{'HTTP_REFERER'};
my $path = $url->path;

($path and (-e "$doc_root/$path")) or die "Unknown file: $path";

# Scan the file a line at a time and stop at the first <title>...</title>
my $title;
open (FILE, "$doc_root/$path") or die "open: $!";
while (<FILE> ) {
    m,<title>(.+?)</title>, or next;
    ($title = $1) and last;
}
close FILE;
$title ||= 'No title found';

Let me know if that doesn't make sense. It's untested and may need some fine tweaking, but should be close.

Cheers,

Alex
Re: How To Get The Page Title
Alex,

Thanks a lot! It works perfectly. Once again, I am in your debt.

Bob
Re: How To Get The Page Title
Bobsie,

Where in the script did you add the above code? Did you make any additional modifications? Would you please pass on the instructions?

Greatly appreciated.
Re: How To Get The Page Title
I haven't actually installed it in birdcast.cgi in a live environment yet. I did get it working though, enough so that I can begin a detailed install of birdcast.cgi v2. But, to answer your question, I put it at the end of sub decode_vars and I modified it a bit to use $site_title instead of $title.

Then I modified the line in sub draw_request that used to read:

Quote:
<A HREF="$ENV{'HTTP_REFERER'}">$ENV{'HTTP_REFERER'}</A>

to read:

Quote:
<A HREF="$ENV{'HTTP_REFERER'}">$site_title</A>

It works every time for me.

One caveat that I learned real fast though... do not attempt to recommend a CGI-generated dynamic page. It will not work, because the code reads the page as a file under the document root. The pages to recommend must be static.

[This message has been edited by Bobsie (edited June 09, 1999).]
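Since both the referring URL and the page title come from outside the script, it may be worth escaping them before they go into the anchor. A minimal sketch, assuming two helper subs (escape_html and referer_link are hypothetical names, not part of birdcast.cgi):

```perl
use strict;
use warnings;

# Replace the characters that are special in HTML text and attribute values.
sub escape_html {
    my ($text) = @_;
    $text = '' unless defined $text;
    $text =~ s/&/&amp;/g;     # must come first
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Build the anchor; $site_title is the variable name used in the post above.
sub referer_link {
    my ($url, $site_title) = @_;
    return sprintf '<A HREF="%s">%s</A>',
        escape_html($url), escape_html($site_title);
}

print referer_link('http://www.example.com/links/?a=1&b=2', 'Cats & "Dogs"'), "\n";
```

In sub draw_request the call would then be something like referer_link($ENV{'HTTP_REFERER'}, $site_title) in place of the hand-written anchor.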
Re: How To Get The Page Title
Bobsie,

Thanks for the info.
Re: How To Get The Page Title
Hi all!

I tried to install birdcast.cgi 2.0 and Bobsie's mod, and everything runs smoothly, except that the site title, URL, and description INSIDE the sent message are empty!


Please help!
Check www.camcities.com/ to try for yourself!


------------------
Alex Tutusaus
Atyc WebDesigns
http://www.webcamworld.com/
Re: How To Get The Page Title
Alex,

This discussion had nothing to do with the Recommend A Link mod I wrote. This was about modifying the original birdcast.cgi "Recommend a Page" script to show the page title instead of just the URL of the page being recommended.

Besides, you already emailed me about your problems with my mod (and I have already replied).
Re: How To Get The Page Title
"while (<FILE> ) {
m,<title>(.+?)</title>, or next;
($title = $1) and last;
}"

Beginner here. We are looking for a match, what is the range of "(.+?)", characters, lines or what? If not multiple lines what would I use for that?
Re: How To Get The Page Title
As I understand the code, the range is everything between <title> and </title> even if it is on multiple lines. Although, using multiple lines for a title is really not a good idea since the browser makes it all one line anyway (at least, Netscape does).
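For what it is worth, "." in a Perl pattern does not match a newline unless the /s modifier is given, and the while (<FILE>) loop hands the pattern one line at a time, so the quoted loop can only find a title whose opening and closing tags sit on the same line. A small sketch that checks this on an in-memory string (the sample HTML is made up):

```perl
use strict;
use warnings;

my $html = "<html><head>\n<title>My\nLinks Page</title>\n</head></html>\n";

# Line by line, as in the quoted loop: no single line holds both tags.
my $per_line;
for my $line (split /^/, $html) {
    if ($line =~ m,<title>(.+?)</title>,) { $per_line = $1; last; }
}

# Whole document at once, with /s so that '.' also matches newlines.
my $whole;
if ($html =~ m,<title>(.+?)</title>,s) {
    ($whole = $1) =~ s/\s+/ /g;   # fold the line break into a space
}

print defined $per_line ? "per line: $per_line\n" : "per line: no match\n";
print defined $whole    ? "whole:    $whole\n"    : "whole:    no match\n";
```

To get a whole file into one string in the first place, setting the input record separator to undef (local $/;) before reading from the filehandle is the usual approach.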
Re: How To Get The Page Title
Dang etc. Finally found a search and replace .pl script and I can use (.+?) as it is being used in <TITLE>(.+?)</TITLE> (typically a single line) but I want all the text/lines between <FORM> and </FORM> and it doesn't work there.

...Perl's default is to match within a single line, but there are other things going on as well. If I understand it, the following script reads the file line by line, so any match or substitution is going to be limited to a single line, which might explain why all my experiments with /s, /m, (?s), (?m), etc. haven't worked. Any ideas?

Code:
open ($file, "<$file") || &CgiDie ("unable to open $file");
@LINES = <$file>;

foreach $line (@LINES){
    $line =~ s/$item_to_change/$new_line/g && $edit_count++;
    $new_data .= "$line";
}
close ($file);

s/$item_to_change/$new_line/g is where the form input variables come into play. I have successfully used (.+?) in $item_to_change but it doesn't always work.

[This message has been edited by Dave (edited June 27, 1999).]
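One possible way around the line-by-line limit is to read the file into a single string and run the substitution once with /s, so that (.+?) or .*? can cross line breaks. A minimal sketch under those assumptions; the subs replace_all and slurp and the sample page are made up for illustration, and slurp is shown only to indicate how the string would come from a file:

```perl
use strict;
use warnings;

# Replace every match of $pattern in $text and return (new text, count).
# /s lets '.' cross line breaks; /i ignores the case of the tags.
sub replace_all {
    my ($text, $pattern, $replacement) = @_;
    my $count = ($text =~ s/$pattern/$replacement/gis);
    return ($text, $count || 0);
}

# Read a whole file into one string instead of a list of lines.
sub slurp {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "unable to open $path: $!";
    local $/;                 # undef record separator: read to end of file
    my $data = <$fh>;
    close $fh;
    return $data;
}

my $page = "<p>before</p>\n<FORM action=\"x\">\n<input name=\"q\">\n</FORM>\n<p>after</p>\n";
my ($new_data, $edit_count) =
    replace_all($page, '<FORM.*?</FORM>', '<!-- form removed -->');
print $new_data;
print "edits: $edit_count\n";
```

Note that /m would not help here: it only changes where ^ and $ match, while /s is the modifier that changes what "." matches. Neither has any effect if the pattern never sees more than one line at a time.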