Gossamer Forum

Extracting text from a webpage

I'm writing a Perl subroutine or global to extract a text sample from a webpage. It would supplement the "Description" meta tag and similar info.

Specifically, using Links SQL 2.12, how would I extract a digest of the first 200 text characters of a web page, broken on a word boundary and stripped of HTML code?
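A minimal sketch of the kind of extraction described above, in plain Perl with no extra modules. The subroutine name page_extract and its arguments are hypothetical, and it assumes the page HTML has already been fetched into a string. Stripping tags with regular expressions is only an approximation; a real parser such as HTML::Parser would cope better with malformed markup.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Links SQL): returns at most $max
# characters of visible text from $html, cut on a word boundary.
sub page_extract {
    my ($html, $max) = @_;
    $max = 200 unless defined $max;
    return '' unless defined $html;

    my $text = $html;
    $text =~ s/<!--.*?-->//gs;                                  # drop comments
    $text =~ s/<(script|style|head)\b[^>]*>.*?<\/\1\s*>//gis;   # drop non-visible blocks
    $text =~ s/<[^>]*>/ /g;                                     # drop remaining tags

    # Decode a handful of common entities (assumption: enough for a digest).
    $text =~ s/&nbsp;/ /gi;
    $text =~ s/&lt;/</gi;
    $text =~ s/&gt;/>/gi;
    $text =~ s/&quot;/"/gi;
    $text =~ s/&amp;/&/gi;                                      # last, to avoid double decoding

    # Collapse whitespace and trim both ends.
    $text =~ s/\s+/ /g;
    $text =~ s/^ //;
    $text =~ s/ $//;

    return $text if length($text) <= $max;

    # Look one character past the limit so a word ending exactly at
    # $max is kept whole, then cut back to the last space.
    my $cut = substr($text, 0, $max + 1);
    return $cut if $cut =~ s/\s\S*\z//;

    # No space at all within the limit: fall back to a hard cut.
    return substr($text, 0, $max);
}

my $sample = '<html><head><title>T</title></head>'
           . '<body><h1>Hello</h1><p>world &amp; friends</p></body></html>';
print page_extract($sample, 11), "\n";
```

Storing the result in a custom field such as "page_extract" is left out here, since that step depends on the Links SQL database API.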

I will need this to run through all valid links with a "200" (good page) status and insert the result into the links database in a custom field, let's say "page_extract".
The same goes for the Title and Description meta tags, and email addresses, but there may already be code examples for those ...
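For the Title and Description meta tags, a rough sketch along the same lines might look like this. The names page_title and meta_description are hypothetical, the HTML is assumed to be in a string already, and the regular expressions only handle the common attribute forms (for example, a ">" inside a content value would confuse them).

```perl
use strict;
use warnings;

# Hypothetical helper: text of the first <title> element, or undef.
sub page_title {
    my ($html) = @_;
    return undef unless defined($html)
        && $html =~ m{<title\b[^>]*>(.*?)</title\s*>}is;
    my $title = $1;
    $title =~ s/\s+/ /g;      # collapse runs of whitespace
    $title =~ s/^ | $//g;     # trim both ends
    return $title;
}

# Hypothetical helper: content of <meta name="description">, or undef.
sub meta_description {
    my ($html) = @_;
    return undef unless defined $html;
    while ($html =~ /<meta\b([^>]*)>/gi) {
        my $attrs = $1;
        # Only look at the meta tag whose name is "description".
        next unless $attrs =~ /\bname\s*=\s*["']?description["']?/i;
        return $1 if $attrs =~ /\bcontent\s*=\s*"([^"]*)"/i;
        return $1 if $attrs =~ /\bcontent\s*=\s*'([^']*)'/i;
    }
    return undef;
}

my $page = '<html><head><title> My  Page </title>'
         . '<meta name="description" content="A short summary."></head></html>';
print page_title($page), "\n";
print meta_description($page), "\n";
```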

thanks!
Subject Author Views Date
Thread Extracting text from a webpage webslicer 2245 Apr 27, 2003, 2:45 PM
Thread Re: [webslicer] Extracting text from a webpage Paul 2152 Apr 27, 2003, 4:24 PM
Post Re: [Paul] Extracting text from a webpage webslicer 2137 Apr 27, 2003, 4:40 PM