Gossamer Forum

html --> Glinks import

Hi

I have a legal site with about 30K HTML pages of court judgments, all in a fixed format. Some of the documents contain tables too. Is there any tool that can do the following?

Split and convert the HTML pages so that their content can be imported into separate fields in Glinks3. Or is this impossible to automate, so that each judgment would have to be cut, pasted, and submitted manually?

I'd appreciate some input on this.

Thanks
HyTC
==================================
Mail Me If Contacting Privately Is That Necessary.
==================================
Re: [HyperTherm] html --> Glinks import
Would something like this work (thanks to Andy from a previous post):

Example:

Section 1:
A: This is the text in section 1 a
B: This is the text in section 1 b
C: This is the text in section 1 c

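# $Line must hold the whole block of text; the /s flag lets "." match newlines
if ($Line =~ m/\QSection 1\E(.*?)\QB:\E(.*?)\QC:\E/is) {
    print $2;    # the text between "B:" and "C:"
}

(which should give you " This is the text in section 1 b", i.e. everything between the "B:" and "C:" labels, not the labels themselves)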

Could you incorporate this into Links to open and read each HTML file and look for the sections you need, or am I totally missing the point?
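A minimal sketch of that idea, assuming each judgment is a standalone HTML file whose sections are introduced by fixed labels. The helper names `slurp_file` and `extract_between` are hypothetical, and a regex approach like this only suits pages with a truly fixed layout; irregular pages would probably need a real HTML parser.

```perl
use strict;
use warnings;

# Read a whole file into one string so a pattern can span several lines.
sub slurp_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;               # undef record separator: read everything at once
    my $text = <$fh>;
    close $fh;
    return $text;
}

# Return the trimmed text between two literal markers,
# or nothing (undef in scalar context) if the markers are not found.
sub extract_between {
    my ($text, $start, $end) = @_;
    if ($text =~ m/\Q$start\E(.*?)\Q$end\E/is) {
        my $found = $1;
        $found =~ s/^\s+|\s+$//g;
        return $found;
    }
    return;
}

# Sample input matching the example above; a real run would call
# slurp_file() on each HTML file instead.
my $sample = "Section 1:\nA: This is the text in section 1 a\n"
           . "B: This is the text in section 1 b\n"
           . "C: This is the text in section 1 c\n";

print extract_between($sample, 'B:', 'C:'), "\n";
```

With the sample text this prints "This is the text in section 1 b". Looping over all 30K files would just mean calling `slurp_file` and `extract_between` once per file.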
Re: [Watts] html --> Glinks import
Hi

Thanks.
Yes, it's something like that.
There are almost 12 columns in the Links table that have to be populated ... at least that's what we are thinking of ... still in the thought stage :-)
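One way this could look for several columns at once, as a rough sketch only. The column names, the markers, and the sample page below are all made up for illustration, and the pipe-delimited output is just a stand-in: whatever import format Glinks3 actually expects would need to be checked against its documentation.

```perl
use strict;
use warnings;

# Hypothetical mapping: [column name, start marker, end marker].
# These names are illustrative, not taken from a real Links table.
my @fields = (
    [ Title => 'Title:', 'Court:'    ],
    [ Court => 'Court:', 'Date:'     ],
    [ Date  => 'Date:',  'Judgment:' ],
);

# Build one record (hash of column => value) from the text of one page.
sub parse_record {
    my ($text) = @_;
    my %record;
    for my $field (@fields) {
        my ($column, $start, $end) = @$field;
        my ($value) = $text =~ m/\Q$start\E(.*?)\Q$end\E/s;
        $value = '' unless defined $value;   # marker not found: leave column empty
        $value =~ s/<[^>]+>//g;              # crude tag stripping, simple markup only
        $value =~ s/\s+/ /g;                 # collapse whitespace, including newlines
        $value =~ s/^ | $//g;                # trim the ends
        $record{$column} = $value;
    }
    return \%record;
}

# Join the columns in a fixed order; the pipe delimiter is an arbitrary choice
# and would break if a value itself contained a pipe.
sub record_to_line {
    my ($record) = @_;
    return join '|', map { $record->{ $_->[0] } } @fields;
}

# Made-up sample page in a fixed layout.
my $page = "<p>Title: Example v. Sample</p>\n<p>Court: High Court</p>\n"
         . "<p>Date: 1 May 2001</p>\n<p>Judgment: ...</p>\n";

print record_to_line(parse_record($page)), "\n";
```

Extending `@fields` to all 12 columns would keep the per-page logic unchanged; pages with tables would likely need separate handling.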

Thanks
HyTC