Well, it still doesn't work. I've figured out where the problem is now, but I don't know how to solve it.
I'll paste a bit of HTML from which the <br> tags should be removed; the problem is the line endings.
Instead of:
Code:
$rec{'Text'} =~ s%<pre>(.*?)<br>(.*?)</pre>%<pre>$1 $2</pre>%gim;
I tried:
Code:
$rec{'Text'} =~ s%<pre>((.|\n)*?)<br>((.|\n)*?)</pre>%<pre>$1 $2</pre>%gim;
But that just erases everything between <pre> and </pre> in the following example:
Code:
<br><b>Medische reden WAO-uitkering, in percentages</b>
<br><pre>
Turken Marokkanen Nederlanders
<br>Klachten aan het bewegingsapparaat 36 35 36
<br>Psychische klachten 23 26 27
<br>Overig 41 39 37
<br></pre>
Hmm... So what I actually want the script to do is the following:
look for <pre> and </pre> and erase every <br> found between them, whatever else is in there, but leave everything else untouched!
But I don't know how to do it properly.
(Still studying 'Programming Perl'.)
Thanks for your time anyway!
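For what it's worth, here is a rough sketch of the behaviour described above, not a tested fix for the real script. It relies on two things: the /s modifier (not /m) is what lets . match across line endings, and the /e modifier runs the replacement as Perl code, so a second substitution can be applied to just the text between the tags. The helper name strip_br_in_pre and the sample data are made up for illustration, and the sketch assumes the tags appear exactly as <pre>, </pre> and <br> (no attributes, no nesting).

Code:
```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): removes every <br>
# found between <pre> and </pre> and leaves all other text untouched.
sub strip_br_in_pre {
    my ($html) = @_;
    # /s lets . match newlines, /i ignores tag case, /g handles every block,
    # /e evaluates the replacement as code.
    $html =~ s{(<pre>)(.*?)(</pre>)}{
        # Copy the captures first: the inner substitution resets $1..$3.
        my ($open, $body, $close) = ($1, $2, $3);
        $body =~ s/<br>//gi;
        $open . $body . $close;
    }gise;
    return $html;
}

# Made-up sample in the shape of the HTML quoted earlier.
my %rec = (Text => "<br><b>Title</b>\n<br><pre>\nA B C\n<br>row 1 2 3\n<br></pre>\n");
$rec{'Text'} = strip_br_in_pre($rec{'Text'});
print $rec{'Text'};
```

The <br> tags outside the <pre> block are left alone. If a space should take the place of each removed <br>, as in the earlier attempts, the inner substitution could be changed to s/<br>/ /gi.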