Hi! Does anyone have an idea how to extract the URLs (of a certain type of file) from a webpage and put them into a flat text database?
For example:
file.html contains the following:
<a href="../files/num_1.zip">1st one</a>
<a href="http://www.fff.com/files/num_2.zip">2nd one</a>
which should be processed and written to spidered.db as:
1st one|../files/num_1.zip
2nd one|http://www.fff.com/files/num_2.zip
I know this is kind of complicated. I tried, but nothing works :(
Thanks in advance!
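
For reference, here is a minimal sketch of one way this could be done, assuming Python and its standard-library html.parser module. The names LinkCollector, extract_links, to_db_lines and spider are made up for illustration, and the ".zip" suffix filter is an assumption based on the example above. It only handles plain <a href="..."> links and does not resolve relative URLs.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (link text, href) pairs for <a> tags whose href ends with a suffix."""

    def __init__(self, suffix=".zip"):
        super().__init__()
        self.suffix = suffix.lower()
        self.links = []      # finished (text, href) pairs
        self._href = None    # href of the matching <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            # Only track links that point at the wanted file type.
            if href and href.lower().endswith(self.suffix):
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html, suffix=".zip"):
    """Return a list of (text, href) pairs found in the HTML string."""
    parser = LinkCollector(suffix)
    parser.feed(html)
    parser.close()
    return parser.links


def to_db_lines(links):
    """Format pairs as 'text|href' lines for the flat text database."""
    return [f"{text}|{href}" for text, href in links]


def spider(in_path, out_path, suffix=".zip"):
    """Read an HTML file and write one 'text|href' line per matching link."""
    with open(in_path, encoding="utf-8") as src:
        links = extract_links(src.read(), suffix)
    with open(out_path, "w", encoding="utf-8") as out:
        for line in to_db_lines(links):
            out.write(line + "\n")


# Example call (file names taken from the post):
# spider("file.html", "spidered.db")
```

A real parser is used here instead of a regular expression because it copes better with attribute order, quoting and line breaks inside tags. If the link text itself can contain a "|", the separator would need escaping.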