
highcar at gmail
Nov 15, 2009, 9:29 AM
Post #3 of 3
(144 views)
Permalink
|
Jon Clements-2 wrote: > > On Nov 15, 1:08Â pm, elca <high...@gmail.com> wrote: >> hello , these day im very stress of one of some strange thing. >> >> i want to enumurate inside list of url, and every enumurated url i want >> to >> visit >> >> i was uplod incompleted script source in here => >> >> http://elca.pastebin.com/m6f911584 >> >> if anyone can help me really appreciate >> >> thanks in advance >> >> Paul >> >> -- >> View this message in >> context:http://old.nabble.com/python-win32com-problem-tp26358976p26358976.html >> Sent from the Python - python-list mailing list archive at Nabble.com. > > How much effort have you put into this? It looks like you've just > whacked together code (that isn't valid -- where'd the magical > 'buttons' variable come from), given up and cried for help. > > Besides, I would suggest you're taking completely the wrong route. > You'll find it one hell of a challenge to automate a browser as you > want, that's if it supports exposing the DOM anyway. And without being > rude, would definitely be beyond your abilities from your posts to > c.l.p. > > Download and install BeautifulSoup from > http://www.crummy.com/software/BeautifulSoup/ > - you seem to have quite a few HTML based needs in your pastebin, so > it'll come in useful for the future. > > Here's a snippet to get you started: > > from urllib2 import urlopen > from BeautifulSoup import BeautifulSoup as BS > > url = urlopen('http://news.naver.com/main/presscenter/category.nhn') > urldata = url.read() > soup = BS(urldata) > atags = soup('a', attrs={'href': lambda L: L and L.startswith('http:// > news.khan.co.kr')}) > for atag in atags: > print atag['href'] > > I'll leave it to you where you want to go from there (ie, follow the > links, or automate IE to open said pages etc...) > > I strongly suggest reading the urllib2 and BeautifulSoup docs, and > documenting the above code snippet -- you should then understand it, > should be less stressed, and have something to refer to for similar > requirements in the future. > > hth, > Jon. > -- > http://mail.python.org/mailman/listinfo/python-list > > Hello, thanks for your kind reply. your script is working very well im making scraper now. and im making with PAMIE but still slow module but i have no choice because of javascript support. before i was try to look for method with mechanize but almost failed. if mechanize can support javascript maybe my best choice will be mechanize. ok anyway..there is almost no choice so i have to go "automate IE to open said pages etc.." i want to visit every collect link with IE com interface.. for example i was collect 10 url ...i want to visit every 10 url. would you help me some more? if so much appreciate thanks -- View this message in context: http://old.nabble.com/python-win32com-problem-tp26358976p26361229.html Sent from the Python - python-list mailing list archive at Nabble.com. -- http://mail.python.org/mailman/listinfo/python-list
|