
python win32com problem

 

 



highcar at gmail

Nov 15, 2009, 5:08 AM

Post #1 of 3 (166 views)
Permalink
python win32com problem

Hello, these days I'm very stressed by a strange problem.

I want to enumerate a list of URLs and visit each enumerated URL.

I uploaded my incomplete script source here =>

http://elca.pastebin.com/m6f911584

If anyone can help, I would really appreciate it.

Thanks in advance

Paul

--
View this message in context: http://old.nabble.com/python-win32com-problem-tp26358976p26358976.html
Sent from the Python - python-list mailing list archive at Nabble.com.

--
http://mail.python.org/mailman/listinfo/python-list


joncle at googlemail

Nov 15, 2009, 7:36 AM

Post #2 of 3 (145 views)
Permalink
Re: python win32com problem [In reply to]

How much effort have you put into this? It looks like you've just
whacked together code (that isn't valid -- where'd the magical
'buttons' variable come from), given up and cried for help.

Besides, I would suggest you're taking completely the wrong route.
You'll find it one hell of a challenge to automate a browser as you
want, and that's if it supports exposing the DOM at all. And without being
rude, it would definitely be beyond your abilities, judging from your
posts to c.l.p.

Download and install BeautifulSoup from http://www.crummy.com/software/BeautifulSoup/
- you seem to have quite a few HTML-based needs in your pastebin, so
it'll come in useful in the future.

Here's a snippet to get you started:

from urllib2 import urlopen
from BeautifulSoup import BeautifulSoup as BS

# Fetch the page and parse it (Python 2 with BeautifulSoup 3)
url = urlopen('http://news.naver.com/main/presscenter/category.nhn')
urldata = url.read()
soup = BS(urldata)
# Keep only <a> tags whose href starts with the given site prefix
atags = soup('a', attrs={'href': lambda L: L and L.startswith('http://news.khan.co.kr')})
for atag in atags:
    print atag['href']

I'll leave it to you where you want to go from there (i.e., follow the
links, or automate IE to open said pages etc...)

I strongly suggest reading the urllib2 and BeautifulSoup docs, and
documenting the above code snippet -- you should then understand it,
should be less stressed, and have something to refer to for similar
requirements in the future.
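
For readers on Python 3, where urllib2 no longer exists, the same
href-prefix filtering can be sketched with only the standard library.
This is a rough stand-in for the BeautifulSoup snippet above, not a
replacement for it: the names PrefixLinkCollector, collect_links and
fetch_html are made up for illustration, and decoding the page as UTF-8
with replacement characters is an assumption about the site.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class PrefixLinkCollector(HTMLParser):
    """Collect href values of <a> tags that start with a given prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # value is None for a bare attribute such as <a href>
            if name == "href" and value and value.startswith(self.prefix):
                self.links.append(value)


def collect_links(html, prefix):
    """Return every matching href found in the HTML text, in page order."""
    parser = PrefixLinkCollector(prefix)
    parser.feed(html)
    parser.close()
    return parser.links


def fetch_html(url):
    """Download a page as text (assumed encoding: UTF-8, errors replaced)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")
```

Something like collect_links(fetch_html(page_url), 'http://news.khan.co.kr')
would then give the same kind of list that the loop above prints.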

hth,
Jon.


highcar at gmail

Nov 15, 2009, 9:29 AM

Post #3 of 3 (144 views)
Permalink
Re: python win32com problem [In reply to]

Hello,
thanks for your kind reply.
Your script is working very well.
I'm making a scraper now.
I'm building it with PAMIE, but it is still a slow module,
and I have no choice because I need JavaScript support.
Before this I tried to find a way with mechanize, but it mostly failed.
If mechanize supported JavaScript, it would probably be my best choice.
OK, anyway, there is almost no other choice, so I have to go with
"automate IE to open said pages etc.."
I want to visit every collected link with the IE COM interface.
For example, if I collected 10 URLs, I want to visit each of those 10 URLs.
Would you help me some more?
If so, I would much appreciate it. Thanks.
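
One possible shape for the "visit every collected URL with IE" step is
sketched below. It is only an illustration under stated assumptions, not
something posted in this thread: visit_all and open_internet_explorer
are hypothetical helper names, the timeout and polling values are
arbitrary, and the COM part needs Windows with pywin32 installed. It
relies on the Navigate method and the Busy and ReadyState properties of
the InternetExplorer.Application COM object.

```python
import time

READYSTATE_COMPLETE = 4  # value IE reports once a page has finished loading


def visit_all(browser, urls, poll_seconds=0.5, timeout_seconds=30.0):
    """Navigate the browser to each URL in turn, waiting for each to load.

    Returns the URLs that finished loading before the timeout expired.
    """
    loaded = []
    for url in urls:
        browser.Navigate(url)
        deadline = time.time() + timeout_seconds
        while browser.Busy or browser.ReadyState != READYSTATE_COMPLETE:
            if time.time() > deadline:
                break  # give up on this URL and move on to the next one
            time.sleep(poll_seconds)
        else:
            # The loop ended without break, so the page reported complete
            loaded.append(url)
    return loaded


def open_internet_explorer():
    """Start a visible IE instance through COM (Windows with pywin32 only)."""
    import win32com.client  # imported here so the rest works without pywin32

    ie = win32com.client.Dispatch("InternetExplorer.Application")
    ie.Visible = True
    return ie
```

With a list of collected links, the calls would be roughly
ie = open_internet_explorer(), then visit_all(ie, links), then ie.Quit()
when done. Passing the browser in as an argument keeps the loop easy to
try out with a stand-in object before pointing it at a real IE window.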
