
taskinoor.hasan at csebuet
Nov 25, 2009, 8:37 PM
Post #4 of 6
Re: How do I correctly download Wikipedia pages?
I faced a different problem. Whenever I tried to fetch any page from Wikipedia, I received a 403. Then I found that Wikipedia doesn't accept the default user-agent (something like Python-urllib/2.x). After setting my own user-agent, it worked fine. You can try this if you receive a 403.

On Thu, Nov 26, 2009 at 10:04 AM, Stephen Hansen <apt.shansen [at] gmail> wrote:

> 2009/11/25 Steven D'Aprano <steven [at] remove>
>
>> I'm trying to scrape a Wikipedia page from Python. Following instructions
>> here:
>
> Have you checked out http://meta.wikimedia.org/wiki/Pywikipediabot?
>
> It's not just via urllib, but I've scraped several MediaWiki-based sites
> with the software successfully.
>
> --S
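
For anyone hitting the same 403, here is a minimal sketch of the user-agent fix described at the top of this post. It assumes Python 3, where urllib2 became urllib.request. The USER_AGENT string, the helper names build_request and fetch_page, and the example URL are placeholders I made up for illustration, not anything from the thread.

```python
import urllib.request

# Hypothetical identifier; replace it with something that describes your script.
USER_AGENT = "MyWikiFetcher/0.1 (contact: me@example.org)"


def build_request(url, user_agent=USER_AGENT):
    """Return a Request that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_page(url, user_agent=USER_AGENT):
    """Download the page and return its body as bytes."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()


if __name__ == "__main__":
    # Example URL chosen for illustration only.
    html = fetch_page("https://en.wikipedia.org/wiki/Python_(programming_language)")
    print(len(html), "bytes received")
```

The only change from a plain urlopen(url) call is the explicit header, so the rest of a scraping script should not need to change.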