

Scrapy - importing files from local, rather than www

 

 



bwalker at itree

May 7, 2012, 12:57 AM

Scrapy - importing files from local, rather than www

Hi everyone, I'm new to Python (loving it!) and Scrapy. I have a
question I just can't seem to get my head around. I can get a simple
Scrapy spider to pick up URLs and download them fine, but the HTML
files I want to scrape are stored locally. The reason is that when I
"Save As" the pages I get everything, whereas when Scrapy downloads
them itself it seems to miss certain areas where there's JavaScript.

So, I have them sitting in a directory (C:/scrapy_test) but can't for
the life of me get Scrapy to find them. Is there anyone who's had this
problem and solved it, or can help?

Any help is much appreciated.
Kind regards,
nbw
--
http://mail.python.org/mailman/listinfo/python-list


leomcallister at gmail

May 11, 2012, 12:53 PM

Re: Scrapy - importing files from local, rather than www

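You can try running Python's built-in web server in that folder (python -m SimpleHTTPServer on Python 2, python -m http.server on Python 3) and pointing Scrapy at it. By default it listens on port 8000, so the files show up under http://localhost:8000/.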
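
If you'd rather start the server from a script than from the command line, something along these lines might work. It's only a rough sketch using the Python 3 standard library: serve_directory and local_start_urls are names I made up for illustration, and it assumes the saved pages end in .html and sit directly in the folder (no subfolders).

```python
import functools
import threading
import urllib.parse
import urllib.request
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path


def serve_directory(directory, port=0):
    """Serve `directory` over HTTP on localhost in a background thread.

    Hypothetical helper, not part of Scrapy. port=0 lets the OS pick any
    free port. Returns (server, base_url).
    """
    # The `directory` argument of SimpleHTTPRequestHandler needs Python 3.7+.
    handler = functools.partial(SimpleHTTPRequestHandler, directory=str(directory))
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base_url = "http://127.0.0.1:%d/" % server.server_address[1]
    return server, base_url


def local_start_urls(directory, base_url):
    """Build one URL per saved .html file, for use as a spider's start_urls.

    Hypothetical helper. File names are percent-encoded so that names
    containing spaces still produce valid URLs.
    """
    pages = sorted(Path(directory).glob("*.html"))
    return [base_url + urllib.parse.quote(page.name) for page in pages]
```

For the folder in the question that would be roughly server, base = serve_directory("C:/scrapy_test") followed by start_urls = local_start_urls("C:/scrapy_test", base), with server.shutdown() once the crawl is done. Going through HTTP keeps the spider identical to one that crawls a live site, which is the main attraction of this approach.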

