Reputation: 816
I am trying to get this to work with Scrapy and it is being really frustrating. I can't import the items.py file. I have tried everything, including adding from __future__ import absolute_import and destroying and recreating the project and spider with different names a couple of times.
from __future__ import absolute_import
import scrapy

from kano.items import KanoItem


class KatscrapSpider(scrapy.Spider):
    name = "katscrap"
    allowed_domains = ["kat.cr"]
    start_urls = (
        'https://kat.cr/usearch/category%3Amusic/2/?field=seeders&sorder=desc',
    )

    def parse(self, response):
        self.log("link: %s" % response.xpath(
            '//*[@id][starts-with(@id,"torrent")]/td[1]/div[1]/a[4]//@href').extract())
        item = KanoItem()  # the item must be instantiated before assigning fields
        item['torrent_url'] = response.xpath(
            '//*[@id][starts-with(@id,"torrent")]/td[1]/div[1]/a[4]//@href').extract()
        yield item
But I still get:
ImportError: No module named kano.items
This seems to be a fairly common error with Scrapy; can someone explain why it happens?
EDIT:
This is my tree structure:
├── kano
│ ├── __init__.py
│ ├── __init__.pyc
│ ├── items.py
│ ├── pipelines.py
│ ├── settings.py
│ ├── settings.pyc
│ └── spiders
│ ├── __init__.py
│ ├── __init__.pyc
│ └── kat.py
└── scrapy.cfg
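For completeness, the spider expects a KanoItem declared in kano/items.py. Since that file isn't shown, here is a minimal sketch of what it would need to contain (the torrent_url field name is an assumption, chosen to match the key assigned in parse()):

import scrapy

class KanoItem(scrapy.Item):
    # assumed field, matching the key used in the spider's parse()
    torrent_url = scrapy.Field()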
Upvotes: 0
Views: 785
Reputation: 3386
Use scrapy crawl katscrap
to run the spider instead of python kat.py
. This is happening because when you invoke the command python kat.py
, Python looks for the kano
module in your current directory (the spiders directory) instead of in the project root, which is the directory containing scrapy.cfg.
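Concretely, and assuming the tree shown in the question, that means changing into the project root before launching the crawl:

cd /path/to/project    # the directory containing scrapy.cfg and the kano/ package
scrapy crawl katscrap

Run from there, Scrapy picks up scrapy.cfg, loads the project settings, and makes the kano package importable, so from kano.items import KanoItem resolves correctly.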
Upvotes: 1