vikrocx

Reputation: 53

scrapy-linkedin for LinkedIn data extraction

I'm using scrapy-0.16 for data extraction from LinkedIn.

    from scrapy.selector import HtmlXPathSelector
    from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
    from scrapy.contrib.spiders import CrawlSpider, Rule
    from scrapy.http import Request
    from scrapy import log
    from linkedin.items import LinkedinItem, PersonProfileItem
    from os import path
    from linkedin.parser.HtmlParser import HtmlParser
    import os
    import urllib
    from bs4 import UnicodeDammit
    from linkedin.db import MongoDBClient

https://github.com/pondering/scrapy-linkedin

Running the spider produces this error:

    Traceback (most recent call last):
      File "C:\Users\TAWANE DUDEZ\Desktop\linkedin\linkedin\spiders\LinkedinSpider.py", line 6, in <module>
        from linkedin.items import LinkedinItem, PersonProfileItem
    ImportError: No module named linkedin.items

Python cannot find the linkedin.items module.

Upvotes: 4

Views: 2124

Answers (1)

Talvalin

Reputation: 7889

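My suspicion is that you're running the scrapy crawl LinkedinSpider command from the wrong directory. The import of linkedin.items only resolves when the directory that contains the linkedin package is on Python's import path. Try navigating to C:\Users\TAWANE DUDEZ\Desktop\linkedin and then running the command again.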
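If you want to confirm which directory you are in before running the crawl, here is a minimal sketch. The helper name package_visible is made up for illustration and is not part of Scrapy; it only checks whether a directory contains a linkedin folder with an __init__.py, which is what Python 2 (the version Scrapy 0.16 runs on) needs in order to treat it as a package.

```python
from __future__ import print_function

import os


def package_visible(package_name, search_dir):
    """Return True if search_dir directly contains a regular package
    named package_name (a folder with an __init__.py inside it).

    Hypothetical helper for illustration; not a Scrapy API.
    """
    pkg_dir = os.path.join(search_dir, package_name)
    return os.path.isfile(os.path.join(pkg_dir, "__init__.py"))


if __name__ == "__main__":
    cwd = os.getcwd()
    print("Current directory:", cwd)
    # Expected to be True only in the directory that holds the linkedin package.
    print("linkedin package visible here:", package_visible("linkedin", cwd))
```

If this reports False, the current directory is probably not the one the import expects, for example because you are inside linkedin\spiders instead of the project root.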

Since the crawler is now starting, you also need to have a MongoDB instance running before you start the crawl. The README of the GitHub project you are using says to type mongod to start an instance. Just to check: you do have MongoDB and pymongo installed, right?
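A quick way to see whether mongod is actually up, without involving pymongo, is to try opening a TCP connection to it. This is only a sketch: the function name mongod_is_listening is hypothetical, and it assumes the instance runs on localhost with MongoDB's default port 27017. Adjust both if the project is configured differently.

```python
from __future__ import print_function

import socket


def mongod_is_listening(host="localhost", port=27017, timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    Assumes the default MongoDB port; this only proves that a server is
    listening there, not that it is MongoDB or that it is healthy.
    """
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True


if __name__ == "__main__":
    if mongod_is_listening():
        print("Something is listening on localhost:27017")
    else:
        print("Nothing is listening on localhost:27017; start mongod first")
```

If the check fails, start mongod in a separate terminal and leave it running while the crawl is in progress.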

Upvotes: 3
