user1858027

Reputation: 1007

Is there a manual way to send items to a pipeline in Scrapy?

I am having a few problems sending my items to the pipeline because my request passes through several functions.

Is there a manual way to send item objects to a Scrapy pipeline? I ask because I don't know Scrapy's internals.

Suppose I have this function:

def parseDetails(self, response):
  item = DmozItem()
  item['test'] = "mytest"

  # Hypothetical call: this is what I would like to be able to do
  sendToPipeline(pipelineName, item)

Upvotes: 2

Views: 1448

Answers (2)

James Taylor

Reputation: 6268

If you delegate directly to the ItemPipelineManager, an exception such as DropItem raised by one of the pipelines goes unhandled in the manager:

[2018-07-21 20:00:02] CRITICAL - Unhandled error in Deferred:

[2018-07-21 20:00:02] CRITICAL -
Traceback (most recent call last):
  File "/home/vagrant/.local/share/virtualenvs/vagrant-gKDsaKU3/lib/python3.6/site-packages/twisted/internet/defer.py", line 654, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/vagrant/monitor/pipelines/filter.py", line 24, in process_item
    raise DropItem()
scrapy.exceptions.DropItem

This also might unintentionally alter the state of the pipeline and affect processing.

I think the better approach is to grab the pipeline instance you're looking for and call it directly:

try:
    # Manually call the filter; p is the item, self is the spider
    f = utils.get_pipeline_instance(self, FilterPipeline)
    f.process_item(p, self)
except DropItem:
    # The filter rejected the item; nothing else to do here
    pass

Using a helper function:

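from scrapy.exceptions import NotConfigured


def get_pipeline_instance(spider, pipeline_class):
    # The item pipeline manager keeps the enabled pipeline instances
    manager = spider.crawler.engine.scraper.itemproc
    for pipe in manager.middlewares:
        if isinstance(pipe, pipeline_class):
            return pipe
    # No enabled pipeline is an instance of pipeline_class
    raise NotConfigured('Invalid pipeline')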
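
To try the lookup without a running crawl, here is a minimal sketch that uses stand-in objects instead of Scrapy itself. The DropItem and NotConfigured classes, the two pipelines, and make_fake_spider are all hypothetical and exist only for illustration. The fake spider mimics nothing but the attribute path spider.crawler.engine.scraper.itemproc.middlewares used above.

```python
from types import SimpleNamespace


class DropItem(Exception):
    """Stand-in for scrapy.exceptions.DropItem (assumption, for illustration)."""


class NotConfigured(Exception):
    """Stand-in for scrapy.exceptions.NotConfigured (assumption)."""


class FilterPipeline:
    """Hypothetical pipeline that drops items lacking a 'test' value."""

    def process_item(self, item, spider):
        if not item.get("test"):
            raise DropItem("missing 'test' field")
        return item


class OtherPipeline:
    """Hypothetical second pipeline, present so the lookup has to search."""

    def process_item(self, item, spider):
        return item


def get_pipeline_instance(spider, pipeline_class):
    # Walks the same attribute path as the helper in the answer.
    manager = spider.crawler.engine.scraper.itemproc
    for pipe in manager.middlewares:
        if isinstance(pipe, pipeline_class):
            return pipe
    raise NotConfigured("Invalid pipeline")


def make_fake_spider(pipelines):
    # Build only the attribute chain that the helper reads.
    itemproc = SimpleNamespace(middlewares=pipelines)
    scraper = SimpleNamespace(itemproc=itemproc)
    engine = SimpleNamespace(scraper=scraper)
    crawler = SimpleNamespace(engine=engine)
    return SimpleNamespace(crawler=crawler)


def run_filter(spider, item):
    """Return the item if the filter keeps it, or None if it is dropped."""
    try:
        f = get_pipeline_instance(spider, FilterPipeline)
        return f.process_item(item, spider)
    except DropItem:
        return None


spider = make_fake_spider([OtherPipeline(), FilterPipeline()])
kept = run_filter(spider, {"test": "mytest"})
dropped = run_filter(spider, {})
```

Because only the one pipeline is called, the DropItem is caught right where it is raised and the other pipelines are never touched.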

Upvotes: 0

imwilsonxu

Reputation: 3002

Based on how scrapy/commands/parse.py does it:

def parseDetails(self, response):
  item = DmozItem()
  item['test'] = "mytest"

  # Call pipeline.
  itemproc = self.crawler.engine.scraper.itemproc
  itemproc.process_item(item, self)

  return item
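
For intuition about what calling process_item on itemproc does, here is a simplified synchronous model: the item is handed to every enabled pipeline in turn, not to a single one. FakeItemProc and both pipelines are hypothetical stand-ins. This is not Scrapy's implementation, which chains the calls through Twisted Deferreds.

```python
class AddSourcePipeline:
    """Hypothetical pipeline that tags each item."""

    def process_item(self, item, spider):
        item["source"] = "manual"
        return item


class UppercasePipeline:
    """Hypothetical pipeline that upper-cases the 'test' field."""

    def process_item(self, item, spider):
        item["test"] = item["test"].upper()
        return item


class FakeItemProc:
    """Simplified stand-in for crawler.engine.scraper.itemproc (assumption)."""

    def __init__(self, middlewares):
        self.middlewares = middlewares

    def process_item(self, item, spider):
        # Pass the item through each pipeline in order, feeding each
        # pipeline the result of the previous one.
        for pipe in self.middlewares:
            item = pipe.process_item(item, spider)
        return item


itemproc = FakeItemProc([AddSourcePipeline(), UppercasePipeline()])
result = itemproc.process_item({"test": "mytest"}, spider=None)
```

In this model an exception raised by any one pipeline escapes process_item unless the caller handles it.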

Upvotes: 2
