Python Scrapy - dynamic HTML, div and span content needed

Question

So I'm new to Scrapy and am looking to do something which is proving a little too ambitious. I'm hoping somebody out there can help guide me on how to gather and parse the info I'm after from this website.

I need to obtain the following:

label1 4810 (this is generated dynamically)
Business name
Name
Address1
Address2
Address3
Address4
Postcode
0800 111111
me@domain.com

Is this even possible using scrapy?

Many thanks in advance.


Label13345
4810

Info
Business name

Contact details
Name
Address1
Address2
Address3
Address4
Postcode

Phone:
0800 111111

Email:
me@domain.com

alex10 · Accepted Answer

Parsing a page you have already downloaded might look something like this:

import lxml.html  # the cssselect() method also needs the separate "cssselect" package

page = """ . . . """  # the page's HTML source goes here
doc = lxml.html.document_fromstring(page)

# each lookup below raises IndexError if the selector matches nothing
# get label1 4810
label = doc.cssselect('.mbg .mbg-l a')[0].text_content()
# get address
address = doc.cssselect('.u-flL .bsi-c1')[0].text_content()
# get phone
phone = doc.cssselect('.bsi-c2 .bsi-lbl')[0].text_content()
# get mail
mail = doc.cssselect('.bsi-c2 .bsi-lbl')[1].text_content()
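
Which element actually holds the phone number or email depends on the page's markup, which is not reproduced here. As a minimal sketch, assuming each `.bsi-lbl` span holds only a caption such as "Phone:" and the value sits in the next sibling span, the value can be read from that sibling. The `SAMPLE` string and the `read_labelled_values` helper are made up for illustration, and XPath is used here instead of `cssselect` so that only lxml itself is required:

```python
import lxml.html

# Hypothetical markup, invented for illustration; the real page may differ.
SAMPLE = """
<div class="bsi-c2">
  <div><span class="bsi-lbl">Phone:</span><span>0800 111111</span></div>
  <div><span class="bsi-lbl">Email:</span><span>me@domain.com</span></div>
</div>
"""

def read_labelled_values(html):
    """Map each caption (minus the trailing colon) to the text of its next sibling."""
    doc = lxml.html.document_fromstring(html)
    values = {}
    for label in doc.xpath('//span[@class="bsi-lbl"]'):
        sibling = label.getnext()  # element right after the caption, or None
        if sibling is not None:
            key = label.text_content().strip().rstrip(':')
            values[key] = sibling.text_content().strip()
    return values

contact = read_labelled_values(SAMPLE)
print(contact)
```

If the real page puts the value inside the `.bsi-lbl` element itself, the original selectors are enough and this sibling lookup is unnecessary.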

If the page has to be fetched from the network first, you can do it like this:

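import requests
import lxml.html

# requests needs the URL scheme; 'site_.com' is a placeholder
page = requests.get('http://site_.com')
page.raise_for_status()  # stop early on an HTTP error status
doc = lxml.html.document_fromstring(page.text)
phone = doc.cssselect('.bsi-c2 .bsi-lbl')[0].text_content()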
