Reputation: 442
I am using Scrapy to extract data from a website. One field I extract returns both the city and the region. I want to split the returned data on the comma and store the first part in the city field and the second part in the region field. The code I am using to extract the data:
loader.add_css('region','.seller-box__seller-address__label::text')
The output is a single column named region with, for example, this value:
Elbląg, Warmińsko-mazurskie
The desired output would be two columns: city with the value Elbląg, and region with the value Warmińsko-mazurskie.
UPDATE:
Apparently the loader can take an additional argument for a regular expression. I was able to extract the region part by passing:
loader.add_css('region','.seller-box__seller-address__label::text',re='([^,]+)$')
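This keeps only the text after the last comma, i.e. the region, and drops the city. Note that the match still includes the space that follows the comma, since [^,]+ also matches whitespace. It does not yet fill the city field.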
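For reference, here is a minimal sketch of how two patterns could pull out each half of the string, using only the standard re module so it can be checked without running a spider. The names CITY_RE, REGION_RE and first_group are made up for this illustration and are not part of Scrapy.

```python
import re

raw = "Elbląg, Warmińsko-mazurskie"

# Hypothetical patterns for this example:
# CITY_RE captures the text before the first comma, without surrounding spaces.
CITY_RE = r"^\s*([^,]+?)\s*,"
# REGION_RE captures the text after the last comma, without surrounding spaces.
REGION_RE = r",\s*([^,]+?)\s*$"


def first_group(pattern, text):
    """Return the first capture group of the first match, or None if no match."""
    match = re.search(pattern, text)
    return match.group(1) if match else None


city = first_group(CITY_RE, raw)
region = first_group(REGION_RE, raw)
print(city)
print(region)
```

In principle the same two patterns could be passed through the re argument in two separate add_css calls, one for city and one for region, but I have not tested that against this site.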
Upvotes: 0
Views: 334
Reputation: 142651
I don't know if the loader has a special method for splitting a value into two fields.
Normally I would do:
text = response.css('.seller-box__seller-address__label::text').extract_first().strip()
city, region = text.split(', ')
loader.add_value('city', city)
loader.add_value('region', region)
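The plain split above raises an error if the label is missing or has no comma. A slightly more defensive sketch, assuming a helper named split_city_region (a name invented here, not a Scrapy function), could look like this:

```python
def split_city_region(text):
    """Split 'City, Region' into (city, region).

    Returns None for any part that is missing, so a label without
    a comma (or no label at all) does not raise an exception.
    """
    if text is None:
        return None, None
    # partition splits on the first comma only and never raises.
    city, _, region = text.partition(",")
    return city.strip() or None, region.strip() or None


print(split_city_region("Elbląg, Warmińsko-mazurskie"))
```

The two returned values can then be passed to loader.add_value('city', ...) and loader.add_value('region', ...) as in the code above.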
Upvotes: 1