M. Coppee

Reputation: 147

Issue with converting string to object

I have a JSON file whose html field contains an a element with a data-estimated-earnings attribute. That attribute holds a JSON object, and I would like to extract the value of its open_eligible key.

Here is the starting JSON:

{"html":"<div class='car_model_estimation_result__container'>\n<div class='car_model_estimation_result cobalt-mb-tight'>\n<div class='car_model_estimation_result__item'>\n<span class=\"car_model_estimation_result_amount\">720€</span>\n<p class='cobalt-text-sectionHeader'>\n<span>maximum estimés par mois</span>\n<span class='cobalt-mb-unit cobalt-Icon cobalt-Icon--size16 cobalt-Icon--colorGraphiteLight'>\n<a class=\"js_popup_trigger\" href=\"#estimate_about_with_open\"><svg viewBox=\"0 0 24 24\" xmlns=\"http://www.w3.org/2000/svg\">\n  <path d=\"M11 9h2V7h-2v2zm1 11c-4.41 0-8-3.59-8-8s3.59-8 8-8 8 3.59 8 8-3.59 8-8 8zm0-18C6.477 2 2 6.477 2 12A10 10 0 1 0 12 2zm-1 15h2v-6h-2v6z\" />\n</svg>\n\n</a></span>\n</p>\n\n</div>\n<div class='owner_homepage_hero_estimation_cta__container'>\n<a class=\"owner_homepage_hero_estimation_cta--fullWidth cobalt-Button cobalt-Button--primary cobalt-Button--large js_rent_my_car js_rent_my_car_top js_estimation_result\" rel=\"nofollow\" data-tracking-params=\"{&quot;model_id&quot;:&quot;1519&quot;,&quot;brand_id&quot;:&quot;67&quot;,&quot;mileage&quot;:4,&quot;city&quot;:&quot;Anvers&quot;,&quot;release_year&quot;:2016,&quot;open_eligible&quot;:true,&quot;currency&quot;:&quot;EUR&quot;,&quot;earnings&quot;:720,&quot;earnings_period&quot;:&quot;month&quot;}\" data-click-location=\"top\" data-estimated-earnings=\"{&quot;model_id&quot;:&quot;1519&quot;,&quot;release_year&quot;:2016,&quot;mileage&quot;:4,&quot;within_eligible_area&quot;:true,&quot;open_eligible&quot;:true}\" href=\"/choose_open_or_standard?mileage=4&amp;model_id=1519&amp;open_eligible=true&amp;release_year=2016&amp;within_eligible_area=true\">Inscrire ma voiture</a>\n</div>\n</div>\n</div>\n"}

Here is my Python code for extracting what I need:

import json
from parsel import Selector

with open('C:/Users/coppe/Documents/py trials/estimated_earnings.json') as json_file:  
    earnings = json.load(json_file)
selector = Selector(earnings['html'])
eligibleObj = json.loads(json.dumps(selector.css('a::attr(data-estimated-earnings)').get()))
print(eligibleObj['open_eligible'])

The issue is that I get this error:

print(eligibleObj['open_eligible'])

TypeError: string indices must be integers

Does anyone know how to convert the string in the data-estimated-earnings attribute to an object and then extract what I need?

Upvotes: 1

Views: 70

Answers (2)

CristiFati

Reputation: 41187

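selector.css('a::attr(data-estimated-earnings)').get() returns a string that contains the JSON-serialized dictionary. Calling json.dumps on it serializes that string a second time, so json.loads only undoes the json.dumps and hands back the same string. Drop json.dumps and call json.loads directly: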

>>> import json
>>> from parsel import Selector
>>>
>>> earnings = {"html":"<div class='car_model_estimation_result__container'>\n<div class='car_model_estimation_result cobalt-mb-tight'>\n<div class='car_model_estimation_result__item'>\n<span class=\"car_model_estimation_result_amount\">720€</span>\n<p class='cobalt-text-sectionHeader'>\n<span>maximum estimés par mois</span>\n<span class='cobalt-mb-unit cobalt-Icon cobalt-Icon--size16 cobalt-Icon--colorGraphiteLight'>\n<a class=\"js_popup_trigger\" href=\"#estimate_about_with_open\"><svg viewBox=\"0 0 24 24\" xmlns=\"http://www.w3.org/2000/svg\">\n  <path d=\"M11 9h2V7h-2v2zm1 11c-4.41 0-8-3.59-8-8s3.59-8 8-8 8 3.59 8 8-3.59 8-8 8zm0-18C6.477 2 2 6.477 2 12A10 10 0 1 0 12 2zm-1 15h2v-6h-2v6z\" />\n</svg>\n\n</a></span>\n</p>\n\n</div>\n<div class='owner_homepage_hero_estimation_cta__container'>\n<a class=\"owner_homepage_hero_estimation_cta--fullWidth cobalt-Button cobalt-Button--primary cobalt-Button--large js_rent_my_car js_rent_my_car_top js_estimation_result\" rel=\"nofollow\" data-tracking-params=\"{&quot;model_id&quot;:&quot;1519&quot;,&quot;brand_id&quot;:&quot;67&quot;,&quot;mileage&quot;:4,&quot;city&quot;:&quot;Anvers&quot;,&quot;release_year&quot;:2016,&quot;open_eligible&quot;:true,&quot;currency&quot;:&quot;EUR&quot;,&quot;earnings&quot;:720,&quot;earnings_period&quot;:&quot;month&quot;}\" data-click-location=\"top\" data-estimated-earnings=\"{&quot;model_id&quot;:&quot;1519&quot;,&quot;release_year&quot;:2016,&quot;mileage&quot;:4,&quot;within_eligible_area&quot;:true,&quot;open_eligible&quot;:true}\" href=\"/choose_open_or_standard?mileage=4&amp;model_id=1519&amp;open_eligible=true&amp;release_year=2016&amp;within_eligible_area=true\">Inscrire ma voiture</a>\n</div>\n</div>\n</div>\n"}
>>>
>>> selector = Selector(earnings['html'])
>>> selector
<Selector xpath=None data='<html><body><div class="car_model_estima'>
>>>
>>> css = selector.css('a::attr(data-estimated-earnings)').get()
>>> type(css), css
(<class 'str'>, '{"model_id":"1519","release_year":2016,"mileage":4,"within_eligible_area":true,"open_eligible":true}')
>>>
>>> eligible_obj = json.loads(css)
>>> eligible_obj
{'model_id': '1519', 'release_year': 2016, 'mileage': 4, 'within_eligible_area': True, 'open_eligible': True}
>>> eligible_obj['open_eligible']
True

Translated to your code, it should be:

eligibleObj = json.loads(selector.css('a::attr(data-estimated-earnings)').get())

That said, I'd avoid doing too many operations in one line, as things can get confusing :)
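As a rough sketch of splitting the work into separate steps, something like the following could be used. The helper name parse_json_attr is hypothetical (not part of parsel or json), and raw_attr is a hard-coded stand-in for the value returned by selector.css('a::attr(data-estimated-earnings)').get(), which is None when nothing matches.

```python
import json
from typing import Any, Optional


def parse_json_attr(raw: Optional[str]) -> Any:
    """Hypothetical helper: decode a JSON-serialized attribute value."""
    # .get() on a selector returns None when nothing matched,
    # so fail with a clear message instead of a TypeError from json.loads.
    if raw is None:
        raise ValueError("attribute not found")
    return json.loads(raw)


# Stand-in for: selector.css('a::attr(data-estimated-earnings)').get()
raw_attr = (
    '{"model_id":"1519","release_year":2016,"mileage":4,'
    '"within_eligible_area":true,"open_eligible":true}'
)

eligible_obj = parse_json_attr(raw_attr)      # step 1: str -> dict
open_eligible = eligible_obj["open_eligible"]  # step 2: look up the key
print(open_eligible)
```

Keeping the raw string, the parsed dictionary, and the final value in separate variables makes it easy to inspect each one with type() when something goes wrong.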

Upvotes: 3

sashaboulouds

Reputation: 1864

Your eligibleObj is a string that looks like this:

'{"model_id":"1519","release_year":2016,"mileage":4,"within_eligible_area":true,"open_eligible":true}'

You have to parse it with json.loads before indexing it:

>>> print(json.loads(eligibleObj)['open_eligible'])
True
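To illustrate why eligibleObj ends up as a string in the question's code, here is a small standard-library-only sketch. The variable raw is a hard-coded copy of the attribute value shown above, used here as an assumption in place of the parsel call.

```python
import json

# Hard-coded copy of the attribute value (assumed stand-in for the parsel result).
raw = (
    '{"model_id":"1519","release_year":2016,"mileage":4,'
    '"within_eligible_area":true,"open_eligible":true}'
)

# json.dumps on a str produces a quoted JSON string literal;
# json.loads then just undoes that and returns the original str.
round_tripped = json.loads(json.dumps(raw))
print(type(round_tripped).__name__)  # str
print(round_tripped == raw)          # True

# Parsing the raw string directly yields the dictionary.
parsed = json.loads(raw)
print(type(parsed).__name__)         # dict
print(parsed["open_eligible"])       # True
```

Indexing a str with a string key is what raises "TypeError: string indices must be integers".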

Upvotes: 0
