Mixy

Reputation: 69

Python JSON scraping - how can I handle missing values?

I'm pretty new to coding, so I'm learning a lot as I go. This problem has me stumped, and even though I can find several similar questions on here, I can't find one that works or has a syntax I recognize.

I'm trying to scrape various user data from a JSON API, and then store those values in a MySQL database I've set up.

The code seems to run fine for the most part, but some users do not have the attributes I'm trying to scrape in the JSON, and thus I'm left with NoneType errors that I can't seem to get around.

If possible I'd like to just store "0" in the database where the JSON does not contain the attribute.

In the snippet below this works fine for users that have a job, but for users without a job the jobposition lookup raises a NoneType error and apparently breaks the loop.

response = requests.get("URL")
json_obj = json.loads(response.text)

timer = json_obj['timestamp']
jobposition = json_obj['job']['position']

query = "INSERT INTO users (timer, jobposition) VALUES (%s, %s)"
values = (timer, jobposition)

cursor = db.cursor()
cursor.execute(query, values)
db.commit()

Thanks in advance!

Upvotes: 2

Views: 1342

Answers (3)

Evgeniy_Burdin

Reputation: 703

You can declare the data schema more clearly using dataclasses:

from dataclasses import dataclass

from validated_dc import ValidatedDC


@dataclass
class Job(ValidatedDC):
    name: str
    position: int = 0


@dataclass
class Workers(ValidatedDC):
    timer: int
    job: Job


input_data = {
    'timer': 123,
    'job': {'name': 'driver'}
}

workers = Workers(**input_data)

assert workers.job.position == 0

https://github.com/EvgeniyBurdin/validated_dc
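
If installing an extra package is not an option, roughly the same idea can be sketched with only the standard library. This is an illustration under assumptions, not the validated_dc API: the `Worker` class and the `from_dict` helper are hypothetical names, the keys `timestamp` and `job` follow the question's JSON, and plain dataclasses do no type checking at runtime.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str = ""
    position: int = 0  # default used when the JSON has no position


@dataclass
class Worker:
    timer: int = 0
    job: Job = field(default_factory=Job)

    @classmethod
    def from_dict(cls, data):
        """Hypothetical helper: build a Worker, filling defaults for missing or null parts."""
        job_data = data.get('job') or {}  # missing 'job' and "job": null both become {}
        return cls(
            timer=data.get('timestamp', 0),
            job=Job(
                name=job_data.get('name', ''),
                position=job_data.get('position', 0),
            ),
        )


worker = Worker.from_dict({'timestamp': 123, 'job': None})
print(worker.job.position)
```

The values for the INSERT could then be taken from `worker.timer` and `worker.job.position`.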

Upvotes: 0

Leo Arad

Reputation: 4472

You can use the dictionary's get() method for that, as follows:

timer = json_obj.get('timestamp', 0)

Here 0 is the default value: if there is no 'timestamp' key, get() returns 0. For the job position, you can do:

jobposition = json_obj['job'].get('position', 0) if 'job' in json_obj else 0
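
One caveat, as a hedged sketch: the NoneType error in the question suggests the API may send "job": null rather than leaving the key out. In that case 'job' is in json_obj, but json_obj['job'] is None and has no get() method. The helper below (the name `get_job_position` and the sample payloads are made up for illustration) treats a missing key and a null value the same way, assuming that when 'job' is present and not null it is a JSON object.

```python
import json


def get_job_position(json_obj, default=0):
    """Return json_obj['job']['position'], or default if any part is missing or null."""
    job = json_obj.get('job') or {}  # missing key and null both become {}
    position = job.get('position')
    return default if position is None else position


# Made-up payloads covering the three cases
with_job = json.loads('{"timestamp": 1, "job": {"position": "Manager"}}')
null_job = json.loads('{"timestamp": 2, "job": null}')
no_job = json.loads('{"timestamp": 3}')

print(get_job_position(with_job))  # Manager
print(get_job_position(null_job))  # 0
print(get_job_position(no_job))    # 0
```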

Upvotes: 3

sushmanth natha

Reputation: 96

Try this:

try:
    jobposition = json_obj['job']['position']
except (KeyError, TypeError):
    # KeyError: 'job' or 'position' is missing; TypeError: 'job' is null (None)
    jobposition = 0

Upvotes: 0
