Reputation: 724
I'm trying to modify an entire column of values, but I keep getting a "list index out of range" error. This is my entire code:
# Libraries
import json, requests
import pandas as pd
from pandas.io.json import json_normalize
# Set URL
url = 'https://api-v2.themuse.com/jobs'
# For loop to extract data
for i in range(100):
    data = json.loads(requests.get(
        url=url,
        params={'page': i}
    ).text)['results']
# JSON to PANDAS
data_norm = pd.read_json(json.dumps(data))
# Modify two columns' values
data_norm.locations = data_norm.locations.apply(lambda x: [{x[0]['name']}])
data_norm.publication_date = pd.to_datetime(data_norm.publication_date)
The problem here is that when I use the function
data_norm.locations = data_norm.locations.apply(lambda x: [{x[0]['name']}])
I get the following error:
IndexError: list index out of range
Ideally, I want to change the locations column from this:
0 [{'name': 'Seattle, WA'}]
1 [{'name': 'San Francisco Bay Area'}]
2 [{'name': 'Palo Alto, CA'}]
3 [{'name': 'Reno, NV'}]
4 []
Name: locations, dtype: object
into this:
0 Seattle, WA
1 San Francisco Bay Area
2 Palo Alto, CA
3 Reno, NV
4
Name: locations, dtype: object
Upvotes: 2
Views: 4291
Reputation: 1846
data_norm.locations = data_norm.locations.apply(
    lambda x: [{x[0].get('name', '')}] if len(x) > 0 else []
)
Note that this assumes that whenever an entry contains at least one element, the first element is a dictionary. Your code failed because it tried to access the first element (index 0) of a list that was empty.
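To see the failure in isolation, here is a minimal sketch in plain Python. The two rows are made up to match the shape of the sample in the question (one populated, one empty); they are not taken from the live API.

```python
# Two hypothetical rows shaped like the question's sample data.
rows = [[{'name': 'Seattle, WA'}], []]

results = []
for row in rows:
    try:
        results.append(row[0]['name'])
    except IndexError as exc:
        # An empty list has no index 0, so row[0] raises
        # before ['name'] is ever evaluated.
        results.append(f'IndexError: {exc}')

print(results)
```

The second row is the one that triggers the error, which is why the check on `len(x)` has to come before `x[0]` is touched.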
EDIT
To remove the [{}] as per your comment:
data_norm.locations = data_norm.locations.apply(
    lambda x: x[0].get('name', '') if len(x) > 0 else ''
)
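If you want to try this without hitting the API, the following is a self-contained sketch run against a hand-built Series that mirrors the rows shown in the question. The helper name `first_location_name` and the sample data are my own illustration, and the `isinstance` check is an extra assumption on my part (it covers entries that are missing entirely, which the lambda above would not handle).

```python
import pandas as pd

# Hypothetical sample mirroring the rows shown in the question.
locations = pd.Series([
    [{'name': 'Seattle, WA'}],
    [{'name': 'San Francisco Bay Area'}],
    [{'name': 'Palo Alto, CA'}],
    [{'name': 'Reno, NV'}],
    [],
], name='locations')


def first_location_name(entry):
    """Return the 'name' of the first location, or '' if there is none."""
    # Only index into non-empty lists; anything else (empty list,
    # None, NaN) falls through to the empty-string default.
    if isinstance(entry, list) and entry:
        return entry[0].get('name', '')
    return ''


cleaned = locations.apply(first_location_name)
print(cleaned.tolist())
```

A named function is a bit easier to extend than a lambda, for example if you later decide to join all location names instead of keeping only the first.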
Upvotes: 2