Provisional.Modulation

Reputation: 724

Apply function resulted in list index out of range

I'm trying to modify an entire column of values, but I keep getting a "list index out of range" error. This is my entire code:

# Libraries
import json, requests
import pandas as pd
from pandas.io.json import json_normalize

# Set URL
url = 'https://api-v2.themuse.com/jobs'

# For loop to extract data
# (accumulate every page; rebinding data on each pass would keep only the last page)
data = []
for i in range(100):
    data += json.loads(requests.get(
        url=url,
        params={'page': i}
    ).text)['results']

# JSON to PANDAS
data_norm = pd.read_json(json.dumps(data))

# Modify two columns' values
data_norm.locations = data_norm.locations.apply(lambda x: [{x[0]['name']}])
data_norm.publication_date = pd.to_datetime(data_norm.publication_date)

The problem here is that when I use the function

data_norm.locations = data_norm.locations.apply(lambda x: [{x[0]['name']}]) 

I get the following error:

IndexError: list index out of range

Ideally, I want to change the locations column from this:

0               [{'name': 'Seattle, WA'}]
1    [{'name': 'San Francisco Bay Area'}]
2             [{'name': 'Palo Alto, CA'}]
3                  [{'name': 'Reno, NV'}]
4                                      []
Name: locations, dtype: object

into this:

0                     Seattle, WA
1          San Francisco Bay Area
2                   Palo Alto, CA
3                        Reno, NV
4                                      
Name: locations, dtype: object

Upvotes: 2

Views: 4291

Answers (1)

scomes

Reputation: 1846

data_norm.locations = data_norm.locations.apply(lambda x:
                                                [{x[0].get('name', '')}] 
                                                if len(x) > 0 else []
                                                )

Note that this assumes that whenever an entry contains at least one element, the first element is a dictionary. The issue with your code is that it tried to access the first element (index 0) of a list that was empty.
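The failure can be reproduced without pandas or the API. The following is a minimal sketch with made-up rows (the `rows` and `results` names are illustrative only), showing that indexing works for a non-empty list and raises IndexError for an empty one:

```python
# Two made-up rows shaped like the locations column: one populated, one empty.
rows = [[{'name': 'Reno, NV'}], []]

results = []
for row in rows:
    try:
        # Same access pattern as the lambda in the question.
        results.append(row[0]['name'])
    except IndexError as exc:
        # An empty list has no element at index 0.
        results.append(f'IndexError: {exc}')

print(results)
# ['Reno, NV', 'IndexError: list index out of range']
```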

EDIT

To remove the [{}] as per your comment:

data_norm.locations = data_norm.locations.apply(lambda x:
                                                x[0].get('name', '') 
                                                if len(x) > 0 else ''
                                                )
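If you want to check this without calling the API, here is a self-contained sketch that uses sample values copied from the question's output. The helper name `first_location_name` is hypothetical (it is not part of pandas); it is the same logic as the lambda above, written as a named function so it is easier to read and test:

```python
import pandas as pd

# Sample data mirroring the question's output (not fetched from the API).
locations = pd.Series(
    [
        [{'name': 'Seattle, WA'}],
        [{'name': 'San Francisco Bay Area'}],
        [],
    ],
    name='locations',
)


def first_location_name(entries):
    """Return the first entry's 'name', or '' for an empty list or a missing key."""
    if len(entries) > 0:
        return entries[0].get('name', '')
    return ''


cleaned = locations.apply(first_location_name)
print(cleaned.tolist())
# ['Seattle, WA', 'San Francisco Bay Area', '']
```

As with the lambda, this still assumes that the first element of a non-empty list is a dictionary.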

Upvotes: 2
