Extracting value for one dictionary key in Pandas based on another in the same dictionary

Question

This is from an R guy.

I have this mess in a Pandas column: data['crew'].

array(["[{'credit_id': '54d5356ec3a3683ba0000039', 'department': 'Production', 'gender': 1, 'id': 494, 'job': 'Casting', 'name': 'Terri Taylor', 'profile_path': None}, {'credit_id': '56407fa89251417055000b58', 'department': 'Sound', 'gender': 0, 'id': 6745, 'job': 'Music Editor', 'name': 'Richard Henderson', 'profile_path': None}, {'credit_id': '5789212392514135d60025fd', 'department': 'Production', 'gender': 2, 'id': 9250, 'job': 'Executive In Charge Of Production', 'name': 'Jeffrey Stott', 'profile_path': None}, {'credit_id': '57892074c3a36835fa002886', 'department': 'Costume & Make-Up', 'gender': 0, 'id': 23783, 'job': 'Makeup Artist', 'name': 'Heather Plott', 'profile_path': None}

It goes on for quite some time. Each new dict starts with a credit_id field. One cell can hold several dicts in an array.

Assume I want the names of all Casting directors, as shown in the first entry. I need to check the job entry in every dict and, if it's Casting, grab what's in the name field and store it in my data frame in data['crew'].

I tried several strategies, then backed off and went for something simple. Running the following fails with the error below, so I can't even access a simple field. How can I get this done in Pandas?

for row in data.head().iterrows():
    if row['crew'].job == 'Casting':
        print(row['crew'])

EDIT: Error Message

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
 in ()
      1 for row in data.head().iterrows():
----> 2     if row['crew'].job == 'Casting':
      3         print(row['crew'])

TypeError: tuple indices must be integers or slices, not str
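
For what it's worth, the TypeError appears to come from iterrows() yielding (index, row) tuples, so row['crew'] ends up indexing a tuple with a string. A minimal sketch of unpacking the tuple, assuming a small made-up frame (the title column and all values are hypothetical; only the crew column name comes from the question):

```python
import pandas as pd

# Hypothetical stand-in for the real data; only the 'crew' column name is from the question.
data = pd.DataFrame({
    "title": ["A", "B"],
    "crew": ["[{'job': 'Casting', 'name': 'Terri Taylor'}]", "[]"],
})

crew_values = []
for index, row in data.head().iterrows():
    # iterrows() yields (index, Series) pairs; after unpacking, row is a Series
    # and can be indexed by column name.
    crew_values.append(row["crew"])
```

Even with the unpacking fixed, each cell here is still a string rather than a list of dicts, so attribute access such as .job would not work until the string is parsed.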

EDIT: Code used to get the array of dict (strings?) in the first place.

import ast

def convert_JSON(data_as_string):
    try:
        dict_representation = ast.literal_eval(data_as_string)
        return dict_representation
    except ValueError:
        return []

data["crew"] = data["crew"].map(lambda x: sorted([d['name'] if d['job'] == 'Casting' else '' for d in convert_JSON(x)])).map(lambda x: ','.join(map(str, x)))
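
The goal described in the question (keep name only where job is Casting) can be sketched by filtering inside the comprehension instead of emitting an empty string for every non-matching dict, which would otherwise leave leading commas after the join. This is only a sketch under assumptions: parse_crew, casting_names, the sample frame, and the casting column are hypothetical names, and the sample rows are shortened versions of the data shown above.

```python
import ast
import pandas as pd

def parse_crew(cell):
    """Parse the string form of a list of dicts; return [] if it cannot be parsed."""
    try:
        parsed = ast.literal_eval(cell)
    except (ValueError, SyntaxError, TypeError):
        return []
    return parsed if isinstance(parsed, list) else []

def casting_names(cell):
    """Comma-join the sorted names of crew members whose job is 'Casting'."""
    names = [
        d["name"]
        for d in parse_crew(cell)
        if isinstance(d, dict) and d.get("job") == "Casting" and "name" in d
    ]
    return ",".join(sorted(names))

# Hypothetical sample mirroring the structure of the real column.
sample = pd.DataFrame({"crew": [
    "[{'job': 'Casting', 'name': 'Terri Taylor', 'profile_path': None}, "
    "{'job': 'Music Editor', 'name': 'Richard Henderson', 'profile_path': None}]",
    "not a list",
]})

# Write to a new column so the raw crew strings stay available.
sample["casting"] = sample["crew"].map(casting_names)
```

Writing the result to a separate column is a design choice for the sketch; assigning back to data['crew'], as the question does, overwrites the raw strings and makes it impossible to re-run the extraction for a different job.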
