Reputation: 354
I'm working on a movie genre dataset. All the genres are in a single column, but I want to split them.
Here's what the dataset looks like:
genres
----------------------------------------------
[{'id': 16, 'name': 'Animation'}, {'id': 35, 'name': 'Comedy'}, {'id': 10751, 'name': 'Family'}]
[{'id': 35, 'name': 'Comedy'}, {'id': 10749, 'name': 'Romance'}]
[{'id': 35, 'name': 'Comedy'}, {'id': 18, 'name': 'Drama'}, {'id': 10749, 'name': 'Romance'}]
[{'id': 35, 'name': 'Comedy'}]
[{'id': 28, 'name': 'Action'}, {'id': 80, 'name': 'Crime'}, {'id': 18, 'name': 'Drama'}, {'id': 53, 'name': 'Thriller'}]
What I want is only the first genre, so the new column should look like this:
genres
_____________
Animation
Comedy
Comedy
Comedy
Action
I hope this is clear enough to understand my problem.
Upvotes: 0
Views: 337
Reputation: 30940
Use Series.apply (df['genres'] is a Series, so this is the method being called).
In each cell, the first dictionary in the list is selected, and from that dictionary the name field is taken:
df['genres']=df['genres'].apply(lambda x: x[0]['name'])
print(df)
ID genres
0 0 Animation
1 1 Comedy
2 2 Comedy
3 3 Comedy
4 4 Action
or, if the cells are stored as strings rather than lists:
df['genres']=df['genres'].apply(lambda x: eval(x)[0]['name'])
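If some cells have a different shape or fail to parse, try this fallback:
def decode_str_dict(x):
    try:
        # Cell is a string holding a list of dicts: take the first name
        out = eval(x)[0]['name']
    except Exception:
        try:
            # Cell is a string holding a single dict
            out = eval(x)['name']
        except Exception:
            try:
                # Cell is some other evaluable string
                out = eval(x)
            except Exception:
                # Nothing worked: keep the original value
                out = x
    return out

df['genres'].apply(decode_str_dict)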
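Since eval runs whatever expression the string contains, a more cautious variant of the same fallback idea can use ast.literal_eval, which only parses Python literals. This is a minimal sketch, assuming each cell is either a string like the ones in the question or an already-parsed list. The helper name first_genre_name is hypothetical, and unlike the function above it returns None (rather than the raw value) when no name can be extracted:

```python
from ast import literal_eval


def first_genre_name(cell):
    """Return the name of the first genre in a cell, or None if there is none."""
    # Parse string cells; literal_eval only accepts Python literals.
    if isinstance(cell, str):
        try:
            cell = literal_eval(cell)
        except (ValueError, SyntaxError):
            return None
    # Expect a non-empty list of dicts such as {'id': 35, 'name': 'Comedy'}.
    if isinstance(cell, list) and cell and isinstance(cell[0], dict):
        return cell[0].get('name')
    return None


# Sample cell in the same shape as the question's data.
raw = "[{'id': 16, 'name': 'Animation'}, {'id': 35, 'name': 'Comedy'}]"
print(first_genre_name(raw))
print(first_genre_name([{'id': 35, 'name': 'Comedy'}]))
print(first_genre_name('[]'))
```

It can be applied the same way as the function above, i.e. df['genres'].apply(first_genre_name).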
Upvotes: 5
Reputation: 1813
This works if the values are stored as strings.
from ast import literal_eval
df['genres'] = df.genres.map(lambda x: literal_eval(x)[0]['name'])
Result:
Out[294]:
ID genres
1 0 Animation
2 1 Comedy
3 2 Comedy
4 3 Comedy
5 4 Action
Upvotes: 3
Reputation: 2451
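# Keep only the genre names in each row's list
df['genres'] = df['genres'].map(lambda x: [i['name'] for i in x])
# Take the first name from each row; df['genres'][0] would select the first row instead
df['first_genre'] = df['genres'].str[0]
# Assumes the frame also has a 'name' column
df = df[['name', 'first_genre']]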
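The intermediate list of names is not strictly needed to get the first genre. Here is a sketch, assuming the cells are already lists of dicts (not strings) and using a small hypothetical frame built from rows like those in the question. The .str accessor indexes into each list and leaves a missing value where the list is empty:

```python
import pandas as pd

# Hypothetical sample frame shaped like the question's data.
df = pd.DataFrame({
    'genres': [
        [{'id': 16, 'name': 'Animation'}, {'id': 35, 'name': 'Comedy'}],
        [{'id': 35, 'name': 'Comedy'}, {'id': 10749, 'name': 'Romance'}],
        [],  # empty list added for illustration; not in the question's data
    ]
})

# .str[0] takes the first dict of each list; .str.get('name') reads its 'name' key.
df['first_genre'] = df['genres'].str[0].str.get('name')
print(df['first_genre'].tolist())
```

Compared with x[0]['name'] inside a lambda, this does not raise on empty lists, which may or may not be what you want.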
Upvotes: 3