Reputation: 173
I have a dataframe that looks like this:
    name  val
0    cat  ['Furry: yes', 'Fast: yes', 'Slimy: no', 'Living: yes']
1    dog  ['Furry: yes', 'Fast: yes', 'Slimy: no', 'Living: yes']
2  snail  ['Furry: no', 'Fast: no', 'Slimy: yes', 'Living: yes']
3  paper  ['Furry: no', 'Fast: no', 'Slimy: no', 'Living: no']
For each item in the list in the val column, I want to split the item on the ':' delimiter. Then I want item[0] to be the column name and item[1] to be the value for that column, like so:
    name Furry Fast Slimy Living
0    cat   yes  yes    no    yes
1    dog   yes  yes    no    yes
2  snail    no   no   yes    yes
3  paper    no   no    no     no
I've tried calling apply(pd.Series) on the val column, but that still leaves me with many columns that I'd have to either split manually or loop over and split one by one. I'd prefer to split the items from the start and create the column names directly. Any idea how I can achieve this?
Upvotes: 3
Views: 2229
Reputation: 51155
Use apply with split to create a dictionary:
df.val = df.val.apply(lambda x: dict([i.split(': ') for i in x]))
Then use apply with pd.Series to create the columns:
df.join(df.val.apply(pd.Series)).drop(columns='val')
    name Furry Fast Slimy Living
0    cat   yes  yes    no    yes
1    dog   yes  yes    no    yes
2  snail    no   no   yes    yes
3  paper    no   no    no     no
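For anyone who wants to run the two steps end to end, here is a minimal sketch. It assumes val holds real Python lists of strings (not their string representation), uses a shortened two-row sample, and passes maxsplit=1 to split as an extra precaution in case a value itself contains ': '. The names as_dicts and result are only illustrative.

```python
import pandas as pd

# Shortened sample mirroring the question; 'val' is assumed to hold lists of strings.
df = pd.DataFrame({
    'name': ['cat', 'snail'],
    'val': [
        ['Furry: yes', 'Fast: yes', 'Slimy: no', 'Living: yes'],
        ['Furry: no', 'Fast: no', 'Slimy: yes', 'Living: yes'],
    ],
})

# Step 1: turn each list into a dict. maxsplit=1 splits only on the first
# ': ', so a value that contains ': ' still yields a (key, value) pair.
as_dicts = df['val'].apply(lambda items: dict(i.split(': ', 1) for i in items))

# Step 2: expand the dicts into columns and attach them to the original frame.
result = df.join(as_dicts.apply(pd.Series)).drop(columns='val')
print(result)
```

Unlike the snippet above, this version leaves df.val untouched and keeps the dictionaries in a separate Series, which makes it easier to rerun while experimenting.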
Upvotes: 2
Reputation: 164683
pd.DataFrame accepts a list of dictionaries directly. Therefore, you can construct a dataframe via a list comprehension and then join.
L = [dict(i.split(': ') for i in x) for x in df['val']]
df = df[['name']].join(pd.DataFrame(L))
print(df)
    name Fast Furry Living Slimy
0    cat  yes   yes    yes    no
1    dog  yes   yes    yes    no
2  snail   no    no    yes   yes
3  paper   no    no     no    no
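One caveat worth sketching: join aligns on the index, and pd.DataFrame(L) gets a fresh 0..n-1 index. If df has been filtered or otherwise carries a different index, the rows may not line up. A possible variant, assuming a hypothetical frame indexed 2 and 3, passes df.index to the constructor. The column order in the printed result may also differ from the output above depending on the Python and pandas versions in use.

```python
import pandas as pd

# Hypothetical frame whose index is not 0..n-1, e.g. after filtering rows.
df = pd.DataFrame(
    {
        'name': ['snail', 'paper'],
        'val': [
            ['Furry: no', 'Fast: no', 'Slimy: yes', 'Living: yes'],
            ['Furry: no', 'Fast: no', 'Slimy: no', 'Living: no'],
        ],
    },
    index=[2, 3],
)

# One dict per row, built from the 'key: value' strings.
records = [dict(item.split(': ', 1) for item in items) for items in df['val']]

# Reuse df's index so join lines the new columns up with the right rows.
expanded = pd.DataFrame(records, index=df.index)
result = df[['name']].join(expanded)
print(result)
```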
Upvotes: 4