Reputation: 107
I have a column in a dataframe as follows:
Data
[special_request=nowhiterice, waiter=Janice]
[allegic=no, waiter=Janice, tip=20]
[allergic=no, tip=20]
[special_request=nogreens]
How can I split this so that each key becomes its own column, like this?
special_request allegic waiter tip
Upvotes: 1
Views: 149
Reputation: 12701
You can build a dictionary from each row by splitting its elements on '=', then construct your DataFrame from the list of dictionaries (s being your column here):
import pandas as pd
s = pd.Series([['special_request=nowhiterice', 'waiter=Janice'],
               ['allegic=no', 'waiter=Janice', 'tip=20'],
               ['allergic=no', 'tip=20'],
               ['special_request=nogreens']])
# Each 'key=value' element splits into a (key, value) pair; each row becomes one dict.
df = pd.DataFrame([dict(e.split('=') for e in row) for row in s])
print(df)
Output:
  special_request  waiter allegic  tip allergic
0     nowhiterice  Janice     NaN  NaN      NaN
1             NaN  Janice      no   20      NaN
2             NaN     NaN     NaN   20       no
3        nogreens     NaN     NaN  NaN      NaN
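Two things stand out in the output above: the typo allegic in the sample data produces a separate column from allergic, and every parsed value, including tip, is a string. A minimal sketch of one possible cleanup follows, assuming the two spellings are meant to be the same field and that tip should be numeric (neither is confirmed by the question):

```python
import pandas as pd

s = pd.Series([['special_request=nowhiterice', 'waiter=Janice'],
               ['allegic=no', 'waiter=Janice', 'tip=20'],
               ['allergic=no', 'tip=20'],
               ['special_request=nogreens']])
df = pd.DataFrame([dict(e.split('=') for e in row) for row in s])

# Assumption: 'allegic' is a misspelling of 'allergic', so fill the gaps
# in one column from the other and drop the misspelled column.
df['allergic'] = df['allergic'].fillna(df['allegic'])
df = df.drop(columns='allegic')

# Assumption: 'tip' should be a number. errors='coerce' turns values that
# cannot be parsed into NaN instead of raising.
df['tip'] = pd.to_numeric(df['tip'], errors='coerce')

print(df)
print(df.dtypes)
```

If the two spellings are really distinct fields in your data, skip the fillna step and keep both columns.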
Edit: if the column values are actual strings, you should first split each string (also stripping [, ] and whitespace):
s = pd.Series(['[special_request=nowhiterice, waiter=Janice]',
               '[allegic=no, waiter=Janice, tip=20]',
               '[allergic=no, tip=20]',
               '[special_request=nogreens]'])
df = pd.DataFrame([dict(map(str.strip, e.split('=')) for e in row.strip('[]').split(',')) for row in s])
print(df)
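The one-liner above raises an error if an entry has no '=' (for example an empty "[]" row) or if a value itself contains '='. Here is a rough sketch of a more defensive version of the same split-and-strip idea. The helper name parse_row, the extra sample rows, and the choice to silently skip malformed entries are all my own assumptions, not something from the question:

```python
import pandas as pd


def parse_row(row: str) -> dict:
    """Turn a string like '[k1=v1, k2=v2]' into {'k1': 'v1', 'k2': 'v2'}."""
    pairs = {}
    for item in row.strip().strip('[]').split(','):
        if '=' not in item:
            continue  # assumed policy: skip empty or malformed entries
        # maxsplit=1 keeps any further '=' characters inside the value
        key, value = item.split('=', 1)
        pairs[key.strip()] = value.strip()
    return pairs


s = pd.Series(['[special_request=nowhiterice, waiter=Janice]',
               '[allegic=no, waiter=Janice, tip=20]',
               '[note=a=b, tip=5]',   # hypothetical row: value contains '='
               '[]'])                 # hypothetical row: no entries at all
df = pd.DataFrame([parse_row(row) for row in s])
print(df)
```

The empty row ends up as a row of NaN rather than breaking the whole conversion.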
Upvotes: 1
Reputation: 29982
If the column values are strings, you can parse each one into a dict and then use pd.json_normalize to expand the dicts into columns.
df_ = pd.json_normalize(df['Data'].apply(lambda x: dict([map(str.strip, i.split('=')) for i in x.strip("[]").split(',')])))
print(df_)
  special_request  waiter allegic  tip allergic
0     nowhiterice  Janice     NaN  NaN      NaN
1             NaN  Janice      no   20      NaN
2             NaN     NaN     NaN   20       no
3        nogreens     NaN     NaN  NaN      NaN
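If the original DataFrame has other columns you want to keep, the parsed columns can be joined back onto it. This is only a sketch: the order_id column is made up for illustration, and I pass a plain list to pd.json_normalize and copy the index across so the join lines up even when the original index is not 0..n-1.

```python
import pandas as pd

# 'order_id' is a hypothetical extra column, added only for illustration.
df = pd.DataFrame({
    'order_id': [101, 102, 103, 104],
    'Data': ['[special_request=nowhiterice, waiter=Janice]',
             '[allegic=no, waiter=Janice, tip=20]',
             '[allergic=no, tip=20]',
             '[special_request=nogreens]'],
})

# Parse each string into a dict of stripped key/value pairs.
dicts = df['Data'].apply(
    lambda x: dict(map(str.strip, i.split('=')) for i in x.strip('[]').split(','))
)

parsed = pd.json_normalize(dicts.tolist())
parsed.index = df.index  # align row labels with the original frame

# Replace the raw 'Data' column with the expanded columns.
result = df.drop(columns='Data').join(parsed)
print(result)
```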
Upvotes: 0