Reputation: 75
I have the DataFrame below with 2 columns, and one of the columns is a list of dictionary inside an list of dictionary. I would like to split/separate this column in several columns.
import pandas as pd
USERNAME = ['root', 'user1', 'user2','user3']
test_data = '[{"conjunction":"and","expressions":[{"_actualOperator":"contains","_actualValue":"LBD","attr":"displayName","op":"contains","value":"LBD"}],"name":"test_Event","editable":true}]'
test_data2 = '[{"conjunction":"and","expressions":[{"_actualOperator":"not_contains","_actualValue":"AAA","attr":"Event","op":"contains","value":"LBD"}],"name":"test_Event","editable":true}]'
test_data3 = '[{"conjunction":"and","expressions":[{"_actualOperator":"exclude","_actualValue":"BBB","attr":"Event","op":"contains","value":"LBD"}],"name":"test_Event","editable":true}]'
test_data4 = '[{"conjunction":"and","expressions":[{"_actualOperator":"adding","_actualValue":"CASA","attr":"displayName","op":"contains","value":"LBD"}],"name":"test_Event","editable":true}]'
VALUE_STRING = [test_data, test_data2, test_data3, test_data4]
data = {'USERNAME': ['root', 'user1', 'user2','user3'], 'VALUE_STRING' : VALUE_STRING}
df = pd.DataFrame(data)
df
USERNAME VALUE_STRING
root [{"conjunction":"and","expressions":[{"_actual...
user1 [{"conjunction":"and","expressions":[{"_actual...
user2 [{"conjunction":"and","expressions":[{"_actual...
user2 [{"conjunction":"and","expressions":[{"_actual...
And I expected a result like this:
df_expected = pd.DataFrame({'USERNAME': ['root', 'user1', 'user2','user3'],
'_actualOperator':['contains','not_contains','exclude','adding'],
'_actualValue':['LBD','AAA','BBB','CASA'],
'attr':['displayName','Event','Event','displayName']})
df_expected
USERNAME _actualOperator _actualValue attr
root contains LBD displayName
user1 not_contains AAA Event
user2 exclude BBB Event
user3 adding CASA displayName
Upvotes: 0
Views: 81
Reputation: 71689
It seems like the column VALUE_STRING
contains json data, in that case we can parse the json data using loads
method of json
module, then extract the dictionaries associated with the key expressions
from each row, create a new dataframe from these dictionaries and join back with USERNAME
column
import json
s = df['VALUE_STRING'].map(json.loads)\
.str[0].str['expressions'].str[0]
exp = pd.DataFrame([*s], index=s.index)
df_out = df[['USERNAME']].join(exp).drop(['op', 'value'], axis=1)
Alternative approach with pandas json_normalize
method
s = df['VALUE_STRING'].map(json.loads).str[0]
exp = pd.json_normalize(s, 'expressions')
df_out = df[['USERNAME']].join(exp).drop(['op', 'value'], axis=1)
print(df_out)
USERNAME _actualOperator _actualValue attr
0 root contains LBD displayName
1 user1 not_contains AAA Event
2 user2 exclude BBB Event
3 user3 adding CASA displayName
Upvotes: 1