Reputation: 69
I have a dataframe named layout where it has a column named layout(yes same name) with nested data like this.
[{'floors': [{'floor_id': 1024,
'floor_name': 'Apartment',
'rooms': [{'devices': [1017,
1021,
1032,
1038,
1041,
1048,
1052,
1060,
1016,
1062,
1069,
1048],
'room_id': 1025,
'room_name': 'Living room'},
{'devices': [1018,
1022,
1037,
1047,
1067,
1070,
1073,
1079,
1080,
1081,
1041,
1045,
1047],
'room_id': 1026,
'room_name': 'Kitchen'},
{'devices': [1034,
1036,
1046,
1050,
1049,
1055,
1071,
1074,
1076,
1044,
1046],
'room_id': 1027,
'room_name': 'Bathroom'},
{'devices': [1023, 1033, 1045, 1054, 1075, 1042],
'room_id': 1028,
'room_name': 'Bedroom 1'},
{'devices': [1020, 1051, 1053, 1066, 1068, 1077, 1078, 1043],
'room_id': 1029,
'room_name': 'Bedroom 2'}]},
{'floor_id': 1021,
'floor_name': 'First',
'rooms': [{'room_id': 1023, 'room_name': 'Kitchen', 'devices': [1019]},
{'room_id': 1024, 'room_name': 'Beedroom', 'devices': [1030]},
{'room_id': 1025, 'room_name': 'Store', 'devices': [1035]}]},
{'floor_id': 1026,
'floor_name': 'Ground',
'rooms': [{'room_id': 1027, 'room_name': 'Hall', 'devices': [1032, 1033]},
{'room_id': 1028, 'room_name': 'Gallery', 'devices': [1022, 1034]}]}]}]
Is there a way to put this in a single dataframe? I want something like floor_id and floor_name value should be common for all rows. The problem i'm getting is dealing with the key 'rooms'. I want the dataframe to directly have a column named devices( which is inside rooms ) and contain numbers in devices as rows. Then the last columns will be room_id and room_name
floor_id | floor_name | devices | room_id | room_name |
---|---|---|---|---|
1024 | Apartment | 1017 | 1025 | Living room |
1024 | Apartment | 1021 | 1025 | Living room |
1024 | Apartment | 1032 | 1025 | Living room |
....... | ||||
1024 | Apartment | 1068 | 1029 | Bedroom 2 |
1024 | Apartment | 1077 | 1029 | Bedroom 2 |
1024 | Apartment | 1078 | 1029 | Bedroom 2 |
1024 | Apartment | 1043 | 1029 | Bedroom 2 |
i tried using
pd.DataFrame(layout.iloc[0]['layout'])
but i get
floors 0 [{'floor_id': 1024, 'floor_name': 'Apartment',...
Upvotes: 0
Views: 31
Reputation: 148880
You should explode the nested dictionary into a flat one with a comprehension:
data = layout.iloc[0]['layout']
df = pd.DataFrame([{'floor_id': floor['floor_id'], 'floor_name': floor['floor_name'],
'devices': device, 'room_id': room['room_id'],
'room_name': room['room_name']} for floor in data[0]['floors']
for room in floor['rooms'] for device in room['devices']])
It gives as expected:
floor_id floor_name devices room_id room_name
0 1024 Apartment 1017 1025 Living room
1 1024 Apartment 1021 1025 Living room
2 1024 Apartment 1032 1025 Living room
3 1024 Apartment 1038 1025 Living room
4 1024 Apartment 1041 1025 Living room
5 1024 Apartment 1048 1025 Living room
6 1024 Apartment 1052 1025 Living room
7 1024 Apartment 1060 1025 Living room
8 1024 Apartment 1016 1025 Living room
9 1024 Apartment 1062 1025 Living room
10 1024 Apartment 1069 1025 Living room
11 1024 Apartment 1048 1025 Living room
12 1024 Apartment 1018 1026 Kitchen
13 1024 Apartment 1022 1026 Kitchen
...
Upvotes: 1