Reputation: 35
I have to create a nested dictionary from an Excel workbook using the openpyxl library. The Excel file looks something like this:
| score | lat | lon | loc.country| loc.city | loc.street | loc.st_nr | ...
| ---------- | ----------- | ------------ | ---------- | ----------- | ------------ | ------------ | ...
| 2 | 51 | 19 | Poland | Warsaw | Cicha | 1 | ...
| 4 | 52 | 18 | Poland | Cracow | Dluga | 2 | ...
| ... | ... | ... | ... | ... | ... | ... | ...
And I want to achieve something like this:
dict = {
    "score": 2,
    "lat": 51,
    "lon": 19,
    "loc": {
        "country": "Poland",
        "city": "Warsaw",
        "street": "Cicha",
        "st_nr": 1
    }
}
What I've done so far is get the list of keys from the header and the list of values from each row, then zip them together:
...
wb = load_workbook(file_obj)
worksheet = wb.active
rows = worksheet.iter_rows(values_only=True)
header = next(rows)
for row in rows:
    values = row
    order = dict(zip(header, row))
...
But it doesn't make a nested dict. What I get is:
dict = {
    "score": 2,
    "lat": 51,
    "lon": 19,
    "loc.country": "Poland",
    "loc.city": "Warsaw",
    "loc.street": "Cicha",
    "loc.st_nr": 1
}
How can I modify it to get the expected result?
Upvotes: 0
Views: 250
Reputation: 2894
You can't do it with dict() alone, as that function just builds a flat dictionary from your keys and values. If you want to step down a level for variable names with a dot in them, you'll have to use a custom function.
The function below will nest all variables with a dot in them if you pass a list of variable names and a list of values.
def nest_dict(keys, values):
    d = {}
    for key, value in zip(keys, values):
        if '.' in key:
            # Split on the first dot only: 'loc.city' -> 'loc', 'city'
            l1, l2 = key.split('.', 1)
            # Create the inner dict on first use, then store the value
            d.setdefault(l1, {})[l2] = value
        else:
            d[key] = value
    return d
In the case of your data, you would pass the header and row like this:
header = ["score","lat","lon","loc.country","loc.city","loc.street","loc.st_nr"]
row = [2,51,19,"Poland","Warsaw","Cicha",1]
print(nest_dict(header,row))
Which returns the dictionary:
{'score': 2,
 'lat': 51,
 'lon': 19,
 'loc': {'country': 'Poland',
         'city': 'Warsaw',
         'street': 'Cicha',
         'st_nr': 1}
}
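To tie this back to the loop in the question, here is a minimal sketch of applying the same one-level nesting to every row. It assumes the rows arrive as tuples with the header first, which is what `worksheet.iter_rows(values_only=True)` yields in openpyxl; the names `nest_row`, `rows_to_dicts` and `sample_rows` are made up for illustration, and `sample_rows` stands in for the worksheet so the snippet runs without an Excel file.

```python
def nest_row(header, row):
    """One-level nesting of dotted keys (hypothetical helper, same idea as nest_dict)."""
    nested = {}
    for key, value in zip(header, row):
        if "." in key:
            outer, inner = key.split(".", 1)
            nested.setdefault(outer, {})[inner] = value
        else:
            nested[key] = value
    return nested


def rows_to_dicts(rows):
    """Treat the first row as the header and nest every following row."""
    rows = iter(rows)
    header = next(rows)
    return [nest_row(header, row) for row in rows]


# Stand-in for worksheet.iter_rows(values_only=True); assumption for illustration
sample_rows = [
    ("score", "lat", "lon", "loc.country", "loc.city", "loc.street", "loc.st_nr"),
    (2, 51, 19, "Poland", "Warsaw", "Cicha", 1),
    (4, 52, 18, "Poland", "Cracow", "Dluga", 2),
]

orders = rows_to_dicts(sample_rows)
print(orders[1]["loc"]["city"])  # Cracow
```

With a real workbook, passing `worksheet.iter_rows(values_only=True)` in place of `sample_rows` should work the same way, provided every header cell is a string.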
Note that this only works for one level. If you have variable names with multiple dots that need to nest deeper, you'd have to adjust the function.
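One possible adjustment for deeper nesting is sketched below. It walks down one dictionary per dot-separated part and creates intermediate dictionaries as needed. The function name `nest_dict_deep` and the `loc.geo.*` column names are made up for illustration, and the sketch assumes no column name is both a value and a prefix of another (for example both `loc` and `loc.city`), which would raise an error here.

```python
def nest_dict_deep(keys, values, sep="."):
    """Nest dotted keys to arbitrary depth (hypothetical variant of nest_dict)."""
    nested = {}
    for key, value in zip(keys, values):
        # 'loc.geo.lat' -> parents ['loc', 'geo'], leaf 'lat'
        *parents, leaf = key.split(sep)
        node = nested
        for part in parents:
            # Step into the child dict, creating it on first use
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Hypothetical header with two levels of nesting
header = ["score", "loc.city", "loc.geo.lat", "loc.geo.lon"]
row = [2, "Warsaw", 51, 19]

result = nest_dict_deep(header, row)
print(result)
# {'score': 2, 'loc': {'city': 'Warsaw', 'geo': {'lat': 51, 'lon': 19}}}
```

For headers with at most one dot per name this should give the same result as the one-level function.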
Upvotes: 1
Reputation: 632
Not necessarily the best option, but I couldn't be bothered to install the package xD
Input:
flat_dict = {
    "score": 2,
    "lat": 51,
    "lon": 19,
    "loc.country": "Poland",
    "loc.city": "Warsaw",
    "loc.street": "Cicha",
    "loc.st_nr": 1
}
Function:
mydict = {}
for key, value in flat_dict.items():
    parts = key.split('.')
    if len(parts) == 2:
        # Create the inner dict the first time this prefix is seen
        if parts[0] not in mydict:
            mydict[parts[0]] = {}
        mydict[parts[0]][parts[1]] = value
    else:
        # Keys without exactly one dot are copied unchanged
        mydict[key] = value
print(mydict)
Output:
{'score': 2, 'lat': 51, 'lon': 19, 'loc': {'country': 'Poland', 'city': 'Warsaw', 'street': 'Cicha', 'st_nr': 1}}
Upvotes: 2