Making a nested dictionary from an Excel workbook

Question

I have to create a nested dictionary from an Excel workbook using the openpyxl library. The file looks something like this:

|   score    |     lat     |      lon     | loc.country|  loc.city   |  loc.street  |   loc.st_nr  | ...
| ---------- | ----------- | ------------ | ---------- | ----------- | ------------ | ------------ | ...
|     2      |     51      |      19      |  Poland    |   Warsaw    |    Cicha     |      1       | ...
|     4      |     52      |      18      |  Poland    |   Cracow    |    Dluga     |      2       | ...
|    ...     |     ...     |     ...      |    ...     |     ...     |     ...      |     ...      | ...

I want to get something like this:

dict = {
    "score": 2,
    "lat": 51,
    "lon": 19,
    "loc": {
        "country": "Poland",
        "city": "Warsaw",
        "street": "Cicha",
        "st_nr": 1
    }
}

What I've done so far is take the list of keys from the header and the list of values from each row, and zip them together:

...
wb = load_workbook(file_obj)
worksheet = wb.active
rows = worksheet.iter_rows(values_only=True)
header = next(rows)
for row in rows:
    # Pair each header cell with the value in the same column
    order = dict(zip(header, row))
...

But this doesn't produce a nested dict. What I get is:

    dict = {
        "score": 2,
        "lat": 51,
        "lon": 19,
        "loc.country": "Poland",
        "loc.city": "Warsaw",
        "loc.street": "Cicha",
        "loc.st_nr": 1
    }

How can I modify it to get the expected result?

Martin Wettstein · Accepted Answer

You can't do this with dict() alone, because it only builds a flat dictionary from your keys and values. To step down a level for variable names that contain a dot, you need a custom function.

The function below takes a list of variable names and a list of values, and nests every variable whose name contains a dot.

def nest_dict(keys, values):
    d = {}
    for key, value in zip(keys, values):
        if '.' in key:
            # Split on the first dot only; any further dots stay in the inner key
            outer, inner = key.split('.', 1)
            # Create the inner dict on first use, then store the value in it
            d.setdefault(outer, {})[inner] = value
        else:
            d[key] = value
    return d

With your data, you would pass the header and row like this:

header = ["score","lat","lon","loc.country","loc.city","loc.street","loc.st_nr"]
row = [2,51,19,"Poland","Warsaw","Cicha",1]

print(nest_dict(header,row))

This prints the following dictionary (formatted here for readability):

{'score': 2, 
 'lat': 51, 
 'lon': 19, 
 'loc': {'country': 'Poland', 
         'city': 'Warsaw', 
         'street': 'Cicha', 
         'st_nr': 1}
}
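To tie this back to the loop in the question, here is a minimal sketch that treats the first row as the header and builds one nested dict per data row. The helper name rows_to_dicts and the sample_rows list are made up for illustration; sample_rows stands in for worksheet.iter_rows(values_only=True), which yields one tuple of cell values per row.

```python
def nest_dict(keys, values):
    # Same one-level nesting as above, repeated so this sketch runs on its own.
    d = {}
    for key, value in zip(keys, values):
        if "." in key:
            outer, inner = key.split(".", 1)
            d.setdefault(outer, {})[inner] = value
        else:
            d[key] = value
    return d


def rows_to_dicts(rows):
    """Treat the first row as the header and yield one nested dict per data row."""
    rows = iter(rows)
    header = next(rows)
    for row in rows:
        yield nest_dict(header, row)


# Hypothetical stand-in for worksheet.iter_rows(values_only=True).
sample_rows = [
    ("score", "lat", "lon", "loc.country", "loc.city", "loc.street", "loc.st_nr"),
    (2, 51, 19, "Poland", "Warsaw", "Cicha", 1),
    (4, 52, 18, "Poland", "Cracow", "Dluga", 2),
]

orders = list(rows_to_dicts(sample_rows))
print(orders[1]["loc"]["city"])
```

With a real workbook you would pass worksheet.iter_rows(values_only=True) instead of sample_rows; nothing else in the sketch depends on openpyxl.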

Note that nest_dict only handles one level of nesting. If your variable names contain more than one dot and each dot should go one level deeper, you will have to adjust the function.
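One possible adjustment, as a rough sketch rather than a drop-in replacement: split each name on every dot and walk down one dict per part. The name nest_dict_deep, the sep parameter and the "loc.geo.*" columns are invented for this example. It assumes no name is used both as a plain value and as a parent (for example "loc" next to "loc.city"); in that case it would raise a TypeError.

```python
def nest_dict_deep(keys, values, sep="."):
    """Build a nested dict, descending one level per separator in each key."""
    result = {}
    for key, value in zip(keys, values):
        # Everything before the last separator names a parent dict.
        *parents, leaf = key.split(sep)
        node = result
        for part in parents:
            # Create the intermediate dict on first use, then step into it.
            node = node.setdefault(part, {})
        node[leaf] = value
    return result


# Hypothetical header with two levels of nesting under "loc".
header = ["score", "loc.city", "loc.geo.lat", "loc.geo.lon"]
row = [2, "Warsaw", 51, 19]

nested = nest_dict_deep(header, row)
print(nested)
# {'score': 2, 'loc': {'city': 'Warsaw', 'geo': {'lat': 51, 'lon': 19}}}
```

For names with at most one dot this gives the same result as nest_dict, so it can be used in the same place.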
