Mireczko

Reputation: 35

Making a nested dictionary from an Excel workbook

I have to create a nested dictionary from an Excel workbook. I am using the openpyxl library. The Excel file looks something like this:

|   score    |     lat     |      lon     | loc.country|  loc.city   |  loc.street  |   loc.st_nr  | ...
| ---------- | ----------- | ------------ | ---------- | ----------- | ------------ | ------------ | ...
|     2      |     51      |      19      |  Poland    |   Warsaw    |    Cicha     |      1       | ...
|     4      |     52      |      18      |  Poland    |   Cracow    |    Dluga     |      2       | ...
|    ...     |     ...     |     ...      |    ...     |     ...     |     ...      |     ...      | ...

And I want to achieve something like this:

dict = {
    "score": 2,
    "lat": 51,
    "lon": 19,
    "loc": {
        "country": "Poland",
        "city": "Warsaw",
        "street": "Cicha",
        "st_nr": 1
    }
}

What I've done so far is get the list of keys from the header and the list of values from each row, then zip them together:

...
wb = load_workbook(file_obj)
worksheet = wb.active
rows = worksheet.iter_rows(values_only=True)
header = next(rows)
for row in rows:
    values = row
    order = dict(zip(header, row))
...

But this doesn't make a nested dict. What I get is:

    dict = {
        "score": 2,
        "lat": 51,
        "lon": 19,
        "loc.country": "Poland",
        "loc.city": "Warsaw",
        "loc.street": "Cicha",
        "loc.st_nr": 1
    }

How can I modify it to get the expected result?

Upvotes: 0

Views: 250

Answers (2)

Martin Wettstein

Reputation: 2894

You can't do this with dict() alone, because it only builds a flat dictionary from your keys and values. If you want to step down a level for variable names with a dot in them, you'll have to use a custom function.

The function below will nest all variables with a dot in them if you pass a list of variable names and a list of values.

def nest_dict(keys, values):
    d = {}
    for key, value in zip(keys, values):
        if '.' in key:
            # Split on the first dot only; any further dots stay in the inner key.
            outer, inner = key.split('.', 1)
            # Create the inner dict on first use, then store the value.
            d.setdefault(outer, {})[inner] = value
        else:
            d[key] = value
    return d

In the case of your data, you would pass the header and row like this:

header = ["score","lat","lon","loc.country","loc.city","loc.street","loc.st_nr"]
row = [2,51,19,"Poland","Warsaw","Cicha",1]

print(nest_dict(header,row))

This prints the dictionary:

{'score': 2, 
 'lat': 51, 
 'lon': 19, 
 'loc': {'country': 'Poland', 
         'city': 'Warsaw', 
         'street': 'Cicha', 
         'st_nr': 1}
}

Note that this only works for one level. If you have variable names with multiple dots that should go more than one level deep, you'd have to adjust the function.
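For the multi-level case, one possible adjustment is to walk down the parent keys and create a dict at each step. This is only a sketch: the name `nest_deep` and the header `loc.geo.lat` are made up for illustration, and it assumes no name is used both as a value and as a parent (for example both `loc` and `loc.city`).

```python
def nest_deep(keys, values, sep='.'):
    """Nest values under dotted keys to any depth (illustrative helper)."""
    nested = {}
    for key, value in zip(keys, values):
        # Everything before the last separator is a chain of parent keys.
        *parents, leaf = key.split(sep)
        node = nested
        for part in parents:
            # Step into the child dict, creating it if it does not exist yet.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Hypothetical header with a key nested two levels deep
header = ["score", "loc.city", "loc.geo.lat", "loc.geo.lon"]
row = [2, "Warsaw", 51, 19]

print(nest_deep(header, row))
# {'score': 2, 'loc': {'city': 'Warsaw', 'geo': {'lat': 51, 'lon': 19}}}
```

Keys without a dot have no parents, so they are stored at the top level just like before.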

Upvotes: 1

Noah

Reputation: 632

Not particularly the best option, but I couldn't be bothered to install the package xD

Input:

flat = {
    "score": 2,
    "lat": 51,
    "lon": 19,
    "loc.country": "Poland",
    "loc.city": "Warsaw",
    "loc.street": "Cicha",
    "loc.st_nr": 1
}

Function:

mydict = {}
for key, value in flat.items():
    parts = key.split('.')
    if len(parts) == 2:
        # Create the inner dict on first use, then store the value under the sub-key.
        mydict.setdefault(parts[0], {})[parts[1]] = value
    else:
        # Keys without exactly one dot are copied over unchanged.
        mydict[key] = value

print(mydict)

Output:

{'score': 2, 'lat': 51, 'lon': 19, 'loc': {'country': 'Poland', 'city': 'Warsaw', 'street': 'Cicha', 'st_nr': 1}}
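Since `dict(zip(header, row))` from the question already produces a flat dict like the input above, the same idea could be wrapped in a function and applied to every row. A rough sketch: `nest_flat` is a made-up name, and the list of tuples is only a stand-in for `worksheet.iter_rows(values_only=True)` so it runs without a workbook.

```python
def nest_flat(flat):
    """Turn {'loc.city': ...} into {'loc': {'city': ...}}, one level deep."""
    nested = {}
    for key, value in flat.items():
        # partition splits on the first dot; dot is '' when there is none.
        outer, dot, inner = key.partition('.')
        if dot:
            nested.setdefault(outer, {})[inner] = value
        else:
            nested[key] = value
    return nested


# Stand-in for worksheet.iter_rows(values_only=True)
sample_rows = [
    ("score", "lat", "lon", "loc.country", "loc.city", "loc.street", "loc.st_nr"),
    (2, 51, 19, "Poland", "Warsaw", "Cicha", 1),
    (4, 52, 18, "Poland", "Cracow", "Dluga", 2),
]

rows = iter(sample_rows)
header = next(rows)  # first row holds the column names
orders = [nest_flat(dict(zip(header, row))) for row in rows]

print(orders[1]["loc"]["city"])  # Cracow
```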

Upvotes: 2
