killyb

Reputation: 43

How do I find the largest value of an item in a nested dict in Python?

I have the following dict with a nested "Emotions" list (shown below).

I am trying to find an easy way to return the two Emotion "Type" values with the largest "Confidence" values (for this dict, "CONFUSED" and "ANGRY"):

[
    {
        "AgeRange": {
            "High": 52,
            "Low": 36
        },
        "Emotions": [
            {
                "Confidence": 22.537073135375977,
                "Type": "ANGRY"
            },
            {
                "Confidence": 1.3983955383300781,
                "Type": "SAD"
            },
            {
                "Confidence": 1.2260702848434448,
                "Type": "DISGUSTED"
            },
            {
                "Confidence": 2.291703939437866,
                "Type": "FEAR"
            },
            {
                "Confidence": 8.114240646362305,
                "Type": "HAPPY"
            },
            {
                "Confidence": 10.546235084533691,
                "Type": "SURPRISED"
            },
            {
                "Confidence": 18.409439086914062,
                "Type": "CALM"
            },
            {
                "Confidence": 35.47684097290039,
                "Type": "CONFUSED"
            }
        ],
    }
]

I have tried things like dictmax = max(dict[Emotions][Confidence] key=dict.get), but that doesn't seem to work, and I am at a loss. I feel like there should be an easy way to retrieve just the Type based on the value of Confidence.

Upvotes: 2

Views: 217

Answers (2)

Itamar Mushkin

Reputation: 2905

Ch3steR's answer works, but I'd like to propose a solution with pandas, which is a library for dealing with data analysis (using DataFrame objects, which allow for easy data manipulation).

Let's take just the relevant part of your example:

emotions = [{'Confidence': 22.537073135375977, 'Type': 'ANGRY'},
 {'Confidence': 1.3983955383300781, 'Type': 'SAD'},
 {'Confidence': 1.2260702848434448, 'Type': 'DISGUSTED'},
 {'Confidence': 2.291703939437866, 'Type': 'FEAR'},
 {'Confidence': 8.114240646362305, 'Type': 'HAPPY'},
 {'Confidence': 10.546235084533691, 'Type': 'SURPRISED'},
 {'Confidence': 18.409439086914062, 'Type': 'CALM'},
 {'Confidence': 35.47684097290039, 'Type': 'CONFUSED'}]

This can be cast into a pandas DataFrame object:

import pandas as pd
pd.DataFrame(emotions)

yields

    Confidence  Type
0   22.537073   ANGRY
1   1.398396    SAD
2   1.226070    DISGUSTED
3   2.291704    FEAR
4   8.114241    HAPPY
5   10.546235   SURPRISED
6   18.409439   CALM
7   35.476841   CONFUSED

This object can be sorted by any column (e.g. Confidence) with the .sort_values method. The last two rows (or any other number of rows) can then be selected with the .tail(2) method, and finally the 'Type' column can be selected. To sum it up:

pd.DataFrame(emotions).sort_values('Confidence').tail(2)['Type'].values

yields

array(['ANGRY', 'CONFUSED'], dtype=object)

If you want the top 1, and not top n (for n>1), it's faster and simpler to search for the maximum instead of sorting:

df = pd.DataFrame(emotions)  # emotions is a plain list, so build the DataFrame first
df.loc[df['Confidence'].idxmax(), 'Type']

yields

'CONFUSED'

This is not faster than Ch3steR's answer, but the code is more straightforward (once you know the pandas library) and easy to "scale up" to more complex data analysis that you might need later on.

Upvotes: 1

Ch3steR

Reputation: 20659

You can sort each entry's Emotions by Confidence in descending order and keep the first two:

for d in my_list:
    out = sorted(d['Emotions'], key=lambda x: x['Confidence'], reverse=True)[:2]

[{'Confidence': 35.47684097290039, 'Type': 'CONFUSED'},
 {'Confidence': 22.537073135375977, 'Type': 'ANGRY'}]
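
Since the question asks for just the Type, and the loop above overwrites out on every iteration, here is a minimal sketch that collects the top Type names per entry. The helper name top_types and the trimmed sample data are only for illustration; the data keeps four of the eight emotions from the question.

```python
# Sample data trimmed from the question; `my_list` is the name used in this answer.
my_list = [
    {
        "AgeRange": {"High": 52, "Low": 36},
        "Emotions": [
            {"Confidence": 22.537073135375977, "Type": "ANGRY"},
            {"Confidence": 10.546235084533691, "Type": "SURPRISED"},
            {"Confidence": 18.409439086914062, "Type": "CALM"},
            {"Confidence": 35.47684097290039, "Type": "CONFUSED"},
        ],
    }
]


def top_types(record, n=2):
    """Return the Type of the n emotions with the highest Confidence."""
    ranked = sorted(record["Emotions"], key=lambda e: e["Confidence"], reverse=True)
    return [e["Type"] for e in ranked[:n]]


# One result per entry of my_list, so nothing is overwritten between iterations.
top_per_record = [top_types(d) for d in my_list]
print(top_per_record)  # [['CONFUSED', 'ANGRY']]
```

With several entries in my_list, top_per_record holds one list of Type names per entry, in the same order.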

You can also use nlargest from heapq, which does not need to sort the whole list.

from heapq import nlargest
for d in my_list:  # same list as in the first snippet
    out = nlargest(2, d['Emotions'], key=lambda x: x['Confidence'])
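
If only the single most confident emotion is needed, the built-in max with a key function is probably enough, and it is close to what the question was attempting. A minimal sketch, assuming emotions (a name chosen here for illustration) holds one entry's "Emotions" list, trimmed to three items:

```python
from operator import itemgetter

# One entry's "Emotions" list, trimmed from the question's data.
emotions = [
    {"Confidence": 22.537073135375977, "Type": "ANGRY"},
    {"Confidence": 18.409439086914062, "Type": "CALM"},
    {"Confidence": 35.47684097290039, "Type": "CONFUSED"},
]

# max() walks the list once; the key function tells it to compare by Confidence.
best = max(emotions, key=itemgetter("Confidence"))
print(best["Type"])  # CONFUSED
```

The attempt in the question fails because max needs the list of dicts as its first argument and a key that reads "Confidence" from each item, rather than dict.get.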

Upvotes: 2
