Reputation: 43
I have the following data, in which each dict has a nested list "Emotions". I am trying to find an easy way to return the two "Type" values with the largest "Confidence" (for this data, "CONFUSED" and "ANGRY"):
[
{
"AgeRange": {
"High": 52,
"Low": 36
},
"Emotions": [
{
"Confidence": 22.537073135375977,
"Type": "ANGRY"
},
{
"Confidence": 1.3983955383300781,
"Type": "SAD"
},
{
"Confidence": 1.2260702848434448,
"Type": "DISGUSTED"
},
{
"Confidence": 2.291703939437866,
"Type": "FEAR"
},
{
"Confidence": 8.114240646362305,
"Type": "HAPPY"
},
{
"Confidence": 10.546235084533691,
"Type": "SURPRISED"
},
{
"Confidence": 18.409439086914062,
"Type": "CALM"
},
{
"Confidence": 35.47684097290039,
"Type": "CONFUSED"
}
],
}
]
I have tried things like dictmax = max(dict[Emotions][Confidence] key=dict.get)
but that doesn't seem to work, and I am at a loss. I feel like there should be an easy way to retrieve just the Type based on the value of Confidence.
Upvotes: 2
Views: 217
Reputation: 2905
Ch3steR's answer works, but I'd like to propose a solution with pandas, a data analysis library built around DataFrame objects, which allow for easy data manipulation.
Let's take just the relevant part of your example:
emotions = [{'Confidence': 22.537073135375977, 'Type': 'ANGRY'},
{'Confidence': 1.3983955383300781, 'Type': 'SAD'},
{'Confidence': 1.2260702848434448, 'Type': 'DISGUSTED'},
{'Confidence': 2.291703939437866, 'Type': 'FEAR'},
{'Confidence': 8.114240646362305, 'Type': 'HAPPY'},
{'Confidence': 10.546235084533691, 'Type': 'SURPRISED'},
{'Confidence': 18.409439086914062, 'Type': 'CALM'},
{'Confidence': 35.47684097290039, 'Type': 'CONFUSED'}]
This can be cast into a pandas DataFrame object:
import pandas as pd
pd.DataFrame(emotions)
yields
   Confidence       Type
0   22.537073      ANGRY
1    1.398396        SAD
2    1.226070  DISGUSTED
3    2.291704       FEAR
4    8.114241      HAPPY
5   10.546235  SURPRISED
6   18.409439       CALM
7   35.476841   CONFUSED
This object can be sorted by any column (e.g. Confidence) with the .sort_values method. The last two rows (or any other number) can then be selected with the .tail(2) method, and finally the 'Type' column can be selected. To sum it up:
pd.DataFrame(emotions).sort_values('Confidence').tail(2)['Type'].values
yields
array(['ANGRY', 'CONFUSED'], dtype=object)
If you want the top 1, and not top n (for n>1), it's faster and simpler to search for the maximum instead of sorting:
df = pd.DataFrame(emotions)
df.loc[df['Confidence'].idxmax(), 'Type']
yields
'CONFUSED'
This is not faster than Ch3steR's answer, but the code is more straightforward (once you know the pandas library) and easy to "scale up" to more complex data analysis that you might need later on.
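For the top-1 case, the same "search for the maximum instead of sorting" idea can also be sketched without pandas. This is only a minimal illustration, assuming the emotions list from above; the names top and top_type are made up for the example.

```python
from operator import itemgetter

# Sample data copied from the question (only the "Emotions" list).
emotions = [
    {'Confidence': 22.537073135375977, 'Type': 'ANGRY'},
    {'Confidence': 1.3983955383300781, 'Type': 'SAD'},
    {'Confidence': 1.2260702848434448, 'Type': 'DISGUSTED'},
    {'Confidence': 2.291703939437866, 'Type': 'FEAR'},
    {'Confidence': 8.114240646362305, 'Type': 'HAPPY'},
    {'Confidence': 10.546235084533691, 'Type': 'SURPRISED'},
    {'Confidence': 18.409439086914062, 'Type': 'CALM'},
    {'Confidence': 35.47684097290039, 'Type': 'CONFUSED'},
]

# max() scans the list once; itemgetter('Confidence') picks the value to compare.
top = max(emotions, key=itemgetter('Confidence'))  # hypothetical name
top_type = top['Type']                             # hypothetical name
print(top_type)  # CONFUSED
```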
Upvotes: 1
Reputation: 20659
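You can sort each record's "Emotions" list by "Confidence" in descending order and keep the first two entries:
for d in my_list:
    out = sorted(d['Emotions'], key=lambda x: x['Confidence'], reverse=True)[:2]
For the example data, out is:
[{'Confidence': 35.47684097290039, 'Type': 'CONFUSED'},
{'Confidence': 22.537073135375977, 'Type': 'ANGRY'}]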
You can also use nlargest from heapq, which returns the same two entries without sorting the whole list:
from heapq import nlargest
for d in my_list:
    out = nlargest(2, d['Emotions'], key=lambda x: x['Confidence'])
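Since the question asks for just the "Type" values, the nlargest approach can be wrapped as sketched below. The helper name top_emotion_types is hypothetical, and my_list is assumed to be the list from the question, reduced here to the fields that matter.

```python
from heapq import nlargest
from operator import itemgetter


def top_emotion_types(record, n=2):
    """Return the "Type" of the n emotions with the highest "Confidence"."""
    best = nlargest(n, record['Emotions'], key=itemgetter('Confidence'))
    return [emotion['Type'] for emotion in best]


# Data copied from the question.
my_list = [
    {
        "AgeRange": {"High": 52, "Low": 36},
        "Emotions": [
            {"Confidence": 22.537073135375977, "Type": "ANGRY"},
            {"Confidence": 1.3983955383300781, "Type": "SAD"},
            {"Confidence": 1.2260702848434448, "Type": "DISGUSTED"},
            {"Confidence": 2.291703939437866, "Type": "FEAR"},
            {"Confidence": 8.114240646362305, "Type": "HAPPY"},
            {"Confidence": 10.546235084533691, "Type": "SURPRISED"},
            {"Confidence": 18.409439086914062, "Type": "CALM"},
            {"Confidence": 35.47684097290039, "Type": "CONFUSED"},
        ],
    }
]

for d in my_list:
    print(top_emotion_types(d))  # ['CONFUSED', 'ANGRY']
```

The result is ordered from highest to lowest confidence; passing n=1 gives only the single strongest emotion.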
Upvotes: 2