Reputation:
I am new to Python and I have a dict. I would like to find the entries with the maximum value within each group. For example, the entries at keys 0 and 1 share the same first element, '1', so among those two I would like to pick out the one with the maximum value, which is 0.8.
0: ['1', 'Metrolink', 0.7054569125175476],
1: ['1', 'Toronto', 0.8],
Likewise, I would like to do the same for all the other groups.
This is my complete dict.
d={
0: ['1', 'Metrolink', 0.7054569125175476],
1: ['1', 'Toronto', 0.8],
4: ['2', 'Residence Inn Bentonville', 0.721284806728363],
5: ['2', 'Bentonville, Arkansas', 0.8],
7: ['2', 'Rogers', 0.5609406232833862],
8: ['2', 'Toronto', 0.8],
10: ['2', 'Arkansas', 0.8871413469314575],
12: ['2', 'CA', 0.5339972972869873],
14: ['3', 'Toronto', 0.8],
19: ['3', 'ik', 0.555569052696228],
21: ['4', 'DL', 0.47785162925720215],
22: ['4', 'MS', 0.5182732939720154],
23: ['4', 'Nashville International Airport', 0.8],
27: ['4', 'Turkey', 0.8],
30: ['5', 'Hebron, Kentucky', 0.8],
32: ['5', 'OAK PARK', 0.6157999038696289],
35: ['5', 'USA', 0.5055036544799805],
36: ['5', 'Tennessee', 0.5752009153366089],
37: ['5', 'Recov', 0.6585434675216675],
38: ['5', 'County (United States)', 0.8],
40: ['6', 'SFO', 0.6019220948219299],
42: ['6', 'Ontario', 0.8],
45: ['7', 'United States', 0.6973987221717834],
47: ['7', 'Buckingham Gate', 0.8],
48: ['7', 'London', 0.9545853137969971],
53: ['8', 'Phoenix, Arizona', 0.8],
55: ['8', 'STE', 0.5046005249023438],
56: ['8', 'TULSA', 0.7144339680671692],
58: ['8', 'UNITED STATES OF AMERICA', 0.8454625606536865],
60: ['9', 'RDU', 0.6373313069343567],
61: ['9', 'Raleigh–Durham International Airport', 0.8],
65: ['9', 'Piauí', 0.8],
69: ['9', 'CAR', 0.6243148446083069],
71: ['10', 'MONMOUTH JUNCTION', 0.7259661555290222],
72: ['10', 'New Jersey', 0.8],
76: ['10', 'PVK', 0.6593300104141235],
79: ['10', 'TWW', 0.6495188474655151],
81: ['10', 'Morrisville, Bucks County, Pennsylvania', 0.8],
84: ['10', 'United States', 0.8],
88: ['10', 'New Brunswick, New Jersey', 0.8]
}
Upvotes: 1
Views: 219
Reputation: 33770
Pandas is a very effective tool for handling tabular data like this. You could create a pandas DataFrame from the data:
import pandas as pd
df = pd.DataFrame(d).T
df.columns = ('group', 'place', 'value')
and then just print out the rows that hold the maximum value in each group:
df[df['value'] == df.groupby('group')['value'].transform('max')]
which gives
Out[41]:
group place value
1 1 Toronto 0.8
10 2 Arkansas 0.887141
14 3 Toronto 0.8
23 4 Nashville International Airport 0.8
27 4 Turkey 0.8
30 5 Hebron, Kentucky 0.8
38 5 County (United States) 0.8
42 6 Ontario 0.8
48 7 London 0.954585
58 8 UNITED STATES OF AMERICA 0.845463
61 9 Raleigh–Durham International Airport 0.8
65 9 Piauí 0.8
72 10 New Jersey 0.8
81 10 Morrisville, Bucks County, Pennsylvania 0.8
84 10 United States 0.8
88 10 New Brunswick, New Jersey 0.8
If you want to get the output in the original format, you can use df.to_dict:
In [47]: df[df['value'] == df.groupby('group')['value'].transform('max')].T.to_dict(orient='list')
Out[47]:
{1: ['1', 'Toronto', 0.8],
10: ['2', 'Arkansas', 0.8871413469314575],
14: ['3', 'Toronto', 0.8],
23: ['4', 'Nashville International Airport', 0.8],
27: ['4', 'Turkey', 0.8],
30: ['5', 'Hebron, Kentucky', 0.8],
38: ['5', 'County (United States)', 0.8],
42: ['6', 'Ontario', 0.8],
48: ['7', 'London', 0.9545853137969971],
58: ['8', 'UNITED STATES OF AMERICA', 0.8454625606536865],
61: ['9', 'Raleigh–Durham International Airport', 0.8],
65: ['9', 'Piauí', 0.8],
72: ['10', 'New Jersey', 0.8],
81: ['10', 'Morrisville, Bucks County, Pennsylvania', 0.8],
84: ['10', 'United States', 0.8],
88: ['10', 'New Brunswick, New Jersey', 0.8]}
.T just takes the transpose of the table.
df.groupby('group')['value'] returns a SeriesGroupBy object, which behaves very much like a regular pandas.Series object. With that we can calculate the maximum value for each group, using the transform method.
df['value'] == df.groupby('group')['value'].transform('max') creates a boolean mask for selecting the maximum rows by df[mask].
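To make the mask easier to follow, here is a minimal sketch that splits the one-liner into named steps on a small subset of the data. The names sample, group_max, mask and winners are only illustrative, and the frame is built with pd.DataFrame.from_dict instead of the transpose used above, on the assumption that keeping the value column as floats is preferable.

```python
import pandas as pd

# A small subset of the question's dict (illustrative only).
sample = {
    0: ['1', 'Metrolink', 0.7054569125175476],
    1: ['1', 'Toronto', 0.8],
    21: ['4', 'DL', 0.47785162925720215],
    23: ['4', 'Nashville International Airport', 0.8],
    27: ['4', 'Turkey', 0.8],
}

# orient='index' turns each dict key into a row label.
df = pd.DataFrame.from_dict(sample, orient='index',
                            columns=['group', 'place', 'value'])

# Step 1: for every row, the maximum value of the group it belongs to.
group_max = df.groupby('group')['value'].transform('max')

# Step 2: True where a row's own value equals its group's maximum.
mask = df['value'] == group_max

# Step 3: keep only those rows; ties within a group are all kept.
winners = df[mask]
print(winners)
```

Because transform returns a Series aligned with the original index, the comparison in step 2 works row by row without any merging.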
Upvotes: 2
Reputation: 366
You could get a sorted dictionary with the following code:
dict(sorted(d.items(), key=lambda kv:(int(kv[1][0]), kv[1][2])))
If you want to sort based on the first element and the second element, you could use:
dict(sorted(d.items(), key=lambda kv:(int(kv[1][0]), kv[1][1])))
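Sorting alone only orders the entries; it does not pick out the maximum. One possible way to finish the job, sketched here on a small subset of the data, is to walk the sorted items with itertools.groupby and keep the last entry of each group. The names sample, ordered and max_per_group are illustrative and not part of the code above.

```python
from itertools import groupby

# A small subset of the question's dict (illustrative only).
sample = {
    0: ['1', 'Metrolink', 0.7054569125175476],
    1: ['1', 'Toronto', 0.8],
    21: ['4', 'DL', 0.47785162925720215],
    23: ['4', 'Nashville International Airport', 0.8],
    27: ['4', 'Turkey', 0.8],
}

# Sort by group number, then by value, as in the first snippet above.
ordered = sorted(sample.items(), key=lambda kv: (int(kv[1][0]), kv[1][2]))

# Within each group the entries are now in ascending value order,
# so the last one carries the group's maximum.
max_per_group = {}
for label, entries in groupby(ordered, key=lambda kv: kv[1][0]):
    key, row = list(entries)[-1]
    max_per_group[key] = row

print(max_per_group)
```

Note that this keeps only one entry per group. When several entries tie on the maximum, the stable sort means the one that appears last in the original dict is kept.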
Upvotes: 0
Reputation: 5877
It sounds like you want to get the maximum value across each sub-key (the first item of each entry's value). To do that, you can use this:
from collections import defaultdict
max_values = defaultdict(lambda: (float('-inf'), None))
for label, text, value in d.values():
max_values[label] = max(max_values[label], (value, text))
Using defaultdict here with a default value of (float('-inf'), None) allows us to compare each new value to the current max value without having to check if a max value was recorded in the first place. Since tuples compare element by element, ties on the value are broken by the text.
max_values ends up as:
{
'1': (0.8, 'Toronto'),
'2': (0.8871413469314575, 'Arkansas'),
'3': (0.8, 'Toronto'),
'4': (0.8, 'Turkey'),
'5': (0.8, 'Hebron, Kentucky'),
'6': (0.8, 'Ontario'),
'7': (0.9545853137969971, 'London'),
'8': (0.8454625606536865, 'UNITED STATES OF AMERICA'),
'9': (0.8, 'Raleigh–Durham International Airport'),
'10': (0.8, 'United States')
}
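The result above keeps a single winner per group. If every entry that ties for the maximum should be reported, a variant along the following lines could work. This is a sketch on a small subset of the data, and the names sample, best and winners are illustrative.

```python
from collections import defaultdict

# A small subset of the question's dict (illustrative only).
sample = {
    0: ['1', 'Metrolink', 0.7054569125175476],
    1: ['1', 'Toronto', 0.8],
    21: ['4', 'DL', 0.47785162925720215],
    23: ['4', 'Nashville International Airport', 0.8],
    27: ['4', 'Turkey', 0.8],
}

best = {}                    # label -> highest value seen so far
winners = defaultdict(list)  # label -> keys of the entries reaching that value

for key, (label, text, value) in sample.items():
    if label not in best or value > best[label]:
        # New maximum for this group: start a fresh list of winners.
        best[label] = value
        winners[label] = [key]
    elif value == best[label]:
        # Tie with the current maximum: keep this entry as well.
        winners[label].append(key)

print(dict(winners))
```

Storing the original dict keys rather than the text makes it easy to look the full entries up again with sample[key].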
Upvotes: 0