Reputation: 1527
I have an array of arrays and I want to get the row with the max value for each id. In the following example, column 2 represents the id and column 4 the value. When id = 1 the max value is 308.45. When id = 2 the max value is 310.508474.
input:
[['X', '1', '0', '303.016666'],
['X1', '1', '1', '305.516666'],
['X2', '1', '2', '308.45'],
['X3', '2', '0', '309.409836'],
['X4', '2', '1', '310.508474'],
['X5', '2', '2', '308.728813']]
output:
[['X2', '1', '2', '308.45'],
['X4', '2', '1', '310.508474']]
How can I do that?
Upvotes: 2
Views: 454
Reputation: 294488
Using pandas:
import pandas as pd
df = pd.DataFrame([
['X', 1, 0, 303.016666],
['X1', 1, 1, 305.516666],
['X2', 1, 2, 308.45],
['X3', 2, 0, 309.409836],
['X4', 2, 1, 310.508474],
['X5', 2, 2, 308.728813]]
)
print(df.values[df.groupby(1)[3].idxmax()])
[['X2' 1 2 308.45]
['X4' 2 1 310.508474]]
Pure numpy:
import numpy as np
a = np.array([
['X', 1, 0, 303.016666],
['X1', 1, 1, 305.516666],
['X2', 1, 2, 308.45],
['X3', 2, 0, 309.409836],
['X4', 2, 1, 310.508474],
['X5', 2, 2, 308.728813]
], dtype=object)
ids = np.unique(a[:, 1])
grp = np.where(ids == a[:, [1]], 1, np.nan)
expanded_value_column = grp * a[:, [3]].astype(float)
max_positions = np.nanargmax(expanded_value_column, axis=0)
print(a[max_positions])
[['X2' 1 2 308.45]
['X4' 2 1 310.508474]]
Upvotes: 5
Reputation: 48090
You can use a dict comprehension, with a set to collect the unique ids:
my_data = [
['X', '1', '0', '303.016666'],
['X1', '1', '1', '305.516666'],
['X2', '1', '2', '308.45'],
['X3', '2', '0', '309.409836'],
['X4', '2', '1', '310.508474'],
['X5', '2', '2', '308.728813']]
# Unique ids
my_id = {row[1] for row in my_data}
# Compare values numerically (key=float); plain string comparison is lexicographic
my_max = {id_: max((val for _, i, _, val in my_data if i == id_), key=float) for id_ in my_id}
# Content of 'my_max': {'1': '308.45', '2': '310.508474'}
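The comprehension above yields only the maximum value per id, while the question's expected output keeps the whole row. A minimal sketch of one way to get full rows, assuming it is acceptable to sort the data by id first (the names `rows_sorted` and `result` are illustrative, and `itertools.groupby` is swapped in here in place of the dict comprehension):

```python
from itertools import groupby
from operator import itemgetter

my_data = [
    ['X', '1', '0', '303.016666'],
    ['X1', '1', '1', '305.516666'],
    ['X2', '1', '2', '308.45'],
    ['X3', '2', '0', '309.409836'],
    ['X4', '2', '1', '310.508474'],
    ['X5', '2', '2', '308.728813']]

# groupby only merges adjacent items, so sort by the id column first.
rows_sorted = sorted(my_data, key=itemgetter(1))

# For each id, keep the row whose value (column 4) is numerically largest.
result = [
    max(group, key=lambda row: float(row[3]))
    for _, group in groupby(rows_sorted, key=itemgetter(1))
]
print(result)
```

Note that the ids are sorted as strings here, so '10' would come before '2'. That only affects the order of the output rows, not which row is picked per id.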
Upvotes: 0
Reputation: 2441
The simplest and most intuitive solution I can imagine:
>>> l = [['X', '1', '0', '303.016666'],
... ['X1', '1', '1', '305.516666'],
... ['X2', '1', '2', '308.45'],
... ['X3', '2', '0', '309.409836'],
... ['X4', '2', '1', '310.508474'],
... ['X5', '2', '2', '308.728813']]
>>> result = {}
>>> for a, b, c, d in l:
...     if b not in result or float(d) > float(result[b][2]):
...         result[b] = (a, c, d)
...
>>> result
{'1': ('X2', '2', '308.45'), '2': ('X4', '1', '310.508474')}
>>> result = [(a, b, c, d) for b, (a, c, d) in result.items()]
>>> result
[('X2', '1', '2', '308.45'), ('X4', '2', '1', '310.508474')]
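The float() calls in the loop matter because the values are stored as strings, and strings compare character by character rather than by magnitude. A small sketch of the difference, using made-up values that are not from the question:

```python
# Hypothetical values chosen to show the pitfall; not taken from the question.
values = ['99.5', '308.45']

# String comparison is lexicographic: '9' sorts after '3'.
largest_as_text = max(values)

# Converting with float compares the actual numbers.
largest_as_number = max(values, key=float)

print(largest_as_text, largest_as_number)
```

With the question's data the two happen to agree, since every value has three digits before the decimal point.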
Upvotes: 2