Reputation: 3
I have the following dataframe:
X Y Z A B
1 7.9 0.2 0.6 0.3 0.75
2 8.9 0.2 0.1 0.8 0.60
3 8.9 0.2 0.9 0.0 0.45
4 4.3 0.3 2.3 0.3 0.35
5 7.8 0.3 2.0 0.6 0.36
6 8.9 0.3 1.3 0.4 0.34
7 1.2 3.4 4.0 3.0 3.50
8 7.8 9.0 0.0 9.0 9.50
The dataframe is grouped using the 'Y' column. In each group, I would like to get the average value for 'X' and max for 'Z', 'A', and 'B'.
The result should look like this:
X Y Z A B
0 8.56 0.2 0.9 0.8 0.75
1 7.00 0.3 2.3 0.6 0.36
2 1.20 3.4 4.0 3.0 3.50
3 7.80 9.0 0.0 9.0 9.50
Upvotes: 0
Views: 60
Reputation: 5802
Assuming you're talking about a pandas dataframe you can use df.agg
, specifying an aggregation function for each column.
import pandas as pd
df = pd.DataFrame({'X': [7.9, 8.9, 8.9, 4.3, 7.8, 8.9, 1.2, 7.8],
'Y': [0.2, 0.2, 0.2, 0.3, 0.3, 0.3, 3.4, 9.0],
'Z': [0.6, 0.1, 0.9, 2.3, 2.0, 1.3, 4.0, 0.0],
'A': [0.3, 0.8, 0.0, 0.3, 0.6, 0.4, 3.0, 9.0],
'B': [0.75, 0.6, 0.45, 0.35, 0.36, 0.34, 3.5, 9.5]}))
df2 = df.groupby("Y", as_index=False).agg({"X":"mean",
"Z":"max",
"A":"max",
"B":"max"})
gives
>>> df2
Y X Z A B
0 0.2 8.566667 0.9 0.8 0.75
1 0.3 7.000000 2.3 0.6 0.36
2 3.4 1.200000 4.0 3.0 3.50
3 9.0 7.800000 0.0 9.0 9.50
Upvotes: 2
Reputation: 117
If I got your question right, then the main thing here is how you are storing the data. The first way on my mind (maybe not the best one) is dictionary of dictionaries
data = Dict("Y-value1" ==> Dict(\
"X" ==> [1, 2, 3]\
"Z" ==> [4, 5, 6]\
"A" ==> [7, 8, 9]\
"B" ==> [10, 11, 12])\
"Y-value2" ==> Dict(\
"X" ==> [1, 2, 3]\
"Z" ==> [4, 5, 6]\
"A" ==> [7, 8, 9]\
"B" ==> [10, 11, 12])) # And so on
Then you could just use those standard functions for finding max, min and average (or write them by yourself)
Upvotes: 0
Reputation: 459
For getting the max values, you can try something like this:
max_value = max(df['column_name'].values)
A similar approach for getting the average as well.
Upvotes: 0