Reputation: 823
Assume that I have the following dataframe:
+---+---------+------+------+------+
| | summary | col1 | col2 | col3 |
+---+---------+------+------+------+
| 0 | count | 10 | 10 | 10 |
+---+---------+------+------+------+
| 1 | mean | 4 | 5 | 5 |
+---+---------+------+------+------+
| 2 | stddev | 3 | 3 | 3 |
+---+---------+------+------+------+
| 3 | min | 0 | -1 | 5 |
+---+---------+------+------+------+
| 4 | max | 100 | 56 | 47 |
+---+---------+------+------+------+
How can I keep only the columns where count > 5, mean > 4 and min > 0, including the column summary as well?
The desired output is:
+---+---------+------+
| | summary | col3 |
+---+---------+------+
| 0 | count | 10 |
+---+---------+------+
| 1 | mean | 5 |
+---+---------+------+
| 2 | stddev | 3 |
+---+---------+------+
| 3 | min | 5 |
+---+---------+------+
| 4 | max | 47 |
+---+---------+------+
Upvotes: 1
Views: 933
Reputation: 294218
query
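(
    df.set_index('summary')
      .rename(str.title).T   # title-case the row labels, then one row per original column
      .query('Count > 5 and Mean > 4 and Min > 0')
      .T.rename(str.lower)   # transpose back and restore the lowercase labels
      .reset_index()
)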
summary col3
0 count 10
1 mean 5
2 stddev 3
3 min 5
4 max 47
(
    df[['summary']].join(
        # rows 0, 1 and 3 are count, mean and min; keep columns that exceed all three thresholds
        df.iloc[:, 1:].loc[:, df.iloc[[0, 1, 3], 1:].T.gt([5, 4, 0]).all(axis=1)]
    )
)
summary col3
0 count 10
1 mean 5
2 stddev 3
3 min 5
4 max 47
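For readers who find the positional one-liner hard to follow, here is a minimal step-by-step sketch of the same idea. It assumes the frame is built as in the question and that rows 0, 1 and 3 hold count, mean and min. The names stats, passes, kept and result are illustrative only.

```python
import pandas as pd

# Assumed reconstruction of the frame shown in the question
df = pd.DataFrame({
    "summary": ["count", "mean", "stddev", "min", "max"],
    "col1": [10, 4, 3, 0, 100],
    "col2": [10, 5, 3, -1, 56],
    "col3": [10, 5, 3, 5, 47],
})

# Rows 0, 1 and 3 hold count, mean and min; skip the 'summary' column,
# then transpose so each original column becomes one row.
stats = df.iloc[[0, 1, 3], 1:].T

# Compare each statistic with its threshold (strictly greater than)
# and require all three comparisons to hold for a column.
passes = stats.gt([5, 4, 0]).all(axis=1)

# Names of the columns that passed every check
kept = passes[passes].index.tolist()

result = df[["summary"] + kept]
print(result)
```

Because the rows are picked by position, this only works while the summary rows stay in the order shown in the question.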
Upvotes: 1
Reputation: 2757
Use loc with a callable:
(df.set_index('summary').T
.loc[lambda x: (x['count'] > 5) & (x['mean'] > 4) & (x['min'] > 0)]
.T.reset_index())
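A variant that skips both transposes, sketched under the assumption that df is the frame from the question: the callable is passed as the column indexer of loc and returns one boolean per column. The name stats is illustrative.

```python
import pandas as pd

# Assumed reconstruction of the frame shown in the question
df = pd.DataFrame({
    "summary": ["count", "mean", "stddev", "min", "max"],
    "col1": [10, 4, 3, 0, 100],
    "col2": [10, 5, 3, -1, 56],
    "col3": [10, 5, 3, 5, 47],
})

stats = df.set_index("summary")

# The callable receives the frame and returns a boolean mask over its columns;
# each d.loc[label] is the row of that statistic across col1..col3.
result = stats.loc[
    :,
    lambda d: (d.loc["count"] > 5) & (d.loc["mean"] > 4) & (d.loc["min"] > 0),
].reset_index()

print(result)
```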
Upvotes: 2
Reputation: 1897
Set the summary column as the index and then do this:
df.T.query("(count > 5) & (mean > 4) & (min > 0)").T
Upvotes: 0
Reputation: 323226
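Here is one way:
s = df.set_index('summary')
# thresholds, keyed by the summary row each one applies to
com = pd.Series([5, 4, 0], index=['count', 'mean', 'min'])
# columns whose count, mean and min all exceed their thresholds
idx = s.loc[com.index].gt(com, axis=0).all().loc[lambda x: x].index
s[idx]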
Out[142]:
col3
summary
count 10
mean 5
stddev 3
min 5
max 47
Upvotes: 1
Reputation: 8631
You need:
df2 = df.set_index('summary').T
m1 = df2['count'] > 5
m2 = df2['mean'] > 4
m3 = df2['min'] > 0
df2.loc[m1 & m2 & m3].T.reset_index()
Output:
summary col3
0 count 10
1 mean 5
2 stddev 3
3 min 5
4 max 47
Note: You can use the conditions directly in .loc[], but with multiple conditions it is clearer to use separate mask variables (m1, m2, m3).
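For comparison, a minimal sketch of the inline form mentioned in the note, assuming df is the frame from the question. Each comparison needs its own parentheses because & binds more tightly than >.

```python
import pandas as pd

# Assumed reconstruction of the frame shown in the question
df = pd.DataFrame({
    "summary": ["count", "mean", "stddev", "min", "max"],
    "col1": [10, 4, 3, 0, 100],
    "col2": [10, 5, 3, -1, 56],
    "col3": [10, 5, 3, 5, 47],
})

df2 = df.set_index("summary").T

# Same filter as m1 & m2 & m3, written directly inside .loc[]
result = df2.loc[
    (df2["count"] > 5) & (df2["mean"] > 4) & (df2["min"] > 0)
].T.reset_index()

print(result)
```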
Upvotes: 3