sundar_ima
sundar_ima

Reputation: 3890

Count occurance of values within condition between column or rows

I have a data set which goes like this:-

2017-03-01  31.8  28.0     32.6
2017-04-01  31.6  28.0     32.6
2017-05-01  31.0  27.0     32.6
2017-06-01  31.0  27.0     32.4
2017-07-01  31.0  27.0     31.4
2017-08-01  30.0  27.0     32.6

Apart from first column, rest of column are temperature. What I would like to do is to compare the values of 4th column (last from right) with other column values to find if temperature values are not more or less than two deg (of 4th column). For example, I would like to count how many times all three columns (row wise) has the value between 30.6 to 34.6.

Is there a function available under pandas to do that?

Upvotes: 0

Views: 93

Answers (4)

Zero
Zero

Reputation: 76917

Values of columns a b that is less than more than 2 of c

In [726]: (df[['a', 'b']].sub(df['c'], axis=0).abs() < 2).all(1).sum()
Out[726]: 0

In [727]: (df[['a', 'b']].sub(df['c'], axis=0).abs() < 2)
Out[727]:
       a      b
0   True  False
1   True  False
2   True  False
3   True  False
4   True  False
5  False  False

Values between 30.6 to 34.6

In [671]: (df[['a', 'b', 'c']] > 30.6) & (df[['a', 'b', 'c']] < 34.6)
Out[671]:
       a      b     c
0   True  False  True
1   True  False  True
2   True  False  True
3   True  False  True
4   True  False  True
5  False  False  True

Values between 30.6 to 34.6, for all columns in a row to be True

In [672]: ((df[['a', 'b', 'c']] > 30.6) & (df[['a', 'b', 'c']] < 34.6)).all(1)
Out[672]:
0    False
1    False
2    False
3    False
4    False
5    False
dtype: bool

Count of rows where values are between 30.6 to 34.6 for all columns

In [673]: ((df[['a', 'b', 'c']] > 30.6) & (df[['a', 'b', 'c']] < 34.6)).all(1).sum()
Out[673]: 0

Upvotes: 2

DYZ
DYZ

Reputation: 57033

If I understand your question correctly, you want to know how many times all of the values are withing the range ±2 of the fourth column (the columns are numbered 0,1,2,3):

(((df[3] - df[1]).abs() < 2) & ((df[3] - df[2]).abs() < 2)).sum()
#0

Upvotes: 2

BENY
BENY

Reputation: 323226

Base on you sample data , you can try :

df2[['t1','t2','t3']].apply(lambda x : abs(x-df2['t3'])<2).sum(axis=1)==3

Out[425]: 
0    False
1    False
2    False
3    False
4    False
5    False
dtype: bool

Data Input

    df2
    Out[426]: 
             Time    t1  t2    t3
    0  2017-03-01  31.8  28  32.6
    1  2017-04-01  31.6  28  32.6
    2  2017-05-01  31.0  27  32.6
    3  2017-06-01  31.0  27  32.4
    4  2017-07-01  31.0  27  31.4
    5  2017-08-01  30.0  27  32.6

Upvotes: 2

JCVanHamme
JCVanHamme

Reputation: 1050

You could use dataframe.apply to perform the comparison for each row, and sum the results of the comparison.

import pandas

df = pandas.DataFrame(
    data=[
        [31.8,  28.0,     32.6],
        [31.6,  28.0,     32.6],
        [31.0,  27.0,     32.6],
        [31.0,  27.0,     32.4],
        [31.0,  27.0,     31.4],
        [30.0,  27.0,     32.6]
    ],
    index=['2017-03-01', '2017-04-01', '2017-05-01', '2017-06-01', '2017-07-01', '2017-08-01']
)

df['count'] = df.apply(lambda x: sum((x > 30.6) & (x < 34.6)), axis=1)
print(df)

Since True == 1 and False == 0, you'll get a count of the columns that meet your criteria

Upvotes: 1

Related Questions