guille eh

Reputation: 9

How to compare a value with the column values in a pandas DataFrame

        000012  000013   000014   ...    004004  005585  007682
0          0     3.8      3.7   ...       1.1     4.8     0.4
1          0       0      0.0   ...       0.0       5     7.8
2          0       0      0.0   ...       0.0     1.6     2.1
3          0       0      2.0   ...       2.3       0     0.4
4          0       0      1.3   ...       0.2     1.3     0.1
5          0       0      0.0   ...       0.0     4.1     3.5
6          0       0      0.0   ...       0.6     0.2     0.3
7          0       0      0.0   ...       0.0       0     7.1
8          0       0      0.0   ...       0.0       0     0.0

I have a DataFrame like this. I need to check the values in each column to find out how many times a value greater than 1 appears in that column.

I have tried this:

s.set_index(s.index).gt(1).sum(1).reset_index(name='result').fillna(s)

but it raises an error: Could not operate 1 with block values '>' not supported between instances of 'numpy.ndarray' and 'int'

The values of the columns are floats.

Does anyone know how I can solve this? Thanks!

Upvotes: 1

Views: 7202

Answers (3)

Ikbel

Reputation: 2203

Try this:

import pandas as pd

# Keys are the column names; each value is how many entries in that column are greater than 1.
dc = {}
for i in dataframe.columns:
    dc[i] = dataframe.loc[dataframe[i].gt(1), i].count()
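
The same per-column count can probably be obtained without an explicit loop. The following is a minimal sketch, assuming all columns are already numeric; the sample values are hypothetical and only mimic the column names from the question.

```python
import pandas as pd

# Hypothetical sample data; column names mimic the question's table.
dataframe = pd.DataFrame({
    "000013": [3.8, 0.0, 0.0],
    "005585": [4.8, 5.0, 1.6],
    "007682": [0.4, 7.8, 2.1],
})

# gt(1) gives a boolean frame; summing down each column counts the True cells.
counts = dataframe.gt(1).sum()

# Convert to a dict keyed by column name, like dc in the loop version.
dc = counts.to_dict()
print(dc)
```

Note that sum() with no argument works column by column, whereas sum(1) (as in the question's attempt) would count per row instead.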

Upvotes: 0

BENSALEM Mouhssen

Reputation: 71

Please try this code:

import pandas as pd
import numpy as np

# Random sample data: 9 rows, 4 columns
datan = np.random.randn(36).reshape(9, 4)
df = pd.DataFrame(data=datan, columns=list("ABCD"))

output = {}
for c in df.columns:
    # Count the values greater than 1 in column c
    output[c] = (df[c] > 1).sum()
df2 = pd.DataFrame(output, index=[0])
df2
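
The error quoted in the question may indicate that some columns are stored with object dtype (for example, numbers read in as strings) rather than as floats. This is only a guess from the error message. A minimal sketch under that assumption, using a hypothetical frame named s, converts the columns to numeric first and then counts:

```python
import pandas as pd

# Hypothetical frame whose values were read as strings (object dtype).
s = pd.DataFrame({
    "000012": ["0", "0", "0"],
    "005585": ["4.8", "5", "1.6"],
})

# Convert every column to numbers; anything unparseable becomes NaN.
numeric = s.apply(pd.to_numeric, errors="coerce")

# Count the values greater than 1 in each column (NaN never compares as True).
result = numeric.gt(1).sum()
print(result)
```

Checking s.dtypes before and after the conversion would show whether this assumption applies to the real data.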

Upvotes: 0

LoopingDev

Reputation: 844

I can't give you the exact code because your table isn't clear, but you can try using query():

df_filtered = df.query('a > 1')

where a is the header of the column you want to filter on.

To combine multiple conditions, put & between them:

df_filtered = df.query('a > 1 & b > 1')
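
As a small illustration of how query() behaves, here is a sketch with a hypothetical two-column frame (the names a and b and the values are assumptions, not taken from the question). query() returns the matching rows, so a count has to be taken from the filtered result:

```python
import pandas as pd

# Hypothetical data with two numeric columns named a and b.
df = pd.DataFrame({"a": [0.5, 2.0, 3.0], "b": [1.5, 0.2, 4.0]})

# Keep only the rows where both conditions hold.
df_filtered = df.query("a > 1 & b > 1")

# The number of matching rows is the length of the filtered frame.
n_rows = len(df_filtered)
print(df_filtered)
print(n_rows)
```

This counts rows that satisfy all conditions at once, which is different from counting the values greater than 1 in each column separately.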

Upvotes: 1
