user3050590

Reputation: 1716

Select values based on conditions on multiple columns of a pandas DataFrame in Python

Here is my DataFrame, dim:

          var         types     count
0         var1       nominal      1
1         var2       ordinal      1
2         var3  quantitative      2
3         var4  quantitative      2

I want to get the values of dim["var"] for the rows where dim["types"] == "quantitative" and dim["count"] > 1. The result should be the list ['var3', 'var4']. When I try the following query:

print(dim["var"].where((dim["types"] =="quantitative") & (dim["count"] > 1)))

I get the following result:

0    NaN
1    NaN
2    NaN
3    NaN

I don't know how to get the desired result.

Upvotes: 2

Views: 83

Answers (2)

jezrael

Reputation: 862641

Use DataFrame.loc with a boolean mask:

L = dim.loc[(dim["types"] == "quantitative") & (dim["count"] > 1), "var"].tolist()
print(L)
['var3', 'var4']
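
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the frame from the question (the constructor call and the names mask and selected are my own, not from the original post) and applies the same loc selection:

```python
import pandas as pd

# Rebuild the example frame shown in the question.
dim = pd.DataFrame({
    "var": ["var1", "var2", "var3", "var4"],
    "types": ["nominal", "ordinal", "quantitative", "quantitative"],
    "count": [1, 1, 2, 2],
})

# Each comparison yields a boolean Series. The parentheses are required
# because & binds more tightly than == and >.
mask = (dim["types"] == "quantitative") & (dim["count"] > 1)

# Row selector first, column label second; tolist() returns a plain list.
selected = dim.loc[mask, "var"].tolist()
print(selected)
```

Naming the mask separately is only a readability choice; passing the expression inline to loc behaves the same.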

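Your output is expected, because Series.where keeps the original shape and replaces the values where the condition is False with missing values (NaN). For example, if no row satisfies the condition (here count > 2), every value becomes NaN: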

print ((dim["types"] =="quantitative") & (dim["count"] > 2))
0    False
1    False
2    False
3    False
dtype: bool

print(dim["var"].where((dim["types"] =="quantitative") & (dim["count"] > 2)))
0    NaN
1    NaN
2    NaN
3    NaN
Name: var, dtype: object

With dim["count"] > 1, as written in the question, rows 2 and 3 match and their values are kept:

print ((dim["types"] =="quantitative") & (dim["count"] > 1))
0    False
1    False
2     True
3     True
dtype: bool

print(dim["var"].where((dim["types"] =="quantitative") & (dim["count"] > 1)))
0     NaN
1     NaN
2    var3
3    var4
Name: var, dtype: object 
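
If you prefer to stay with Series.where, one possible approach is to drop the missing values afterwards. This is a sketch under the assumption that the frame looks exactly like the one in the question; the names masked and kept are only illustrative:

```python
import pandas as pd

# Rebuild the example frame shown in the question.
dim = pd.DataFrame({
    "var": ["var1", "var2", "var3", "var4"],
    "types": ["nominal", "ordinal", "quantitative", "quantitative"],
    "count": [1, 1, 2, 2],
})

mask = (dim["types"] == "quantitative") & (dim["count"] > 1)

# where() keeps all four rows and fills the non-matching ones with NaN.
masked = dim["var"].where(mask)

# Dropping the NaN entries leaves only the matching values.
kept = masked.dropna().tolist()
print(kept)
```

Note that dropna() would also discard values that were already missing in the column before masking, so loc with the mask is usually the more direct choice.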

Upvotes: 2

timgeb

Reputation: 78690

Use the loc accessor with your mask. The mask itself evaluates to:

>>> (dim["types"] == "quantitative") & (dim["count"] > 1)
0    False
1    False
2     True
3     True
dtype: bool

Like this:

>>> dim.loc[(dim["types"] == "quantitative") & (dim["count"] > 1), 'var']
2    var3
3    var4
Name: var, dtype: object
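
The result above is a Series that keeps the original row labels (2 and 3). Since the question asks for a list, here is a small sketch of how to get from one to the other, assuming the same frame as in the question (the names result, labels and values are mine):

```python
import pandas as pd

# Rebuild the example frame shown in the question.
dim = pd.DataFrame({
    "var": ["var1", "var2", "var3", "var4"],
    "types": ["nominal", "ordinal", "quantitative", "quantitative"],
    "count": [1, 1, 2, 2],
})

result = dim.loc[(dim["types"] == "quantitative") & (dim["count"] > 1), "var"]

# The selected Series still carries the row labels of the matching rows.
labels = result.index.tolist()

# tolist() discards the index and returns only the values.
values = result.tolist()
print(labels, values)
```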

Upvotes: 1
