GhostRider

Reputation: 2170

Pandas: Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

I am trying to create a new column in a pandas DataFrame using a function that takes two columns as arguments:

def ipf_cat(var, con):
    if var == "Idiopathic pulmonary fibrosis":
        if con in range(95, 100):
            result = 4
        if con in range(70, 95):
            result = 3
        if con in range(50, 70):
            result = 2
        if con in range(0, 50):
            result = 1
    return result

And then

   df['ipf_category'] = ipf_cat(df['dx1'], df['dxcon1'])

Here df['dx1'] is a column of strings and df['dxcon1'] is a column of integers from 0 to 100. The function works fine in plain Python, but I keep getting this error:

 ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I have seen previous answers such as

Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

but I can't work out how to apply those solutions to my particular function.

Upvotes: 0

Views: 2181

Answers (1)

MaxU - stand with Ukraine

Reputation: 210812

I'd use pd.cut():

Source DF:

In [157]: df
Out[157]:
  con                            var
0  53                            ???
1  97  Idiopathic pulmonary fibrosis
2  75                            ???
3  11  Idiopathic pulmonary fibrosis
4  70                            ???
5  52  Idiopathic pulmonary fibrosis
6  74                            ???
7  25  Idiopathic pulmonary fibrosis
8  92                            ???
9  80  Idiopathic pulmonary fibrosis

Solution:

In [158]: df['ipf_category'] = -999
     ...:
     ...: bins = [-1, 50, 70, 95, 101]
     ...: labels = [1,2,3,4]
     ...:
     ...: df.loc[df['var']=='Idiopathic pulmonary fibrosis', 'ipf_category'] = \
     ...:     pd.cut(df['con'], bins=bins, labels=labels)
     ...:

In [159]: df
Out[159]:
  con                            var  ipf_category
0  53                            ???          -999
1  97  Idiopathic pulmonary fibrosis             4
2  75                            ???          -999
3  11  Idiopathic pulmonary fibrosis             1
4  70                            ???          -999
5  52  Idiopathic pulmonary fibrosis             2
6  74                            ???          -999
7  25  Idiopathic pulmonary fibrosis             1
8  92                            ???          -999
9  80  Idiopathic pulmonary fibrosis             3
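
One thing to watch: pd.cut() closes each bin on the right by default, so with bins = [-1, 50, 70, 95, 101] a value of exactly 50 lands in category 1, whereas range(0,50) in the question puts 50 in category 2. If the question's half-open ranges are what you want, passing right=False is one way to get them. A minimal sketch, assuming that 100 should count as category 4; the sample values are made up to hit the bin edges:

```python
import pandas as pd

# Illustrative frame only; values chosen to sit on the bin boundaries.
df = pd.DataFrame({
    'con': [49, 50, 69, 70, 94, 95, 100, 60],
    'var': ['Idiopathic pulmonary fibrosis'] * 7 + ['???'],
})

# right=False makes every bin include its left edge and exclude its right edge:
# [0, 50), [50, 70), [70, 95), [95, 101)
bins = [0, 50, 70, 95, 101]
labels = [1, 2, 3, 4]

mask = df['var'] == 'Idiopathic pulmonary fibrosis'

# Start from the placeholder, then overwrite only the matching rows.
df['ipf_category'] = -999
cats = pd.cut(df.loc[mask, 'con'], bins=bins, labels=labels, right=False)
df.loc[mask, 'ipf_category'] = cats.astype(int)

print(df)
```

Converting the categorical result with astype(int) keeps ipf_category an integer column; it only works here because every masked value falls inside one of the bins.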

Setup:

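import numpy as np
import pandas as pd

# Random sample data: 10 integers in 0-99 and a random diagnosis label per row
df = pd.DataFrame({
  'con':np.random.randint(100, size=10),
  'var':np.random.choice(['Idiopathic pulmonary fibrosis','???'], 10)
})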
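
As for why the original call fails: ipf_cat(df['dx1'], df['dxcon1']) passes whole columns, so var == "Idiopathic pulmonary fibrosis" evaluates to a boolean Series rather than a single True/False, and `if` cannot reduce that to one answer. If you would rather keep a plain Python function, one option is to call it once per row with DataFrame.apply. This is only a sketch: the -999 default for non-matching rows and the upper bound of 101 (so that 100 is included) are my assumptions, and the sample data is made up.

```python
import pandas as pd

def ipf_cat(var, con):
    """Scalar version: var is a single string, con a single integer."""
    result = -999  # assumption: placeholder for rows that do not match
    if var == "Idiopathic pulmonary fibrosis":
        if con in range(95, 101):    # assumption: 101 so that 100 is included
            result = 4
        elif con in range(70, 95):
            result = 3
        elif con in range(50, 70):
            result = 2
        elif con in range(0, 50):
            result = 1
    return result

# Made-up sample data using the question's column names.
df = pd.DataFrame({
    'dx1': ['Idiopathic pulmonary fibrosis', '???',
            'Idiopathic pulmonary fibrosis',
            'Idiopathic pulmonary fibrosis',
            'Idiopathic pulmonary fibrosis'],
    'dxcon1': [97, 53, 11, 52, 80],
})

# Comparing a whole column yields one boolean per row, not a single bool,
# which is what triggers the ValueError inside an `if`.
flags = df['dx1'] == "Idiopathic pulmonary fibrosis"
try:
    if flags:
        pass
except ValueError as err:
    message = str(err)
    print(message)

# axis=1 hands the function one row at a time, so it sees scalars.
df['ipf_category'] = df.apply(
    lambda row: ipf_cat(row['dx1'], row['dxcon1']), axis=1
)
print(df)
```

Row-wise apply is easy to read but runs a Python call per row, so on large frames the vectorized pd.cut() approach will usually be faster.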

Upvotes: 1
