gibbz00

Reputation: 1987

Comparing non-identical pandas dataframe with a series object

I have the following pandas.core.series.Series:

Color
Red      4
Green    7

I also have the following MultiIndex DataFrame. My goal is to create the Target column by checking whether each row's Value is less than the value in the Series for that row's Color, returning 1 if it is and 0 otherwise. For example, in the first row Value is 12, which is greater than the Series value of 4 for Red, so Target is 0.

              Value  Target
Color Animal               
Red   Tiger      12       0
      Tiger       3       1
Green Lion        6       1
      Lion       35       0

My attempt below raises ValueError: Can only compare identically-labeled Series objects.

import pandas as pd
import numpy as np
x = pd.Series([4,7], index=['Red','Green'])
x.index.name = 'Color'

dt = pd.DataFrame({'Color': ['Red','Red','Green','Green'], 'Animal': ['Tiger','Tiger','Lion','Lion'],  'Value': [12,3,6,35]})
dt.set_index(['Color','Animal'], inplace=True)
dt['Target'] = np.where(dt['Value'] < x, 1, 0)

Upvotes: 1

Views: 429

Answers (1)

cs95

Reputation: 402333

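Use the lt method instead of the < operator, and specify the axis. Unlike <, which requires both Series to have identical labels, lt aligns the two Series on their indexes before comparing.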

dt['Target'] = dt['Value'].lt(x, axis=0).astype(int)
print(dt)
              Value  Target
Color Animal               
Red   Tiger      12       0
      Tiger       3       1
Green Lion        6       1
      Lion       35       0

lt stands for "less than".
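
If you would rather not depend on index alignment, a minimal sketch of an alternative is to look up each row's threshold from its Color level explicitly and compare the raw values. The name thresholds is only illustrative, and this assumes every Color in dt also appears in x:

```python
import pandas as pd

# Same data as in the question.
x = pd.Series([4, 7], index=pd.Index(['Red', 'Green'], name='Color'))

dt = pd.DataFrame({
    'Color': ['Red', 'Red', 'Green', 'Green'],
    'Animal': ['Tiger', 'Tiger', 'Lion', 'Lion'],
    'Value': [12, 3, 6, 35],
}).set_index(['Color', 'Animal'])

# Map each row's 'Color' label to its value in x (one threshold per row).
thresholds = dt.index.get_level_values('Color').map(x)

# Compare positionally on plain arrays, so no label alignment is involved.
dt['Target'] = (dt['Value'].to_numpy() < thresholds.to_numpy()).astype(int)
print(dt)
```

Because the comparison runs on NumPy arrays of equal length, it cannot raise the identically-labeled error. A Color missing from x would map to NaN, and the comparison would then yield 0 for that row.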

Upvotes: 5
