set column = some operation of other column values in a pandas dataframe structure

Question

I'm trying to perform a simple operation of a pandas dataframe using the logic below. The values in the columns of interest are decimals (up to 1 decimal point). The value of the operation cannot be negative so in case it is I want 0 instead. I've tried 2 methods to achieve this but both methods result in the same error.

method ONE:

def compute_size(frame):
    for x in list(reversed(range(14, len(frame.columns),2))):
        tmp_value = frame.iloc[:,x] - frame.iloc[:,x-2]
        if tmp_value < 0:
            frame.iloc[:,x] = 0
        else:
            frame.iloc[:,x] = tmp_value

method TWO:

def compute_size(frame):
    for x in list(reversed(range(14, len(frame.columns),2))):
        frame.iloc[:,x] = max(0,frame.iloc[:,x] - frame.iloc[:,x-2])

When I call the function above I get the following error:

C:\Python27\lib\site-packages\pandas\core\generic.pyc in __nonzero__(self)
    690         raise ValueError("The truth value of a {0} is ambiguous. "
    691                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
--> 692                          .format(self.__class__.__name__))
    693 
    694     __bool__ = __nonzero__

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Update 1:

Here is some sample data:

df = pd.DataFrame({
'BlahBlah0' : ['','','',''],
'BlahBlah1' : ['','','',''],
'BlahBlah2' : ['','','',''],
'BlahBlah3' : ['','','',''],
'BlahBlah4' : ['','','',''],
'BlahBlah5' : ['A','C','E','G'],
'BlahBlah6' : ['B','D','F','H'],
'BlahBlah7' : ['','','',''],
'BlahBlah8' : ['','','',''],
'BlahBlah9' : ['','','',''],
'BlahBlah10' : ['','','',''],
'BlahBlah11' : ['','','',''],
'Size1':[1,1,1,1], 
'Price1':[50,50,50,50],
'Size2':[3,3,3,3], 
'Price2':[75,75,75,75], 
'Size3':[7,7,7,7], 
'Price3':[100,100,100,100], 
'Size4':[15,15,15,15], 
'Price4':[125,125,125,125],
'Size5':[25,25,25,25], 
'Price5':[200,200,200,200],
'Size6':[30,30,30,30], 
'Price6':[250,250,250,250],
'Size7':[40,40,40,40], 
'Price7':[300,300,300,300]
},columns=['BlahBlah0', 
'BlahBlah1',
'BlahBlah2',
'BlahBlah3',
'BlahBlah4',
'BlahBlah5',
'BlahBlah6',
'BlahBlah7',
'BlahBlah8',
'BlahBlah9',
'BlahBlah10',
'BlahBlah11',
'Size1',
'Price1',
'Size2',
'Price2',
'Size3',
'Price3',
'Size4',
'Price4',
'Size5',
'Price5',
'Size6',
'Price6',
'Size7',
'Price7'] )

Now, once you insert the above dataframe into python the ordering of the columns gets out of wack. For some reason pandas groups the Price columns together and the Size columns together. This is not the intention. The dataframe should look exactly as I show it. I'm not sure how you can manipulate it back to the way its shown above.

Assuming you are able to produce the exact dataframe as depicted above, now I want to perform the following operation:

Size1 = Size1
Size2 = Max(0,Size2 - Size1)
Size3 = Max(0,Size3 - Size2)
Size4 = Max(0,Size4 - Size3)
Size5 = Max(0,Size5 - Size4)
Size6 = Max(0,Size6 - Size5)
Size7 = Max(0,Size7 - Size6)

So the logic doesn't always subtract column x-2 from column x but only performs the operation for every other column from column index 14 to the last column.

Update 2:

I fixed the part regarding the ordering of the dataframe (see above).

The desired output based on the logic described above is the following dataframe:

df = pd.DataFrame({
'BlahBlah0' : ['','','',''],
'BlahBlah1' : ['','','',''],
'BlahBlah2' : ['','','',''],
'BlahBlah3' : ['','','',''],
'BlahBlah4' : ['','','',''],
'BlahBlah5' : ['A','C','E','G'],
'BlahBlah6' : ['B','D','F','H'],
'BlahBlah7' : ['','','',''],
'BlahBlah8' : ['','','',''],
'BlahBlah9' : ['','','',''],
'BlahBlah10' : ['','','',''],
'BlahBlah11' : ['','','',''],
'Size1':[1,1,1,1], 
'Price1':[50,50,50,50],
'Size2':[2,2,2,2], 
'Price2':[75,75,75,75], 
'Size3':[4,4,4,4], 
'Price3':[100,100,100,100], 
'Size4':[8,8,8,8], 
'Price4':[125,125,125,125],
'Size5':[10,10,10,10], 
'Price5':[200,200,200,200],
'Size6':[5,5,5,5], 
'Price6':[250,250,250,250],
'Size7':[10,10,10,10], 
'Price7':[300,300,300,300]
},columns=['BlahBlah0', 
'BlahBlah1',
'BlahBlah2',
'BlahBlah3',
'BlahBlah4',
'BlahBlah5',
'BlahBlah6',
'BlahBlah7',
'BlahBlah8',
'BlahBlah9',
'BlahBlah10',
'BlahBlah11',
'Size1',
'Price1',
'Size2',
'Price2',
'Size3',
'Price3',
'Size4',
'Price4',
'Size5',
'Price5',
'Size6',
'Price6',
'Size7',
'Price7'] )

I'm calculating the delta between the current size and the previous size, which essentially captures the net additional size at the new price.

Bob Haffner · Accepted Answer

Give this a try. It does an apply() along axis 1 with a list comprehension to handle the subtraction.

cols_to_update = ['Size2','Size3','Size4','Size5','Size6','Size7']
cols_to_subtract = ['Size1','Size2','Size3','Size4','Size5','Size6','Size7']

df[cols_to_update] = df[cols_to_subtract].apply(
          lambda x : pd.Series([max(x[i] - x[i-1],0) for i in range(1,len(x))]),axis=1)

set column = some operation of other column values in a pandas dataframe structure

Answers (1)

Related Questions