gmarais

Reputation: 1891

Pandas subtract columns not working with .iloc

I have a dataframe

import pandas as pd

test = pd.DataFrame({'col1':[1,2,3], 'col2':['a','b','c']})

test
Out[79]: 
   col1 col2
0     1    a
1     2    b
2     3    c

I am trying to compute the "first difference" of col1 explicitly using iloc but the results are nonsense:

test.iloc[1:,0] - test.iloc[:-1,0]
Out[80]: 
0    NaN
1    0.0
2    NaN
Name: col1, dtype: float64

I know I can use pandas.DataFrame.diff, but I need to understand the mechanics of iloc that cause this to fail.

Upvotes: 2

Views: 963

Answers (1)

jezrael

Reputation: 863341

The problem is that the two slices have different index labels, and pandas aligns on labels (not positions) before subtracting:

print (test.iloc[1:,0])
1    2
2    3
Name: col1, dtype: int64

print (test.iloc[:-1,0])

0    1
1    2
Name: col1, dtype: int64
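
To see where the NaN values come from, here is a minimal sketch of the alignment step, using the same test frame as the question. The names left, right, left_aligned and right_aligned are only illustrative. Series.align with its default outer join produces the reindexed operands that the subtraction effectively works on: only label 1 exists in both slices, so it is the only row with a number.

```python
import pandas as pd

test = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'b', 'c']})

left = test.iloc[1:, 0]    # index labels 1, 2
right = test.iloc[:-1, 0]  # index labels 0, 1

# Outer-join alignment: both operands are reindexed to the union
# of the labels (0, 1, 2); labels missing on one side become NaN.
left_aligned, right_aligned = left.align(right)
print(left_aligned)
print(right_aligned)

# The subtraction pairs values by label, so only label 1 (2 - 2)
# has a value on both sides.
result = left - right
print(result)
```

So iloc itself selects by position as expected; the mismatch only appears when the two results are combined, because the slices keep their original labels.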

One possible solution is to give both slices the same index values:

a = test.iloc[1:,0].reset_index(drop=True) - test.iloc[:-1,0]
print (a)
0    1
1    1
Name: col1, dtype: int64

Or, if both slices always have the same length, convert one of them to a NumPy array so that no alignment takes place:

a = test.iloc[1:,0] - test.iloc[:-1,0].values
print (a)

1    1
2    1
Name: col1, dtype: int64
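
As a cross-check, the positional result can be compared with shift and diff, which the question mentions as the built-in route. This is only a sketch; the variable names by_position, by_shift and by_diff are made up for illustration, and to_numpy() is used in place of .values.

```python
import pandas as pd

test = pd.DataFrame({'col1': [1, 2, 3], 'col2': ['a', 'b', 'c']})
col = test['col1']

# Positional subtraction: the right-hand side is a plain NumPy array,
# so there are no labels to align on.
by_position = col.iloc[1:] - col.iloc[:-1].to_numpy()

# shift(1) moves the values down one row but keeps the index,
# so label alignment pairs each row with its predecessor.
by_shift = col - col.shift(1)

# diff() is the built-in first difference.
by_diff = col.diff()

print(by_position)
print(by_shift)
print(by_diff)
```

The shift and diff versions keep all three labels, with NaN in the first row, while the positional version keeps only labels 1 and 2.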

Upvotes: 1
