Macter

Reputation: 132

Subtract each row from first row in dataframe

I have a dataframe (called df) with three variables, the head of which is shown below. There are 600 rows of data.

         X1        X2        X3
0  0.049150  0.270032  0.577858
1  0.602387  0.065492  0.555747
2  0.598355  0.235002  0.482744
3  0.522151  0.253991  0.402630
4  0.402601  0.206630  0.553987

I am trying to subtract each row from the first. That is, I'm looking for row1 - row2, then row1 - row3, and so on. I am new to using for loops (and Python in general), and my current attempts aren't getting very far:

for i in range(len(df)):
    diff[i] = df.iloc[0,:] - df.iloc[i,:]
    diff2 = math.sqrt((diff[0])**2 + (diff[1])**2 + (diff[2])**2)
    print(diff2)

For context on the final three lines, I am trying to take the square root of the sum of the squared differences between corresponding row items (the Euclidean distance between the two rows). So,

sqrt((row1col1-row2col1)^2 + (row1col2 - row2col2)^2 + (row1col3 - row2col3)^2)

and then I want to store the results of this for all the row differences up to row 600 in a new vector.

If you would like further context, I am trying to implement the second step of a "Subtractive Clustering" algorithm, the formula for which is as follows:

formula

where ra=1

Upvotes: 1

Views: 1310

Answers (1)

Anton vBR

Reputation: 18916

The first row can be accessed with iloc like this:

row1 = df.iloc[0]

Then we can use apply on the rows from index 1 to the end (this assumes numpy has been imported as np):

df.iloc[1:].apply(lambda x: np.sqrt(sum((row1-x)**2)), axis=1).values

Returns:

array([ 0.59025138,  0.55848   ,  0.5046703 ,  0.35988505])
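
With 600 rows, apply calls the lambda once per row. A vectorized alternative is sketched below; it is not part of the original answer. It assumes every column is numeric and rebuilds only the five rows shown in the question as sample data. The names values, diffs and distances are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data: only the five rows shown in the question (the real df has 600).
df = pd.DataFrame({
    "X1": [0.049150, 0.602387, 0.598355, 0.522151, 0.402601],
    "X2": [0.270032, 0.065492, 0.235002, 0.253991, 0.206630],
    "X3": [0.577858, 0.555747, 0.482744, 0.402630, 0.553987],
})

# Work on the underlying NumPy array (assumes all columns are numeric).
values = df.to_numpy()

# Broadcasting subtracts each remaining row from the first row in one step.
diffs = values[0] - values[1:]

# Euclidean norm of each difference row: sqrt of the sum of squared differences.
distances = np.linalg.norm(diffs, axis=1)
print(distances)
```

On this sample the result should agree with the apply version above. To get one entry per row of df (600 values, the first being 0), subtract the whole array instead: values[0] - values.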

Upvotes: 2
