vishwajeet Mane

Reputation: 344

How to subtract two columns of a PySpark DataFrame and then divide?

I have a DataFrame like this:

dd1:

    A    B   
   2112  2637
   1293  2251
   1779  2435
   935   2473

I want to subtract column B from column A and divide the difference by column A, like this:

    A    B       Result 
   2112  2637    -0.24
   1293  2251    -0.74
   1779  2435    -0.36
   935   2473    -1.64

For example, (2112 - 2637) / 2112 = -0.24.

If this is not possible in a single step, we could first perform the subtraction and store it in a new column, then divide that column by A and store the result in another column.

Upvotes: 4

Views: 38466

Answers (1)

oreopot

Reputation: 3450

The general idea, written in pandas-style syntax, is the following:

    dd1['Result'] = (dd1['A'] - dd1['B']) / dd1['A']

In PySpark, columns are added with withColumn, so it would look something like:

    dd1 = dd1.withColumn('Result', (dd1['A'] - dd1['B']) / dd1['A'])

Upvotes: 12
