Tommy_SK

Reputation: 87

Replacing null values in a column of a PySpark DataFrame

I need to replace the null values in one column of a Spark DataFrame. Below is the code I tried:

df=df.na.fill(0,Seq('c_amount')).show()

But it throws an error: NameError: name 'Seq' is not defined

Below is my table

   +------------+--------+
   |c_account_id|c_amount|
   +------------+--------+
   |           1|    null|
   |           2|     123|
   |           3|    null|
   +------------+--------+

Expected output

   +------------+--------+
   |c_account_id|c_amount|
   +------------+--------+
   |           1|       0|
   |           2|     123|
   |           3|       0|
   +------------+--------+

Upvotes: 1

Views: 2450

Answers (1)

dsk

Reputation: 2003

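Seq is Scala syntax, so it is not defined in Python. In PySpark, pass the column names as a Python list through the subset argument. The fill value also has to match the column's type: a string value is only applied to string columns, so use 0 for a numeric column such as c_amount. Also, do not assign the result of .show() back to df, because show() returns None.

df = df.fillna(0, subset=['c_amount'])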

Upvotes: 1
