RSG

Reputation: 77

How to generate unique sequence numbers to replace null values in a column of a table in Spark Scala

I am having difficulty generating unique sequence numbers to replace the null values in a column of a table. The table is obtained by joining two other tables, and the column is the primary-key column whose null values should be replaced with unique sequence values. I tried using accumulators, but I ran into problems when running the program on a multi-node cluster.

val joined = csv2.join(csv, csv2("ACCT_PRDCT_CD") === csv("ACCT_PRDCT_CD"), "left_outer")

joined.filter("ACCT_CO_NO is null").show

val k = joined.withColumn("Acc_flag", when($"ACCT_CO_NO".isNull, 0).otherwise($"ACCT_CO_NO"))

var a = 1

def generate(s: Int): Int = {
  if (s == 0) {
    a = a + 1
    a
  } else {
    s
  }
}

val generateNum = udf(generate(_: Int))

val newjoined = k.withColumn("n", generateNum($"Acc_flag"))

Upvotes: 0

Views: 924

Answers (1)

Leo C

Reputation: 22439

If I understand your requirement correctly, consider using monotonically_increasing_id or RDD's zipWithIndex to generate the sequence numbers. To avoid collisions with existing values, the generated numbers are then added to a number greater than the column's current maximum before being used to replace the nulls.

import org.apache.spark.sql.functions._

// Left side of the join: keys 1-6
val dfL = Seq(
  (1, "a"),
  (2, "b"),
  (3, "c"),
  (4, "d"),
  (5, "e"),
  (6, "f")
).toDF("c1", "c2")

// Right side of the join: values only for keys 1-3
val dfR = Seq(
  (1, 100L),
  (2, 200L),
  (3, 300L)
).toDF("c1", "c2")

// Current maximum of the column to be filled
val c2max = dfR.select(max($"c2")).first.getLong(0)
// c2max: Long = 300

// The left join leaves nulls in c2 for keys 4-6
val dfJoined = dfL.join(dfR, Seq("c1"), "left").
  select(dfL("c1"), dfR("c2"))

METHOD 1: using monotonically_increasing_id

dfJoined.withColumn( "c2x", when(col("c2").isNotNull, col("c2")).
    otherwise(monotonically_increasing_id + c2max + 1)
  ).
  show
// +---+----+-----------+
// | c1|  c2|        c2x|
// +---+----+-----------+
// |  1| 100|        100|
// |  2| 200|        200|
// |  3| 300|        300|
// |  4|null|25769804077|
// |  5|null|34359738669|
// |  6|null|42949673261|
// +---+----+-----------+

Note that the generated sequence numbers aren't necessarily consecutive.
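
That's because monotonically_increasing_id packs the partition index into the upper 31 bits of the 64-bit ID and the record's position within its partition into the lower 33 bits, so the values jump between partitions. As a rough illustration (a sketch, assuming that encoding), the first generated value above decomposes as:

// Sketch: decompose the first generated ID under the partition-index / record-offset encoding
val generatedId = 25769804077L - (c2max + 1)            // strip the collision-avoidance offset
val partitionIndex = generatedId >> 33                  // = 3 (upper 31 bits)
val recordInPartition = generatedId & ((1L << 33) - 1)  // = 0 (lower 33 bits)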

METHOD 2: using RDD's zipWithIndex

import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

val rdd = dfJoined.rdd.zipWithIndex.
  map{ case (row: Row, idx: Long) => Row.fromSeq(row.toSeq :+ idx) }

spark.createDataFrame(rdd,
    StructType(dfJoined.schema.fields :+ StructField("idx", LongType))
  ).
  select( $"c1", $"c2",
    when(col("c2").isNotNull, col("c2")).otherwise($"idx" + c2max + 1).
      as("c2x")
  ).
  show
// +---+----+---+
// | c1|  c2|c2x|
// +---+----+---+
// |  1| 100|100|
// |  2| 200|200|
// |  3| 300|300|
// |  4|null|304|
// |  5|null|305|
// |  6|null|306|
// +---+----+---+
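
Applied back to your join, either method fills ACCT_CO_NO directly. A minimal sketch of method 1 on your DataFrames (assuming ACCT_CO_NO is a long-typed column coming from csv):

// Sketch only: replace nulls in ACCT_CO_NO with unique values above the current maximum
val coMax = csv.select(max($"ACCT_CO_NO")).first.getLong(0)
val filled = joined.withColumn("ACCT_CO_NO",
  when($"ACCT_CO_NO".isNotNull, $"ACCT_CO_NO").
    otherwise(monotonically_increasing_id + coMax + 1))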

Upvotes: 1
