Shekhar

Reputation: 315

Setting null as value in numeric data types in Hive table using spark

I am looking for a way to populate null into numeric fields that have no value in them, using Spark. The code I have written is below:

val myDF = sqlContext.sql("from mystg.my_acct_table select id, amt, tot_amt where id = 12345")
myDF.map { row =>
  val row1 = row.getAs[Double]("tot_amt")
  val my_tot_amt =
    if (row1.isNaN())
      null.asInstanceOf[Double]
    else
      Math.abs(row1.toDouble)
  Row(row(0), row(1), my_tot_amt)
}

I also tried inlining the same logic directly in the Row(...) call:

Row(row(0), row(1),
    if (row1.isNaN())
      null.asInstanceOf[Double]
    else
      Math.abs(row1.toDouble))

But I am getting the output as | 8687992537618075 | 1163.35 | 0.0 |

The expected output is | 8687992537618075 | 1163.35 | null |

Upvotes: 0

Views: 479

Answers (1)

Alexey Romanov

Reputation: 170723

null.asInstanceOf[Double] is 0.0. Just don't cast it:

val my_tot_amt = if(row1.isNaN())
  null
else
  Math.abs(row1.toDouble)

should work (it makes the type of my_tot_amt Any; declaring val my_tot_amt: java.lang.Double = ... is more precise, but shouldn't be necessary in this specific case).
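The unboxing behaviour behind this answer can be checked in plain Scala, without Spark. A minimal sketch (the object name NullCastDemo is mine, not from the question):

```scala
object NullCastDemo {
  def main(args: Array[String]): Unit = {
    // Casting null to a primitive Double unboxes it to 0.0, not null:
    println(null.asInstanceOf[Double])       // 0.0

    val row1 = Double.NaN

    // Leaving the null uncast preserves it, but the val is typed Any:
    val my_tot_amt = if (row1.isNaN) null else Math.abs(row1)
    println(my_tot_amt)                      // null

    // A boxed java.lang.Double can hold null and keeps a numeric type:
    val boxed: java.lang.Double = if (row1.isNaN) null else Math.abs(row1)
    println(boxed)                           // null
  }
}
```

This is exactly why the original code printed 0.0: the null was cast back to the primitive Double before it ever reached the Row.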

Also, I think you'll get 0.0 where row("tot_amt") is already null; is this intentional?
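One way to handle both cases at once is a small helper that maps SQL NULL and NaN to null and everything else to its absolute value. This is a sketch of mine, not code from the question; cleanAmt is a hypothetical name:

```scala
object TotAmtClean {
  // Normalize a possibly-null, possibly-NaN amount: both SQL NULL and NaN
  // become null; any other value becomes its absolute value.
  def cleanAmt(raw: java.lang.Double): java.lang.Double =
    if (raw == null || raw.isNaN) null
    else Math.abs(raw)

  def main(args: Array[String]): Unit = {
    println(cleanAmt(null))          // null
    println(cleanAmt(Double.NaN))    // null
    println(cleanAmt(-1163.35))      // 1163.35
  }
}
```

Inside the map, checking row.isNullAt(index) before calling row.getDouble avoids the null-unboxing problem for columns that are already NULL in the Hive table.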

Upvotes: 1
