Reputation: 595
I am using HiveQL in Spark and would like to fill the null values in each column with the mean of that column.
I am building the query with the code below:
StringBuilder query = new StringBuilder("select `ts0` as ts ");
String[] cols = dataFrame.columns();
for (String col : cols) {
    // trimmedCol: cleaned-up alias for col (its definition is not shown here)
    query.append(",`").append(col).append("` as ").append(trimmedCol);
}
I think I should use a "case" expression to handle the rows where a value is null. Can anyone show me how to do that?
Upvotes: 0
Views: 94
Reputation: 452
You could try the following:
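// Load the CSV with a header row and inferred column types (spark-csv package)
scala> val df = sqlContext.read.format("com.databricks.spark.csv").option("header","true").option("inferSchema","true").load("na_test.csv")
scala> df.show()
// Fill nulls in "age" with the constant 10.0; this returns a new DataFrame and leaves df unchanged
scala> df.na.fill(10.0,Seq("age"))
scala> df.na.fill(10.0,Seq("age")).show
// Replace specific values in "age": 35 becomes 61, 24 becomes 12
scala> df.na.replace("age", Map(35 -> 61, 24 -> 12)).show()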
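Note that fill(10.0, ...) above uses a fixed value, not the mean. If you would rather stay with the HiveQL string you are building in Java, one option is to wrap each column in coalesce(col, avg(col) over ()), which is equivalent to "case when col is null then avg(col) over () else col end". The following is only a rough sketch: MeanFillQuery and build are hypothetical names, the table name is whatever you registered your DataFrame as, and it assumes every column passed in is numeric and that your Spark version supports window functions in SQL.

```java
import java.util.StringJoiner;

// Hypothetical helper for illustration; not part of Spark or Hive.
final class MeanFillQuery {

    // Builds a SELECT that replaces NULLs in each column with that column's mean.
    // Assumption: every name in cols is a numeric column of the given table.
    static String build(String table, String[] cols) {
        if (cols.length == 0) {
            throw new IllegalArgumentException("at least one column is required");
        }
        StringJoiner select = new StringJoiner(", ", "select ", " from `" + table + "`");
        for (String col : cols) {
            String quoted = "`" + col + "`";
            // avg() ignores NULLs, and the empty over () makes the window the whole
            // table, so every row sees the same column mean.
            select.add("coalesce(" + quoted + ", avg(" + quoted + ") over ()) as " + quoted);
        }
        return select.toString();
    }
}
```

You would then pass the result of MeanFillQuery.build(yourTableName, dataFrame.columns()) to sqlContext.sql(...). An empty over () typically pulls all rows into a single partition, so for large tables it may be cheaper to compute the means once in a separate query and fill with those values instead.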
Upvotes: 1