How to update the value of a column in a Spark Dataset using Java?

Question

I have loaded a dataset using:

Dataset<Row> rows = sparkSession.read()
    .format("com.databricks.spark.csv")
    .option("header", "true")
    .load(tablenameAndLocationMap.get(tablename));

The data loads correctly, but I want to update the value of a column at runtime. I tried looping over the rows as shown below, but it didn't work.

Column data = rows.col("UPLOADED_ON");
Dataset<Row> d = rows.select(data);

d.foreach(obj -> {
    String date = obj.getAs(0);
    DateFormat inputFormatter = new SimpleDateFormat("yyyy-MM-dd");
    Date da = inputFormatter.parse(date);

    DateFormat outputFormatter = new SimpleDateFormat("dd-MM-yy");
    date = outputFormatter.format(da);
});

Here I want to replace the existing value of the UPLOADED_ON column with the new value held in date.

How can this be done? Any help would be appreciated.

Thanks


Answers (1)
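
A Spark Dataset is immutable, so reassigning the local variable date inside foreach changes nothing in rows; the new value has to be produced as a new column. As a rough sketch of the per-value conversion the loop is attempting (yyyy-MM-dd to dd-MM-yy), here is a plain-Java helper. The class name UploadedOnFormat and the method reformat are made up for illustration, and it assumes every UPLOADED_ON value is either null or a valid yyyy-MM-dd string.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: not part of Spark or of the question's code.
final class UploadedOnFormat {
    // Pattern of the strings currently stored in UPLOADED_ON (from the question).
    private static final DateTimeFormatter INPUT = DateTimeFormatter.ofPattern("yyyy-MM-dd");
    // Pattern the question wants to end up with.
    private static final DateTimeFormatter OUTPUT = DateTimeFormatter.ofPattern("dd-MM-yy");

    // Converts one column value; nulls pass through unchanged.
    static String reformat(String value) {
        if (value == null) {
            return null;
        }
        return LocalDate.parse(value, INPUT).format(OUTPUT);
    }

    public static void main(String[] args) {
        System.out.println(reformat("2019-03-07")); // prints 07-03-19
    }
}
```

In Spark itself the same conversion can probably be expressed without a loop, using the built-in column functions from org.apache.spark.sql.functions and replacing the column with withColumn. Assuming a Spark version that provides to_date and date_format (1.5 or later), something along these lines should work:

    rows = rows.withColumn("UPLOADED_ON", date_format(to_date(col("UPLOADED_ON")), "dd-MM-yy"));

Passing an existing column name to withColumn replaces that column in the returned Dataset, which is why the result is assigned back to rows.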
