Reputation: 177
Some column names in my DataFrame are in mixed case, for example sum(TXN_VOL).
I want to convert them to upper case, for example SUM(TXN_VOL).
I won't know all the column names in advance, so I can't hard-code the conversion.
Either I have to iterate through all the column names and convert each one to upper case, or there is some built-in functionality that changes all column names to upper case.
What I tried:
String[] columnNames = finalBcDF.columns();
Dataset<Row> x = null;
for(String columnName : columnNames) {
x = finalBcDF.withColumnRenamed(columnName, columnName.toUpperCase());
}
But withColumnRenamed returns a new DataFrame each time, and each iteration starts again from finalBcDF, so x ends up with only the last column renamed. This doesn't give the desired result.
I have checked many sites but could not find how to do this in Java.
Can anyone help?
EDIT
One of the answers to
How to lower the case of column names of a data frame but not its values?
covers Scala and PySpark, but I have not been able to convert it to Java. Can anyone help?
Upvotes: 1
Views: 4349
Reputation: 15297
Here is how you can convert the column names to upper case using Java 8:
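import static org.apache.spark.sql.functions.col;
import java.util.Arrays;
import org.apache.spark.sql.Column;
// Alias every column to its upper-case name, then select them all in one projection.
Column[] upperCased = Arrays.stream(df.columns()).map(x -> col(x).as(x.toUpperCase())).toArray(Column[]::new);
df.select(upperCased).show(false);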
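A related variation, offered only as a sketch: because only the names change, the upper-cased names can be computed in plain Java and handed to Dataset.toDF(String...), which returns a Dataset with the columns renamed positionally. The class name UpperCaseColumns and the helper upperCaseNames are hypothetical names chosen for illustration, and passing Locale.ROOT is an added assumption so that the result does not depend on the JVM's default locale. The Spark call is shown only in a comment so that the block runs without Spark on the classpath.

```java
import java.util.Arrays;
import java.util.Locale;

class UpperCaseColumns {

    // Hypothetical helper: returns a new array with every name upper-cased.
    // Locale.ROOT keeps the result independent of the JVM's default locale.
    public static String[] upperCaseNames(String[] names) {
        return Arrays.stream(names)
                .map(name -> name.toUpperCase(Locale.ROOT))
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        // Stand-in for finalBcDF.columns(); these sample names are made up.
        String[] columnNames = {"sum(TXN_VOL)", "avg(amount)", "id"};

        System.out.println(Arrays.toString(upperCaseNames(columnNames)));
        // prints [SUM(TXN_VOL), AVG(AMOUNT), ID]

        // With Spark available, the renamed Dataset would be obtained as:
        // Dataset<Row> renamed = finalBcDF.toDF(upperCaseNames(finalBcDF.columns()));
    }
}
```

toDF expects exactly one name per existing column, which holds here because the names come from columns() itself.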
Upvotes: 2
Reputation: 1446
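Iterating is a perfectly good approach. Each withColumnRenamed call does create a new Dataset instance, but Spark evaluates lazily: a rename only adds a step to the query plan, and no data is processed until an action runs, so there is no meaningful performance penalty. Just make sure each rename is applied to the result of the previous one (initialize x = finalBcDF and call x.withColumnRenamed(...) inside the loop) rather than to finalBcDF every time.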
Reference: https://data-flair.training/blogs/apache-spark-lazy-evaluation/
Upvotes: 0