Reputation: 1459
I have a DataFrame whose column names contain dots (.).
Example, from df.printSchema:
user.id_number
user.name.last
user.phone.mobile
etc. I want to rename the columns by replacing each dot (.) with an underscore (_):
user_id_number
user_name_last
user_phone_mobile
Note: the input data for this DataFrame is in JSON format (non-relational, NoSQL-style).
Upvotes: 2
Views: 991
Reputation: 31520
Use .toDF with .map, .selectExpr, or .withColumnRenamed to replace . with _.
1. Using .toDF with .map:
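// Outside spark-shell, Seq.toDF needs: import spark.implicits._
val df=Seq(("1","2","3")).toDF("user.id_number","user.name.last","user.phone.mobile")
// replace treats "." as a literal character
df.toDF(df.columns.map(x =>x.replace(".","_")):_*).show()
// replaceAll takes a regex, so the dot must be escaped
df.toDF(df.columns.map(x =>x.replaceAll("\\.","_")):_*).show()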
//+--------------+--------------+-----------------+
//|user_id_number|user_name_last|user_phone_mobile|
//+--------------+--------------+-----------------+
//|             1|             2|                3|
//+--------------+--------------+-----------------+
2. Using selectExpr:
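import org.apache.spark.sql.functions.col
// Backticks quote the dotted source name so it is not parsed as a struct field access
val expr=df.columns.map(x =>col(s"`${x}`").alias(s"${x}".replace(".","_")).toString)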
df.selectExpr(expr:_*).show()
//+--------------+--------------+-----------------+
//|user_id_number|user_name_last|user_phone_mobile|
//+--------------+--------------+-----------------+
//|             1|             2|                3|
//+--------------+--------------+-----------------+
3. Using .withColumnRenamed:
df.columns.foldLeft(df){(tmpdf,col) =>tmpdf.withColumnRenamed(col,col.replace(".","_"))}.show()
//+--------------+--------------+-----------------+
//|user_id_number|user_name_last|user_phone_mobile|
//+--------------+--------------+-----------------+
//|             1|             2|                3|
//+--------------+--------------+-----------------+
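One thing the approaches above do not guard against: if the DataFrame already has a column such as user_id next to user.id, both end up with the same name after the replacement. A minimal sketch of a check for that, in plain Scala so it can be tried without a Spark session. ColumnNames and sanitize are hypothetical names made up for this example, not part of the Spark API:

```scala
// Hypothetical helper (not part of Spark): replaces dots in column names
// and fails fast if two names would be identical after the replacement.
object ColumnNames {
  def sanitize(names: Seq[String], replacement: String = "_"): Seq[String] = {
    // String.replace works on literal text, so "." needs no escaping here.
    val renamed = names.map(_.replace(".", replacement))
    // Collect every renamed column that occurs more than once.
    val duplicates = renamed.groupBy(identity).collect {
      case (name, hits) if hits.size > 1 => name
    }
    require(
      duplicates.isEmpty,
      s"Renaming would produce duplicate columns: ${duplicates.mkString(", ")}"
    )
    renamed
  }
}
```

Assuming the df from the answer, it would plug into the first approach as df.toDF(ColumnNames.sanitize(df.columns.toSeq): _*). The require call throws an IllegalArgumentException on a collision instead of silently producing an ambiguous schema.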
Upvotes: 2