Reputation: 73
I am working in Scala and want to remove special characters from my DataFrame. replaceAll
doesn't seem to work. Is there another way?
My code is this:
val specialchar = dataframe.select(column).replaceAll("[^A-za-z]+","")
Upvotes: 0
Views: 1476
Reputation: 1917
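replaceAll is a method on String, so it cannot be called on the result of dataframe.select(column). Use regexp_replace from org.apache.spark.sql.functions instead, and list the allowed characters in the regex.
Try the following: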
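// Assumes a SparkSession named spark is in scope, as in spark-shell.
import org.apache.spark.sql.functions.regexp_replace
import spark.implicits._
val badDF = Seq(("7369", "SMI_)(TH" , "2010-12-17", "800.00"), ("7499", "AL@;__#$LEN","2011-02-20", "1600.00")).toDF("empno", "ename","hire_date", "sal")
val cleanedDF = badDF.select(badDF.columns.map(c => regexp_replace(badDF(c), """[^A-Z a-z 0-9]""", "").alias(c)): _*)
cleanedDF.show()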
The ename column contains special characters. The regex above keeps only uppercase and lowercase letters (A-Z, a-z), digits 0-9, and spaces, because the character class also contains a space. All other characters are removed.
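To check what a character class keeps without starting Spark, here is a minimal sketch in plain Scala that applies String.replaceAll to the two sample names. The object and method names (RegexCheck, keepAllowed, askerPattern) are hypothetical and exist only for this illustration. It also shows that the pattern from the question, [^A-za-z]+, uses the range A-z, which in ASCII covers the characters between Z and a as well (including the underscore), so underscores are not removed.

```scala
object RegexCheck {
  // Hypothetical helper: same idea as the answer's character class
  // (letters, digits and space are kept, everything else is removed).
  def keepAllowed(s: String): String = s.replaceAll("[^A-Za-z0-9 ]", "")

  // Hypothetical helper: the pattern from the question, unchanged.
  // The range A-z spans ASCII 65 to 122, which includes '_' (95).
  def askerPattern(s: String): String = s.replaceAll("[^A-za-z]+", "")

  def main(args: Array[String]): Unit = {
    println(keepAllowed("SMI_)(TH"))     // SMITH
    println(keepAllowed("AL@;__#$LEN"))  // ALLEN
    println(askerPattern("SMI_)(TH"))    // SMI_TH
    println(askerPattern("AL@;__#$LEN")) // AL__LEN
  }
}
```

The same pattern string can be passed to regexp_replace, since Spark uses Java regular expressions there too.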
Upvotes: 1