test acc

Reputation: 591

Scala: convert a string to a special character

I am trying to read an escaped character from a file. It is a long and complicated process involving a lot of cleansing, but that is irrelevant here. The end product is this property of an object:

props.inputSeperator: String type

This is a String. In this specific case, however, its value is the literal text \u0001.

When I print it, the output is \u0001, and the length of props.inputSeperator is 6. How do I convert this string into a string of a single character, namely the special character represented by \u0001? The resulting string should have length 1 and, when printed, should print a single special character (\u0001).

val x: String = "\u0001"
val s = Array("\\", "u", "0", "0", "0", "1").mkString("")
println(x) //prints "?"   this is a SINGLE special character
println(s) //prints "\u0001"

Essentially, I want to take s and turn it into the value of x.

Upvotes: 1

Views: 2617

Answers (4)

Agrim Bansal

Reputation: 11

val delim :Byte = "\u0007".codePointAt(0).toByte

We can use the codePointAt() method to get the character's code point and then call toByte on the result.
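
As a small sketch of how this applies to the question's value, assume the six-character text has already been reduced to a single character (written here as a literal). The object and value names are illustrative only. Note that toByte keeps just the low 8 bits, so it is only lossless for code points up to 127.

```scala
object CodePointDemo {
  // The one-character separator from the question, written as a literal.
  val sep: String = "\u0001"

  // codePointAt(0) returns the code point of the first character as an Int.
  val codePoint: Int = sep.codePointAt(0)

  // toByte narrows the Int to 8 bits; safe here because the value is 1.
  val asByte: Byte = codePoint.toByte

  def main(args: Array[String]): Unit = {
    println(codePoint) // 1
    println(asByte)    // 1
  }
}
```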

Upvotes: 0

stack0114106

Reputation: 8711

You have the Unicode value spelled out as ASCII characters. To get the actual character, drop the "\" and "u" and read the rest of the string as hex values, two digits at a time, using sliding(2,2). Then pass the resulting bytes to "new String", specifying the encoding you need, i.e. "UNICODE".

scala> val ar = Array("\\", "u", "0", "0", "0", "1").mkString("")
ar: String = \u0001

scala> val x = new String( ar.drop(2).sliding(2,2).toArray.map(Integer.parseInt(_, 16).toByte) , "UNICODE")
x: String = ?

scala>  x.length
res53: Int = 1

scala>  x.toArray.map(_.toByte)
res54: Array[Byte] = Array(1)

scala>

Verification:

scala> val x1: String = "\u0001"
x1: String = ?

scala> x==x1
res55: Boolean = true

scala>
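
The same steps can be wrapped in a small function. This is only a sketch: decodeEscape is a hypothetical helper name, it assumes the input is exactly a backslash, "u", and four hex digits, and it swaps in StandardCharsets.UTF_16BE for the "UNICODE" charset name so that the byte order is stated explicitly.

```scala
import java.nio.charset.StandardCharsets

object HexPairsDemo {
  // Hypothetical helper: expects a backslash, 'u', and exactly four hex digits.
  def decodeEscape(escaped: String): String = {
    val bytes = escaped
      .drop(2)      // remove the leading backslash and 'u'
      .grouped(2)   // split "0001" into "00" and "01"
      .map(Integer.parseInt(_, 16).toByte)
      .toArray
    // Two bytes, high byte first, form one UTF-16 code unit.
    new String(bytes, StandardCharsets.UTF_16BE)
  }

  def main(args: Array[String]): Unit = {
    val decoded = decodeEscape("\\u0001")
    println(decoded.length)      // 1
    println(decoded == "\u0001") // true
  }
}
```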

Upvotes: 0

jwvh

Reputation: 51271

Strip the unwanted characters, parse the remaining hex digits, and convert the result to a Char.

Integer.parseInt("\\u0A6E".drop(2), 16).toChar
res0: Char = ੮
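
If the input may contain other text around the escape, or several escapes, the same parse-and-convert step could be applied to each match of a regular expression. The following is a sketch under that assumption; unescapeUnicode and UnicodeEscape are illustrative names, and only four-digit escapes are handled.

```scala
import scala.util.matching.Regex

object UnescapeDemo {
  // Matches a backslash, 'u', and exactly four hex digits (captured as group 1).
  private val UnicodeEscape: Regex = """\\u([0-9a-fA-F]{4})""".r

  // Replaces every matched escape with the character it denotes.
  def unescapeUnicode(s: String): String =
    UnicodeEscape.replaceAllIn(s, m =>
      // quoteReplacement guards against '$' or a backslash in the result.
      Regex.quoteReplacement(Integer.parseInt(m.group(1), 16).toChar.toString))

  def main(args: Array[String]): Unit = {
    println(unescapeUnicode("\\u0001").length) // 1
    println(unescapeUnicode("a\\u0041b"))      // aAb
  }
}
```

For the question's property, this would be called as unescapeUnicode(props.inputSeperator).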

Upvotes: 3

Andrey Tyukin

Reputation: 44918

Just use the unescapeJava method from org.apache.commons.text.StringEscapeUtils (Apache Commons Text). Add the dependency to your sbt build:

libraryDependencies += "org.apache.commons" % "commons-text" % "1.4"

Example:

println(org.apache.commons.text.StringEscapeUtils.unescapeJava("\\u046C"))

prints:

Ѭ

Upvotes: 3
