Govind Singh

Reputation: 15490

StringEscapeUtils.escapeJava

I am getting \u1F44A\u1F44A where I am expecting \ud83d\udc4d\ud83d\udc4d.

import org.apache.commons.lang3.StringEscapeUtils

val data="👍👍"

println(StringEscapeUtils.escapeJava(data))//\u1F44A\u1F44A

println(StringEscapeUtils.unescapeJava("\u1F44A\u1F44A"))//ὄAὄA

println(StringEscapeUtils.unescapeJava("\ud83d\udc4d\ud83d\udc4d"))//👍👍

How do I get \ud83d\udc4d\ud83d\udc4d instead?

Upvotes: 0

Views: 2489

Answers (3)

Amit Singh

Reputation: 3063

I don't think that we need the Apache Commons Library for this. We can easily achieve this in Scala using the standard libraries available.

val data: String ="👍👍"

println(System.getProperty("file.encoding", "No encoding"))
// prints UTF-8

println(data.map(x => "\\u%04x".format(x.toInt)).mkString)
// prints \ud83d\udc4d\ud83d\udc4d

You can set the encoding through the file.encoding system property in the JVM options (for example, -Dfile.encoding=UTF-8).

Tested on Scastie for Scala version 2.13.3.
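The one-liner above escapes every character, including plain ASCII. Here is a minimal sketch of a variant that leaves printable ASCII untouched and escapes everything else one UTF-16 code unit at a time. Note that `escapeNonAscii` is a hypothetical helper name, not part of any library, and it is not a full replacement for escapeJava (it does not escape quotes or backslashes).

```scala
// Hypothetical helper: escape only characters outside printable ASCII.
// Mapping over a String visits UTF-16 code units (Char), so a
// supplementary character such as U+1F44D becomes two \uXXXX escapes.
def escapeNonAscii(s: String): String =
  s.map { c =>
    if (c >= ' ' && c <= '~') c.toString // printable ASCII stays as-is
    else "\\u%04x".format(c.toInt)       // everything else is escaped
  }.mkString

println(escapeNonAscii("ok 👍"))
// prints ok \ud83d\udc4d
```

Working on code units rather than code points is what produces the surrogate-pair form the question asks for.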

Upvotes: 0

xwd

Reputation: 11

👍 Unicode: U+1F44D

UTF-16BE: D8 3D DC 4D

You can look up U+1F44D in the Unicode table.
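To see how the code point and the UTF-16 bytes listed above relate, here is a small sketch that splits U+1F44D into its surrogate pair, first by hand and then with the JDK's Character.toChars. The variable names are only for illustration.

```scala
// Split a supplementary code point into its UTF-16 surrogate pair.
val codePoint = 0x1F44D // 👍

// Manual computation, following the UTF-16 encoding rules.
val offset = codePoint - 0x10000
val high = 0xD800 + (offset >> 10)   // top 10 bits go into the high surrogate
val low  = 0xDC00 + (offset & 0x3FF) // bottom 10 bits go into the low surrogate
println(f"$high%04X $low%04X") // prints D83D DC4D

// The JDK performs the same split via Character.toChars.
val units = Character.toChars(codePoint)
println(units.map(c => "%04X".format(c.toInt)).mkString(" ")) // prints D83D DC4D
```

So \ud83d\udc4d and U+1F44D describe the same character; the first form is just its two UTF-16 code units.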

So

println(StringEscapeUtils.escapeJava(data))//\u1F44A\u1F44A
println(StringEscapeUtils.unescapeJava("\ud83d\udc4d\ud83d\udc4d"))//👍👍

Maybe the IDE console window uses UTF-16BE? In Eclipse, the console encoding can be set to UTF-16BE or another encoding.


Upvotes: 1

Karol S

Reputation: 9402

It's a bug in Apache Commons-Lang 3.0 and 3.1. I think it's fixed in 3.2.0, so upgrade to 3.2.x or 3.3.x.

Upvotes: 0
