Reputation: 2710
I have a ScalaTest spec for part of an API written with the Spray framework. It looks like this:
"correctly deserializes multi-lang title metadata" in {
implicit def json4sFormats: org.json4s.Formats = ModelJsonHelper.jsonFormats
val v2MultiLangTitle = getStringFromResource("/json_samples/cmp.asset.v2.AssetWriteMultiMetadata.json")
//Deserialize
val v2AssetEither = HttpEntity(`application/cmp.ela.assetWrite.v2+json`, v2MultiLangTitle).as[Asset]
v2AssetEither.isRight shouldEqual true
v2AssetEither.right.map(asset => {
asset.assetMetadata.getOrElse(List()).size shouldEqual(3)
asset.assetMetadata.getOrElse(List())(1).language shouldEqual("es")
asset.assetMetadata.getOrElse(List())(1).data.title shouldEqual(Some("Encabezado prueba de AFP"))
asset.assetMetadata.getOrElse(List())(2).language shouldEqual("tlh")
asset.assetMetadata.getOrElse(List())(2).data.title shouldEqual(Some("Daj jaw AFP"))
asset.assetMetadata.getOrElse(List())(0).language shouldEqual("de")
asset.assetMetadata.getOrElse(List())(0).data.title shouldEqual(Some("Test AFP Überschrift"))
})
}
def getStringFromResource(input: String): String = {
Source.fromInputStream(this.getClass.getResourceAsStream(input))(Codec.UTF8).getLines.mkString("")
}
The JSON being processed is the following:
{
"assetMetadata" : [
{
"title": "Test AFP Überschrift",
"language": "de"
},
{
"title": "Encabezado prueba de AFP",
"language": "es"
},
{
"title": "Daj jaw AFP",
"language": "tlh"
}
]
}
and the failure is occurring on the German umlaut:
[info] - correctly deserializes multi-lang title metadata *** FAILED ***
[info] Some("Test AFP �berschrift") did not equal "Test AFP Überschrift" (V2AssetWriteMarshallerSpec.scala:367)
Under the surface, json4s does part of the unwrapping. However, if I use the json4s parser directly, the German character is processed correctly:
import scala.io.Source
import java.io.FileInputStream
val v = Source.fromInputStream(new FileInputStream("/path/to/project/src/test/resources/json_samples/cmp.asset.v2.AssetWriteMultiMetadata.json")).getLines.mkString("")
import org.json4s._
import org.json4s.native.JsonMethods._
val obj = parse(v)
(obj \ "assetMetadata")
Gives me the result:
res0: org.json4s.JValue = JArray(List(JObject(List((title,JString(Test AFP Überschrift)), ....
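One caveat about this comparison: the direct-parse snippet calls Source.fromInputStream without a codec, so it decodes with the JVM's default charset and only proves the file is read correctly on a machine where that default is UTF-8. A minimal sketch of a read that does not depend on the platform default follows; the names ExplicitCodecRead and readUtf8 are made up for illustration and are not part of the original test.

```scala
import java.nio.file.Path
import scala.io.{Codec, Source}

object ExplicitCodecRead {
  // Hypothetical helper: read a whole file as one string, decoding its
  // bytes as UTF-8 regardless of the JVM's default charset.
  def readUtf8(path: Path): String = {
    val source = Source.fromFile(path.toFile)(Codec.UTF8)
    // Close the underlying stream even if reading fails.
    try source.getLines().mkString("") finally source.close()
  }
}
```

If the string read this way still contains the umlaut, the file and the reader can be ruled out, which points at the encode/decode step around HttpEntity instead.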
I'm on Spray 1.3.3 and json4s 3.2.10. The media type passed to HttpEntity is a custom type, and I've tried adding ("charset" -> "UTF-8") as a parameter like so:
val `application/cmp.ela.assetWrite.v2+json` = register(
MediaType.custom(
mainType = "application",
subType = "cmp.ela.assetWrite.v2+json",
compressible = true,
binary = false,
parameters = Map[String,String]( "charset" -> "UTF-8" )
)
)
...but the test still fails with the invalid character. How do I get Spray to correctly unmarshal a string with international characters in it?
Upvotes: 1
Views: 558
Reputation: 2586
I helped you find the solution in the Scala IRC channel. I found this link and you did the rest!
https://github.com/spray/spray/blob/master/spray-http/src/main/scala/spray/http/MediaType.scala#L168
Fixed test:
//Deserialize
val v2AssetEither = HttpEntity(`application/vnd.dsa.assetWrite.v2+json` withCharset(HttpCharsets.`UTF-8`), v2MultiLangTitle).as[Asset]
v2AssetEither.isRight shouldEqual true
The key is adding:
withCharset(HttpCharsets.`UTF-8`)
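As a rough illustration of why the charset on the content type matters: a single replacement character where the Ü should be is what you get when text is encoded with a single-byte charset such as ISO-8859-1 and the bytes are then decoded as UTF-8. That this is exactly what happened inside the test is an assumption, not something verified here. The sketch below uses only the JDK, not Spray, and the names CharsetMismatch and roundTrip are invented for the example.

```scala
import java.nio.charset.{Charset, StandardCharsets}

object CharsetMismatch {
  // Hypothetical helper: encode text with one charset and decode the bytes
  // with another, mimicking an entity that is written and read under
  // different charset assumptions.
  def roundTrip(text: String, writeAs: Charset, readAs: Charset): String =
    new String(text.getBytes(writeAs), readAs)

  def main(args: Array[String]): Unit = {
    // "\u00DC" is the umlaut U, escaped so the source file encoding is irrelevant.
    val title = "Test AFP \u00DCberschrift"

    // Written as ISO-8859-1, read as UTF-8: the lone byte 0xDC is not valid
    // UTF-8, so the decoder substitutes U+FFFD for it.
    val mismatched = roundTrip(title, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8)
    println(mismatched == "Test AFP \uFFFDberschrift") // prints "true"

    // Written and read as UTF-8: the title survives unchanged.
    val matched = roundTrip(title, StandardCharsets.UTF_8, StandardCharsets.UTF_8)
    println(matched == title) // prints "true"
  }
}
```

Setting the charset explicitly with withCharset makes both sides of that round trip agree, which fits the fixed test passing.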
Upvotes: 1