Jon

Reputation: 3194

Write JSON file as UTF-8 encoded

I am writing a method that writes some JSON to a file, and the write itself works fine. However, although I have set the output encoding to UTF-8, Oxygen fails to display the pound and euro signs correctly.

Java code:

Path logFile = Paths.get(this.output_folder + "/" + file.getName().split("\\.")[0] + ".json");
try (BufferedWriter writer = Files.newBufferedWriter(logFile, StandardCharsets.UTF_8)) {
    File fileDir = new File("test.json");
    Writer out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(fileDir), "UTF8"));
    ObjectMapper mapper = new ObjectMapper();
    writer.write(mapper.writerWithDefaultPrettyPrinter().writeValueAsString(all_questions));
}

"all_questions" is an ArrayList of Question objects, which an ObjectMapper serializes as pretty-printed JSON.

Some sample JSON with the pound sign looks like this:

{
      "name" : "RegExRule",
      "field" : "Q039_4",
      "rules" : [ ],
      "fileName" : "s1rules_england_en.xml",
      "error" : null,
      "pattern_match" : {
        "$record.ApplicationData.SiteVisit.VisitContactDetails.ContactOther.PersonName.PersonGivenName" : "^[\\u0000-\\u005F\\u0061-\\u007B\\u007d-\\u007f£€]*$"
      }
}

That is how it is displayed in Notepad++. In Oxygen, however, it is displayed as follows:

"pattern_match" : {
        "$record.ApplicationData.SiteVisit.VisitContactDetails.ContactOther.PersonName.PersonGivenName" : "^[\\u0000-\\u005F\\u0061-\\u007B\\u007d-\\u007f£€]*$"
 }

Upvotes: 3

Views: 23504

Answers (1)

Remy Lebeau

Reputation: 595971

When constructing the OutputStreamWriter object, prefer the standard charset name "UTF-8" over "UTF8" (Java accepts "UTF8" as a legacy alias, but "UTF-8" is the canonical name):

new OutputStreamWriter(..., "UTF-8")

Alternatively, use StandardCharsets.UTF_8 instead:

new OutputStreamWriter(..., StandardCharsets.UTF_8)
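
To check what the writer actually emits, here is a minimal sketch that pushes the two symbols through an OutputStreamWriter into memory and prints the resulting bytes. The class and method names (Utf8WriterSketch, encode, toHex) are made up for illustration and are not part of the question's code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class Utf8WriterSketch {

    // Encodes text through an OutputStreamWriter, the same way the file-writing
    // code would, but into memory so the bytes can be inspected.
    static byte[] encode(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer out = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            out.write(text);
        }
        return bytes.toByteArray();
    }

    // Renders bytes as space-separated upper-case hex pairs.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // U+00A3 is the pound sign, U+20AC is the euro sign.
        System.out.println(toHex(encode("\u00A3\u20AC"))); // prints "C2 A3 E2 82 AC"
    }
}
```

If a hex dump of the file on disk shows those same byte sequences for the two symbols, the file is valid UTF-8, and the display problem is more likely on the reading side (for example, the viewer guessing a different encoding).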

Java does not generally support reading/writing BOMs, so if you want your JSON file to have a UTF-8 BOM then you will have to write one manually:

Writer out = ...;
out.write("\uFEFF");
out.write(... json content here ...); 
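
As a rough, self-contained version of the snippet above: the sketch below writes a BOM followed by some JSON text to a temporary file and then checks that the file starts with the UTF-8 BOM bytes EF BB BF. The names BomSketch, writeWithBom and startsWithUtf8Bom are hypothetical, and the JSON string is a stand-in for the real content.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, for illustration only.
class BomSketch {

    // Writes U+FEFF first; the UTF-8 encoder turns it into the bytes EF BB BF.
    static void writeWithBom(Path target, String json) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.write("\uFEFF");
            writer.write(json);
        }
    }

    // Returns true if the file begins with the three UTF-8 BOM bytes.
    static boolean startsWithUtf8Bom(Path file) throws IOException {
        byte[] data = Files.readAllBytes(file);
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("bom-sketch", ".json");
        try {
            writeWithBom(tmp, "{\"pattern\":\"\u00A3\u20AC\"}");
            System.out.println(startsWithUtf8Bom(tmp)); // prints "true"
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Keep in mind that Java will not strip the BOM when reading the file back, so any Java code that later parses this file has to skip the leading U+FEFF itself.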

FYI, PrintWriter can manage the OutputStreamWriter and FileOutputStream objects for you:

Writer out = new PrintWriter(fileDir, "UTF-8");

Or:

Writer out = new PrintWriter("test.json", "UTF-8");
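
One caveat with PrintWriter is that its write methods do not throw IOException; failures are only recorded and have to be queried with checkError(). A minimal sketch of how that might be handled, with hypothetical names (PrintWriterSketch, writeJson) and a temporary file standing in for test.json:

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical helper class, for illustration only.
class PrintWriterSketch {

    // Writes the text as UTF-8 and surfaces any write failure as an IOException.
    static void writeJson(File target, String json) throws IOException {
        try (PrintWriter out = new PrintWriter(target, "UTF-8")) {
            out.write(json);
            // checkError() flushes the stream and reports whether any write failed.
            if (out.checkError()) {
                throw new IOException("Failed to write " + target);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("printwriter-sketch", ".json");
        try {
            String json = "{\"pattern\":\"\u00A3\u20AC\"}";
            writeJson(tmp, json);
            // Read the bytes back as UTF-8 and compare with what was written.
            String readBack = new String(Files.readAllBytes(tmp.toPath()), StandardCharsets.UTF_8);
            System.out.println(readBack.equals(json)); // prints "true"
        } finally {
            tmp.delete();
        }
    }
}
```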

Lastly, why are you creating a BufferedWriter using Files.newBufferedWriter() only to ignore it and create a secondary BufferedWriter manually? Why not just use the BufferedWriter that you already have:

Path logFile = Paths.get(this.output_folder + "/" + file.getName().split("\\.")[0] + ".json");
try (BufferedWriter writer = Files.newBufferedWriter(logFile, StandardCharsets.UTF_8)) {
    writer.write("\uFEFF");
    ObjectMapper mapper = new ObjectMapper();
    writer.write(mapper.writerWithDefaultPrettyPrinter().writeValueAsString(all_questions));
}

Upvotes: 5
