Reputation: 99
I am not a PowerShell guy, so please excuse me if my question is confusing.
We are creating a JSON file using ConvertTo-Json, and it successfully creates the JSON file. However, when I cat the contents of the JSON file, it has '??' at the beginning, but the same is not seen when I download the file or view it in the file system.
Below is the PowerShell code that is used to create the JSON file:
$packageJson = @{
    packageName = "ABC.DEF.GHI"
    version     = "1.1.1"
    branchName  = "somebranch"
    oneOps      = @{
        platform  = "XYZ"
        component = "JNL"
    }
}
$packageJson | ConvertTo-Json -Depth 100 | Out-File "$packageName.json"
The above code creates the file successfully, and when I view the file everything looks fine, but when I cat the file it has a leading '??' as shown below:
??{
    "packageName": "ABC.DEF.GHI",
    "version": "0.1.0-looper-poc0529",
    "oneOps": {
        "platform": "XYZ",
        "component": "JNL"
    },
    "branchName": "somebranch"
}
Due to this I am unable to parse the JSON file, and it gives the following error:
com.jayway.jsonpath.InvalidJsonException: com.fasterxml.jackson.core.JsonParseException: Unexpected character ('?' (code 65533 / 0xfffd)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
Upvotes: 3
Views: 2620
Reputation: 6780
A more general solution that's not specific to Out-File is to set these before you call ConvertTo-Json:
$OutputEncoding = [Console]::OutputEncoding = [Text.UTF8Encoding]::UTF8;
Upvotes: 0
Reputation: 415971
Those aren't ? characters. Those are two different unprintable characters that make up a Unicode byte order mark. You see ? because that's how the debugger, text editor, OS, or font in question renders unprintable characters.
To fix this, either change the output encoding, or use a character set on the other end that understands UTF-8. The former is a simpler fix, but the latter is probably better in the long run. Eventually you'll end up with data that needs an extended character.
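To confirm this yourself, you can dump the file's first bytes. A quick sketch (PowerShell 5+), where "your-file.json" is only a placeholder for whatever file your script actually wrote:

# Show the first 16 bytes of the file.
Format-Hex -Path "your-file.json" | Select-Object -First 1
# A file written with Out-File's default encoding in Windows PowerShell starts with
# the bytes FF FE - the UTF-16LE byte-order mark - followed by the JSON text.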
Upvotes: 4
Reputation: 438813
tl;dr
It sounds like your Java code expects a UTF-8-encoded file without BOM, so direct use of the .NET Framework is needed:
[IO.File]::WriteAllText("$PWD/$packageName.json", ($packageJson | ConvertTo-Json))
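This call writes BOM-less UTF-8 by default; $PWD is part of the path because .NET's notion of the current directory can differ from PowerShell's. For context, here is a minimal sketch of the question's script with that fix applied; note that $packageName is never defined in the question, so the value used below is only an assumption for illustration:

# Sketch only: $packageName is assumed; adjust to however it is actually set.
$packageName = "ABC.DEF.GHI"
$packageJson = @{
    packageName = $packageName
    version     = "1.1.1"
    branchName  = "somebranch"
    oneOps      = @{
        platform  = "XYZ"
        component = "JNL"
    }
}
# WriteAllText() defaults to UTF-8 without a BOM.
[IO.File]::WriteAllText("$PWD/$packageName.json", ($packageJson | ConvertTo-Json -Depth 100))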
As Tom Blodget points out, BOM-less UTF-8 is mandated by the IETF's JSON standard, RFC 8259.
Unfortunately, Windows PowerShell's default output encoding for Out-File (and also for the redirection operator >) is UTF-16LE ("Unicode"), in which the file starts with the two bytes 0xff 0xfe, the UTF-16LE encoding of the Unicode character U+FEFF, the so-called BOM (byte-order mark) or Unicode signature, which serves to identify the encoding.
If target programs do not understand this encoding, they treat the BOM as data (and would subsequently misinterpret the actual data), which causes the problem you saw.
The specific symptom you saw - a complaint about character U+FFFD, which is used as the generic stand-in for an invalid character in the input - suggests that your Java code likely expects UTF-8 encoding.
Unfortunately, using Out-File -Encoding utf8 is not a solution, because PowerShell invariably writes a BOM for UTF-8 as well, which Java doesn't expect.
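You can see this for yourself; a quick sketch, assuming Windows PowerShell 5.1 and a scratch file named bom-test.json:

# Even with -Encoding utf8, Windows PowerShell prepends the UTF-8 BOM (EF BB BF).
'{}' | Out-File -Encoding utf8 "$PWD\bom-test.json"
[IO.File]::ReadAllBytes("$PWD\bom-test.json")[0..2] | ForEach-Object { '0x{0:X2}' -f $_ }
# 0xEF
# 0xBB
# 0xBF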
Workarounds:
- If you can be sure that the JSON string contains **only characters in the 7-bit ASCII range** (no accented characters), you can get away with Out-File -Encoding Ascii, as TheIncorrigible1 suggests.
- Otherwise, use the .NET Framework directly to create your output file with BOM-less UTF-8 encoding, as shown in the command above.
- If it's an option, use the cross-platform PowerShell Core edition instead, whose default encoding is sensibly BOM-less UTF-8, for compatibility with the rest of the world.
Upvotes: 2