How to stop json.Marshal from escaping < and >?

package main

import "fmt"
import "encoding/json"

type Track struct {
    XmlRequest string `json:"xmlRequest"`
}

func main() {
    message := new(Track)
    message.XmlRequest = "<car><mirror>XML</mirror></car>"
    fmt.Println("Before Marshal", message)
    messageJSON, _ := json.Marshal(message)
    fmt.Println("After marshal", string(messageJSON))
}

Is it possible to make json.Marshal not escape < and >? I currently get:

{"xmlRequest":"\u003ccar\u003e\u003cmirror\u003eXML\u003c/mirror\u003e\u003c/car\u003e"}

but I am looking for something like this:

{"xmlRequest":"<car><mirror>XML</mirror></car>"}

Upvotes: 64

Views: 48011

Answers (7)

galamdring

Reputation: 313

I had a requirement to store xml inside json :puke:

At first I had significant difficulty unmarshalling that XML after passing it through JSON, but the problem turned out to be that I was trying to unmarshal the XML string as a json.RawMessage. I needed to unmarshal it as a string and then convert it to []byte for xml.Unmarshal.

type xmlInJson struct {
  Data string `json:"data"`
}

var response xmlInJson
err := json.Unmarshal(xmlJsonData, &response) // decode the JSON string first
var xmlData someOtherStructThatMatchesTheXmlFormat
err = xml.Unmarshal([]byte(response.Data), &xmlData) // then parse its contents as XML
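
For reference, here is a self-contained sketch of that round trip. The car struct, the decodeCar helper, and the sample payload are made up for illustration; substitute the types that match your own XML.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"fmt"
)

// xmlInJSON mirrors the JSON envelope: the XML travels as an ordinary string.
type xmlInJSON struct {
	Data string `json:"data"`
}

// car is a hypothetical type matching the sample XML used below.
type car struct {
	XMLName xml.Name `xml:"car"`
	Mirror  string   `xml:"mirror"`
}

// decodeCar unwraps the JSON string first, then parses its contents as XML.
func decodeCar(payload []byte) (car, error) {
	var envelope xmlInJSON
	if err := json.Unmarshal(payload, &envelope); err != nil {
		return car{}, fmt.Errorf("decoding JSON envelope: %w", err)
	}
	var c car
	if err := xml.Unmarshal([]byte(envelope.Data), &c); err != nil {
		return car{}, fmt.Errorf("decoding embedded XML: %w", err)
	}
	return c, nil
}

func main() {
	// The \u003c form written by json.Marshal decodes to the same string
	// as raw angle brackets would.
	payload := []byte(`{"data":"\u003ccar\u003e\u003cmirror\u003eXML\u003c/mirror\u003e\u003c/car\u003e"}`)
	c, err := decodeCar(payload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(c.Mirror)
}
```

Either spelling of the angle brackets in the JSON input yields the same decoded string, so the receiving side does not need to care whether the sender disabled HTML escaping.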

Upvotes: 2

Zamicol

Reputation: 5024

Here's my workaround:

// Marshal is a UTF-8 friendly marshaler.  Go's json.Marshal is not UTF-8
// friendly because it replaces the valid UTF-8 and JSON characters "&", "<",
// ">" with their "slash u" Unicode-escaped forms (e.g. \u0026).  It preemptively
// escapes for HTML friendliness.  Where text may include any of these
// characters, json.Marshal should not be used. Playground of Go breaking a
// title: https://play.golang.org/p/o2hiX0c62oN
func Marshal(i interface{}) ([]byte, error) {
    buffer := &bytes.Buffer{}
    encoder := json.NewEncoder(buffer)
    encoder.SetEscapeHTML(false)
    err := encoder.Encode(i)
    return bytes.TrimRight(buffer.Bytes(), "\n"), err
}

Upvotes: 8

dave

Reputation: 64657

As of Go 1.7, you still cannot do this with json.Marshal(). The source code for json.Marshal shows:

> err := e.marshal(v, encOpts{escapeHTML: true})

The reason json.Marshal always does this is:

String values encode as JSON strings coerced to valid UTF-8, replacing invalid bytes with the Unicode replacement rune. The angle brackets "<" and ">" are escaped to "\u003c" and "\u003e" to keep some browsers from misinterpreting JSON output as HTML. Ampersand "&" is also escaped to "\u0026" for the same reason.

This means you cannot do it even by writing a custom func (t *Track) MarshalJSON(), because json.Marshal applies the same escaping to whatever that method returns. You have to use something that does not satisfy the json.Marshaler interface.

So the workaround is to write your own function:
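
A small sketch of why a custom MarshalJSON does not help: json.Marshal re-processes the bytes the method returns and escapes them again. The alias type plain exists only to avoid infinite recursion, and viaJSONMarshal is an illustrative helper, not part of any library.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

type Track struct {
	XmlRequest string `json:"xmlRequest"`
}

// MarshalJSON encodes the struct with HTML escaping turned off.
func (t Track) MarshalJSON() ([]byte, error) {
	type plain Track // same fields, but no MarshalJSON method, so no recursion
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)
	if err := enc.Encode(plain(t)); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// viaJSONMarshal (illustrative helper) returns what json.Marshal produces.
func viaJSONMarshal(t Track) (string, error) {
	out, err := json.Marshal(t)
	return string(out), err
}

func main() {
	t := Track{XmlRequest: "<car>"}

	direct, _ := t.MarshalJSON()
	fmt.Print(string(direct)) // raw angle brackets survive a direct call

	wrapped, err := viaJSONMarshal(t)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(wrapped) // json.Marshal has escaped them again
}
```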

func (t *Track) JSON() ([]byte, error) {
    buffer := &bytes.Buffer{}
    encoder := json.NewEncoder(buffer)
    encoder.SetEscapeHTML(false)
    err := encoder.Encode(t)
    return buffer.Bytes(), err
}

https://play.golang.org/p/FAH-XS-QMC

If you want a generic solution for any struct, you could do:

func JSONMarshal(t interface{}) ([]byte, error) {
    buffer := &bytes.Buffer{}
    encoder := json.NewEncoder(buffer)
    encoder.SetEscapeHTML(false)
    err := encoder.Encode(t)
    return buffer.Bytes(), err
}

https://play.golang.org/p/bdqv3TUGr3
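
One detail worth noting: Encoder.Encode writes a trailing newline after the JSON value, which json.Marshal does not. The sketch below is a variant of the generic helper that trims it so the result has the same shape as json.Marshal output. The name marshalNoEscape is only illustrative.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

type Track struct {
	XmlRequest string `json:"xmlRequest"`
}

// marshalNoEscape behaves like json.Marshal but leaves <, > and & unescaped.
func marshalNoEscape(v interface{}) ([]byte, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)
	if err := enc.Encode(v); err != nil {
		return nil, err
	}
	// Encode appends "\n"; drop it to match json.Marshal's output shape.
	return bytes.TrimSuffix(buf.Bytes(), []byte("\n")), nil
}

func main() {
	msg := Track{XmlRequest: "<car><mirror>XML</mirror></car>"}

	escaped, err := json.Marshal(msg)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	plain, err := marshalNoEscape(msg)
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	fmt.Println(string(escaped)) // angle brackets appear as \u003c and \u003e
	fmt.Println(string(plain))   // angle brackets appear as-is
}
```

If the output may be embedded in an HTML page, the default escaping is the safer choice, since that is the reason it exists.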

Upvotes: 87

Coconut

Reputation: 2222

This doesn't answer the question directly, but it may help if you're looking for a way to deal with json.Marshal escaping < and >.

Another way to solve the problem is to replace the escaped sequences in the json.RawMessage with the corresponding UTF-8 characters after the json.Marshal() call.

It works for any characters, not just < and >. (I used to do this to make non-English letters human-readable in JSON :D)

func _UnescapeUnicodeCharactersInJSON(_jsonRaw json.RawMessage) (json.RawMessage, error) {
    str, err := strconv.Unquote(strings.Replace(strconv.Quote(string(_jsonRaw)), `\\u`, `\u`, -1))
    if err != nil {
        return nil, err
    }
    return []byte(str), nil
}

func main() {
    // Both are valid JSON.
    var jsonRawEscaped json.RawMessage   // json raw with escaped unicode chars
    var jsonRawUnescaped json.RawMessage // json raw with unescaped unicode chars

    // '\u263a' == '☺'
    jsonRawEscaped = []byte(`{"HelloWorld": "\uC548\uB155, \uC138\uC0C1(\u4E16\u4E0A). \u263a"}`) // "\\u263a"
    jsonRawUnescaped, _ = _UnescapeUnicodeCharactersInJSON(jsonRawEscaped)                        // "☺"

    fmt.Println(string(jsonRawEscaped))   // {"HelloWorld": "\uC548\uB155, \uC138\uC0C1(\u4E16\u4E0A). \u263a"}
    fmt.Println(string(jsonRawUnescaped)) // {"HelloWorld": "안녕, 세상(世上). ☺"}
}

https://play.golang.org/p/pUsrzrrcDG-

I hope this helps someone.

Upvotes: 10

rupert.kim

Reputation: 27

A custom function is not necessarily the best solution.

How about using another library to solve this? I use gabs.

Install:

go get github.com/Jeffail/gabs

Usage:

message := new(Track)
resultJson, _ := gabs.Consume(message)

fmt.Println(string(resultJson.EncodeJSON()))

That is how I solved the problem.

Upvotes: -2

Codefor

Reputation: 1366

No, you can't.

A third-party JSON package may be a better choice than the standard library's encoding/json.

More detail: https://github.com/golang/go/issues/8592

Upvotes: 1

adrianlzt

Reputation: 2031

Go 1.7 added a new option to fix this:

encoding/json: add Encoder.DisableHTMLEscaping This provides a way to disable the escaping of <, >, and & in JSON strings.

The relevant function is

func (*Encoder) SetEscapeHTML

It should be called on an Encoder:

enc := json.NewEncoder(os.Stdout)
enc.SetEscapeHTML(false)

Simple example: https://play.golang.org/p/SJM3KLkYW-

Upvotes: 41
