lofcek

Reputation: 1233

Parsing plist xml

How do I parse XML in a silly format like this:

<key>KEY1</key><string>VALUE OF KEY1</string>
<key>KEY2</key><string>VALUE OF KEY2</string>
<key>KEY3</key><integer>42</integer>
<key>KEY3</key><array>
    <integer>1</integer>
    <integer>2</integer>
</array>

Parsing would be very simple if all values had the same type, for example strings. But in my case each value can be a string, data, integer, boolean, array or dict.

This XML looks almost like JSON, but unfortunately the format is fixed and I cannot change it. I would also prefer a solution without any external packages.

Upvotes: 0

Views: 1797

Answers (2)

jxstanford

Reputation: 3387

Since the data is a flat sequence of sibling key/value elements rather than a well-structured document, and you can't modify the format, xml.Unmarshal can't map it onto a struct directly. Instead, create a new Decoder, iterate over the tokens, and use DecodeElement to process each element as it appears. The sample code below puts everything in a map; note that its PlistArray type only handles arrays of integers. The code is also on github here...

package main

import (
    "encoding/xml"
    "fmt"
    "io"
    "strings"
)

type PlistArray struct {
    Integer []int `xml:"integer"`
}

const in = "<key>KEY1</key><string>VALUE OF KEY1</string><key>KEY2</key><string>VALUE OF KEY2</string><key>KEY3</key><integer>42</integer><key>KEY3</key><array><integer>1</integer><integer>2</integer></array>"

func main() {
    result := map[string]interface{}{}
    dec := xml.NewDecoder(strings.NewReader(in))
    dec.Strict = false
    var workingKey string

    for {
        token, err := dec.Token()
        if err == io.EOF {
            break // end of input
        }
        if err != nil {
            // Report malformed input instead of silently stopping.
            fmt.Println(err.Error())
            break
        }
        switch start := token.(type) {
        case xml.StartElement:
            fmt.Printf("startElement = %+v\n", start)
            switch start.Name.Local {
            case "key":
                var k string
                err := dec.DecodeElement(&k, &start)
                if err != nil {
                    fmt.Println(err.Error())
                }
                workingKey = k
            case "string":
                var s string
                err := dec.DecodeElement(&s, &start)
                if err != nil {
                    fmt.Println(err.Error())
                }
                result[workingKey] = s
                workingKey = ""
            case "integer":
                var i int
                err := dec.DecodeElement(&i, &start)
                if err != nil {
                    fmt.Println(err.Error())
                }
                result[workingKey] = i
                workingKey = ""
            case "array":
                var ai PlistArray
                err := dec.DecodeElement(&ai, &start)
                if err != nil {
                    fmt.Println(err.Error())
                }
                result[workingKey] = ai
                workingKey = ""
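            default:
                // fmt.Errorf only builds an error value; print the message so it is visible.
                fmt.Printf("unrecognized element <%s>\n", start.Name.Local)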
            }
        }
    }
    fmt.Printf("%+v", result)

}

Upvotes: 0

kostix

Reputation: 55553

Use the lower-level parsing interface provided by encoding/xml, which lets you iterate over individual tokens in the XML stream (such as "start element", "end element", etc.).

See the Token() method of encoding/xml's Decoder type.

Upvotes: 1
