Reputation: 3147
I am trying to unmarshal multiple items contained in nodes with an identical structure for further processing, but I can't seem to access the data and I'm not sure why. The XML data is structured as follows (I am trying to access all of the item elements):
<?xml version="1.0" encoding="ISO-8859-1" ?>
<datainfo>
  <origin>NOAA/NOS/CO-OPS</origin>
  <producttype> Annual Tide Prediction </producttype>
  <IntervalType>High/Low Tide Predictions</IntervalType>
  <data>
    <item>
      <date>2015/12/31</date>
      <day>Thu</day>
      <time>03:21 AM</time>
      <predictions_in_ft>5.3</predictions_in_ft>
      <predictions_in_cm>162</predictions_in_cm>
      <highlow>H</highlow>
    </item>
    <item>
      <date>2015/12/31</date>
      <day>Thu</day>
      <time>09:24 AM</time>
      <predictions_in_ft>2.4</predictions_in_ft>
      <predictions_in_cm>73</predictions_in_cm>
      <highlow>L</highlow>
    </item>
  </data>
</datainfo>
My code is:
package main

import (
	"encoding/xml"
	"fmt"
	"io/ioutil"
	"os"
)

// TideData stores a series of tide predictions
type TideData struct {
	Tides []Tide `xml:"data>item"`
}

// Tide stores a single tide prediction
type Tide struct {
	Date         string  `xml:"date"`
	Day          string  `xml:"day"`
	Time         string  `xml:"time"`
	PredictionFt float64 `xml:"predictions_in_ft"`
	PredictionCm float64 `xml:"predictions_in_cm"`
	HighLow      string  `xml:"highlow"`
}

func (t Tide) String() string {
	return t.Date + " " + t.Day + " " + t.Time + " " + t.HighLow
}

func main() {
	xmlFile, err := os.Open("9414275 Annual.xml")
	if err != nil {
		fmt.Println("Error opening file:", err)
		return
	}
	defer xmlFile.Close()
	b, _ := ioutil.ReadAll(xmlFile)
	var tides TideData
	xml.Unmarshal(b, &tides)
	fmt.Println(tides)
	for _, datum := range tides.Tides {
		fmt.Printf("\t%s\n", datum)
	}
}
When I run it, the output is empty, which leads me to think that the data is not being unmarshalled. The output is:
{[]}
Upvotes: 1
Views: 989
Reputation: 43899
You are ignoring the error returned by xml.Unmarshal. If we modify your program slightly to print that error, we can see what is going on:
xml: encoding "ISO-8859-1" declared but Decoder.CharsetReader is nil
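The modified program is not shown here. A minimal sketch of the idea, assuming a cut-down inline sample in place of the question's file (the sample constant and the decodeSample helper are hypothetical names, and the structs are trimmed to a single field), is to capture the error from xml.Unmarshal and print it rather than discard it:

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// TideData and Tide are trimmed versions of the structs in the question.
type TideData struct {
	Tides []Tide `xml:"data>item"`
}

type Tide struct {
	Date string `xml:"date"`
}

// sample is a hypothetical stand-in for the question's file. What matters is
// the ISO-8859-1 encoding declaration in the XML header.
const sample = `<?xml version="1.0" encoding="ISO-8859-1" ?>
<datainfo><data><item><date>2015/12/31</date></item></data></datainfo>`

// decodeSample unmarshals the sample and hands the error back to the caller
// instead of silently dropping it.
func decodeSample() (TideData, error) {
	var tides TideData
	err := xml.Unmarshal([]byte(sample), &tides)
	return tides, err
}

func main() {
	tides, err := decodeSample()
	if err != nil {
		fmt.Println("Error unmarshalling:", err)
		return
	}
	fmt.Println(tides)
}
```

With the check in place, the program reports the decoding failure instead of printing an empty struct.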
And poking around in the documentation, we find that by default the package only supports XML encoded in UTF-8:
// CharsetReader, if non-nil, defines a function to generate
// charset-conversion readers, converting from the provided
// non-UTF-8 charset into UTF-8. If CharsetReader is nil or
// returns an error, parsing stops with an error. One of the
// the CharsetReader's result values must be non-nil.
CharsetReader func(charset string, input io.Reader) (io.Reader, error)
So it seems you need to provide your own character set conversion routine. You can inject it by modifying your code something like this:
decoder := xml.NewDecoder(xmlFile)
decoder.CharsetReader = makeCharsetReader
// err is already declared by the os.Open call, so assign with = rather than :=
err = decoder.Decode(&tides)
(Note that we're now decoding from an io.Reader rather than a byte slice, so the ReadAll logic can be removed.) The golang.org/x/text/encoding family of packages might help you implement your makeCharsetReader function. Something like this might work:
import (
	"fmt"
	"io"

	"golang.org/x/text/encoding/charmap"
)

// makeCharsetReader returns a reader that converts the declared charset to UTF-8.
func makeCharsetReader(charset string, input io.Reader) (io.Reader, error) {
	if charset == "ISO-8859-1" {
		// Windows-1252 is a superset of ISO-8859-1 (apart from the rarely
		// used C1 control codes), so should do here
		return charmap.Windows1252.NewDecoder().Reader(input), nil
	}
	return nil, fmt.Errorf("unknown charset: %s", charset)
}
You should then be able to decode the XML.
Upvotes: 6