Reputation: 175
Just getting started with JAXB today, and I'm stuck on an odd representation of a list of data elements when there is only one value. Note that for a single value, colors is treated as a plain element instead of a list, and the value is not wrapped in a color tag. The data is coming from an external source and I have no control over the formatting.
How can JAXB deal with both representations of colors?
<?xml version="1.0" encoding="utf-8"?>
<widgets>
<widget>
<name>SingleValue</name>
<colors>Blue</colors>
</widget>
<widget>
<name>ListValues</name>
<colors>
<color>Red</color>
<color>Blue</color>
</colors>
</widget>
</widgets>
I've tried various combinations of @XmlElementWrapper and @XmlElement, @XmlAnyElement, @XmlElementRef(s), and @XmlMixed. I've even created a colors class and tried multiple mappings to arrays and strings without luck; each mapping would work on its own but not when the two were used together.
Using the sample XML above, here is a simple program that would parse "Blue" correctly if it were wrapped in color tags. Currently, this program returns an empty List for colors and is unable to pick up "Blue".
@XmlRootElement(name = "widgets")
@XmlAccessorOrder(XmlAccessOrder.UNDEFINED)
public class Widgets {
private List<Widget> widgets = new ArrayList<Widget>();
public static void main(String[] args ) {
File f = new File("C:\\aersmine\\AERS_KDR_Data", "widgets.xml");
try {
Widgets widgets = Widgets.load( f );
for ( Widget widget : widgets.widgets ) {
StringBuilder sb = new StringBuilder();
for ( String color : widget.getColors() ) {
if ( sb.length() > 0 )
sb.append( ", " );
sb.append(color);
}
System.out.println( "Widget " + widget.getName() + " Colors: " + sb.toString());
}
}
catch ( Exception e ) {
e.printStackTrace();
}
}
public static Widgets load(File file)
throws JAXBException, IOException {
FileInputStream is = new FileInputStream(file);
try {
JAXBContext ctx = JAXBContext.newInstance(Widgets.class);
Unmarshaller u = ctx.createUnmarshaller();
return (Widgets) u.unmarshal(is);
}
finally {
is.close();
}
}
@XmlElement(name="widget")
public List<Widget> getWidgets() {
return widgets;
}
public void setWidgets( List<Widget> widgets ) {
this.widgets = widgets;
}
}
public class Widget {
public String n;
public List<String> cl = new ArrayList<String>();
@XmlElement(name="name")
public String getName() {
return n;
}
public void setName( String name ) {
this.n = name;
}
@XmlElementWrapper(name="colors")
@XmlElement(name="color")
public List<String> getColors() {
return cl;
}
public void setColors( List<String> colors ) {
this.cl = colors;
}
}
Upvotes: 1
Views: 861
Reputation: 3530
I was able to find a workaround by adding another field, private String color;, to the Widget class. With this, if there is a list then JAXB populates private List<String> colors, and if colors holds just a value then it populates private String color.
Widgets.class:
@Data
@NoArgsConstructor
@XmlAccessorType(XmlAccessType.FIELD)
@XmlRootElement(name = "widgets")
public class Widgets {
@XmlElement(name="widget")
private List<Widget> widgets = new ArrayList<Widget>();
}
Widget.class:
@XmlAccessorType(XmlAccessType.NONE)
@Data
public class Widget {
@XmlElement(name = "name")
private String name;
@XmlElementWrapper(name = "colors")
@XmlElement(name = "color")
private List<String> colors = new ArrayList<>();
@XmlElement(name = "colors")
private String color = null;
// To fold the single-value form into the colors list, add this callback.
// JAXB invokes it after the object has been unmarshalled.
public void afterUnmarshal(Unmarshaller m, Object parent) {
// color is null when the widget has no colors element at all
if (color != null && color.matches(".*[a-zA-Z]+.*")) {
colors.add(color);
}
color = null;
}
}
JaxbExampleMain.class:
public class JaxbExampleMain {
public static void main(String[] args) throws JAXBException, XMLStreamException {
final InputStream inputStream = JaxbExampleMain.class.getClassLoader().getResourceAsStream("colors.xml");
final XMLStreamReader xmlStreamReader = XMLInputFactory.newInstance().createXMLStreamReader(inputStream);
final Unmarshaller unmarshaller = JAXBContext.newInstance(Widgets.class).createUnmarshaller();
final Widgets widgets = unmarshaller.unmarshal(xmlStreamReader, Widgets.class).getValue();
System.out.println(widgets.toString());
Marshaller marshaller = JAXBContext.newInstance(Widgets.class).createMarshaller();
marshaller.setProperty(Marshaller.JAXB_FRAGMENT, Boolean.TRUE);
marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
marshaller.marshal(widgets, System.out);
}
}
This produces the following output when you unmarshal and then marshal the XML provided in the question:
Widgets(widgets=[Widget(name=SingleValue, colors=[], color=Blue), Widget(name=ListValues, colors=[Red, Blue], color=
)])
<widgets>
<widget>
<name>SingleValue</name>
<colors>Blue</colors>
</widget>
<widget>
<name>ListValues</name>
<colors>
<color>Red</color>
<color>Blue</color>
</colors>
</widget>
</widgets>
Upvotes: 1
Reputation: 175
First and foremost, it is important for me to state that this is NOT the answer I'm looking for. It is a temporary, alternative solution that I'm forced to use until a JAXB solution has been found.
I'm providing it because others may find it useful: it uses a regular expression pattern to manipulate the stream and correct the underlying problem that prevents the original XML from being parsed correctly. This is accomplished with a FilterReader.
As a simple recap, the XML data contains a list of colors wrapped by colors. Each color is tagged with color, as expected within the list. The problem is when there is a single color value; that value is not wrapped in color and therefore it is not parsable.
Example of a good list of colors:
<colors>
<color>Red</color>
<color>Blue</color>
</colors>
Example of a bad single color:
<colors>Blue</colors>
This solution uses a regular expression pattern, <colors>([^<>]+?)\s*<\/colors>, to identify the incorrect XML list. It then uses a replacement string, <color>|</color>, split on the pipe character, to apply a prefix and a suffix to the matched group(1) text.
The bad single color is then corrected as follows, so that JAXB unmarshalling will pull it in:
<colors><color>Blue</color></colors>
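As a quick check of the pattern outside of any reader, the substitution can be tried on plain strings. This is only a sketch: the class name ColorsPatternDemo and the method wrapSingleColor are made up for illustration, and it assumes the single value contains no markup of its own.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ColorsPatternDemo {
    // Same pattern as above: text-only content sitting directly inside <colors>.
    private static final Pattern SINGLE =
        Pattern.compile("<colors>([^<>]+?)\\s*</colors>");

    /**
     * Wraps a bare value inside <colors> using the prefix and suffix taken
     * from the pipe-separated replace string. List forms are left untouched
     * because their content starts with a tag, which the pattern rejects.
     */
    public static String wrapSingleColor(String xml, String replace) {
        String[] parts = replace.split("\\|");
        String prefix = parts.length > 0 ? parts[0] : "";
        String suffix = parts.length > 1 ? parts[1] : "";
        Matcher m = SINGLE.matcher(xml);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy everything up to the captured value, then the wrapped value.
            out.append(xml, last, m.start(1));
            out.append(prefix).append(m.group(1)).append(suffix);
            last = m.end(1);
        }
        out.append(xml.substring(last));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(wrapSingleColor("<colors>Blue</colors>", "<color>|</color>"));
        System.out.println(wrapSingleColor("<colors><color>Red</color></colors>", "<color>|</color>"));
    }
}
```

The first call prints the corrected form shown above; the second prints its input unchanged, since the list form never matches.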
Implementation:
Using the code above in the original request, replace the public static Widgets load function with this one. Notice that besides the addition of the new WidgetFilterReader, the other significant change in this version of the loader is the use of a FileReader.
public static Widgets load(File file)
throws JAXBException, IOException
{
Reader reader =
new WidgetFilterReader(
"<colors>([^<>]+?)\\s*<\\/colors>", "<color>|</color>",
new FileReader(file));
try
{
JAXBContext ctx = JAXBContext.newInstance(Widgets.class);
Unmarshaller u = ctx.createUnmarshaller();
return (Widgets) u.unmarshal(reader);
}
finally
{
reader.close();
}
}
Then add this class which is the FilterReader implementation:
public class WidgetFilterReader
extends FilterReader
{
private StringBuilder sb = new StringBuilder();
@SuppressWarnings( "unused" )
private final String search;
private final String replace;
private Pattern pattern;
private static final String EOF = "\uFFEE"; // half-width white circle, used as a placeholder token for end of stream
/**
*
* @param search A regular expression to build the pattern. Example: "<colors>([^<>]+?)\\s*<\\/colors>"
* @param replace A String value with up to two parts to prefix and suffix the found group(1) object, separated by a pipe, i.e. |.
* Example: "<color>|</color>"
* @param in The underlying Reader that supplies the raw XML
*/
protected WidgetFilterReader( String search, String replace, Reader in ) {
super( in );
this.search = search;
this.replace = replace;
this.pattern = Pattern.compile(search);
}
@Override
public int read()
throws IOException {
int read = ingest();
return read;
}
private int ingest() throws IOException
{
if (sb.length() == 0) {
int c = super.read();
if ( c < 0 )
return c;
sb.append( (char) c );
}
if ( sb.length() > 0 && sb.charAt(0) == '<' ) {
int count = 0;
for ( int i = 0; i < sb.length(); i++ ) {
if ( sb.charAt( i ) == '>' )
count++;
}
int c2;
while ((c2 = super.read()) >= 0 && count < 2) {
sb.append( (char) c2 );
if (c2 == '>')
count++;
}
if ( c2 < 0 )
sb.append( EOF );
else
sb.append( (char) c2 );
Matcher m = pattern.matcher( sb.toString() );
if ( m.find(0) ) {
String grp = m.group(1);
// Use the matcher's own offsets; indexOf(grp) could hit the same text earlier in the buffer (e.g. inside the tag name)
int i = m.start(1);
int j = m.end(1);
String[] r = replace.split( "\\|" );
sb.replace(i, j, (r.length > 0 ? r[0] : "") + grp + (r.length > 1 ? r[1] : ""));
}
}
int x = sb.charAt(0);
sb.deleteCharAt(0);
if ( x == EOF.charAt(0) )
return -1;
return x;
}
@Override
public int read( char[] cbuf, int off, int len )
throws IOException {
int c;
int read = 0;
while (read < len && (c = ingest()) >= 0 ) {
cbuf[off + read] = (char) c;
read++;
}
if (read == 0)
read = -1;
return read;
}
}
Overview on how this works:
Basically, this class uses a StringBuilder as a buffer while it reads ahead searching for the supplied pattern. When the pattern is found in the buffer, the buffer is modified to contain the corrected data. This works because the stream is always read into the internal buffer first and then pulled from that buffer as the caller consumes it. This ensures that the pattern can be found while loading only the minimal number of characters ahead of what the caller has consumed.
Since the end of the file can be encountered while searching for the pattern, a token must be inserted into the buffer so that EOF is reported only when the caller reaches that point. Hence the use of a rather obscure Unicode character as the EOF token. If that character may occur in your source data, use something else instead.
I should also note that although the regular expression pattern is passed in to this FilterReader, the code that prefetches enough data to perform a valid search is keyed to the specific shape of the pattern being used. It ensures that enough data has been loaded into the StringBuilder buffer before attempting a find(0). This is accomplished by checking for a leading < character and then ensuring that two more > characters are loaded, which satisfies the minimal needs of the given pattern. What does that mean? If you are trying to reuse this code for another purpose, you may have to modify the prefetcher to ensure that enough data is in memory for the pattern matcher to succeed.
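If the documents are small enough to hold in memory, the prefetching concern goes away: read everything, apply the pattern once, and hand JAXB a StringReader. The following is a minimal sketch of that alternative, not the streaming FilterReader described above. The class InMemoryColorsFix and its method names are hypothetical, and the replacement string is hard-coded for the colors case.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.regex.Pattern;

public class InMemoryColorsFix {
    // Text-only content sitting directly inside <colors>.
    private static final Pattern SINGLE =
        Pattern.compile("<colors>([^<>]+?)\\s*</colors>");

    /** Wraps every bare single value in <color> tags; $1 is the captured value. */
    public static String fixXml(String xml) {
        return SINGLE.matcher(xml).replaceAll("<colors><color>$1</color></colors>");
    }

    /** Reads the whole input into a String. */
    public static String readAll(Reader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        while ((n = in.read(buf)) >= 0) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    /** Returns a Reader over the corrected document. */
    public static Reader fix(Reader in) throws IOException {
        return new StringReader(fixXml(readAll(in)));
    }
}
```

The reader returned by fix(...) could be passed to u.unmarshal(...) in place of the WidgetFilterReader. The trade-off is memory: the whole document is held as a String, which the streaming version avoids.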
Upvotes: 0