Reputation: 101
What are the pros and cons of putting CSV formatted data inside an XML element?
I must serialize objects in Java with a matrix data field to XML. I abandoned the idea of using data binding with JAXB; generics and collections containing collections are too much of a pain to deal with.
I thought of a simple schema for my matrix, but since I will also have to implement serialization of matrices to CSV, why not just dump a CSV string as a text node in an element? It would also make files a bit smaller.
Can you think of arguments against this idea?
Should I add something like a CSV MIME type to this element?
EDIT: Here's the solution I opted for. It uses Super-CSV. The enum is needed because the generic type is erased at runtime. The main XML file will reference the CSV files.
static public enum SerializableType{INTEGER,DOUBLE,...};

@SuppressWarnings("unchecked")
public static <T> Matrix<T> fromCSV(InputStream in, CsvPreference pref, SerializableType t)
{
    Matrix<T> o = new Matrix<T>();
    // Super-csv class
    CsvListReader csv_reader = new CsvListReader(new InputStreamReader(in), pref);
    try {
        List<String> l = csv_reader.read();
        // read() returns null at end of input, so an empty stream yields an empty matrix
        // instead of a NullPointerException on l.size()
        if(l==null)
        {
            csv_reader.close();
            return o;
        }
        int n = l.size(); o.n = n;
        int i=0;
        while(l!=null)
        {
            o.appendRow();
            T val;
            for(int j=0;j<n;j++)
            {
                // the enum selects the parser because T is erased at runtime
                switch(t)
                {
                    case INTEGER:
                        val = (T)Integer.valueOf(Integer.parseInt(l.get(j)));
                        break;
                    case DOUBLE:
                        val = (T)Double.valueOf(Double.parseDouble(l.get(j)));
                        break;
                    case <...>
                    default:
                        throw new IllegalArgumentException();
                }
                o.set(i,j, val);
            }
            i++;
            l = csv_reader.read();
        }
        csv_reader.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return o;
}
public static<T> void toCSV(Matrix<T> m, CsvListWriter csv_writer, SerializableType t)
{
    try {
        for(int i=0;i<m.rowCount();i++)
        {
            ArrayList<String> l = new ArrayList<String>();
            for(int j=0;j<m.columnCount();j++)
            {
                if(m.get(i,j)==null)
                {
                    l.add(null);
                }
                else{
                    switch(t)
                    {
                        case INTEGER:
                            l.add(Integer.toString((Integer)m.get(i,j)));
                            break;
                        case DOUBLE:
                            l.add(Double.toString((Double)m.get(i,j)));
                            break;
                        case <...>
                        default:
                            throw new IllegalArgumentException();
                    }
                }
            }
            csv_writer.write(l);
        }
        csv_writer.flush();
        csv_writer.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
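A possible alternative to the enum switch in fromCSV and toCSV above, assuming Java 8 or later: let the caller pass the String-to-T conversion as a function. This is only a sketch. The class name CsvCells is hypothetical, it returns nested lists because the Matrix class is not shown, and it splits on commas naively instead of using Super-CSV, so quoted fields are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class CsvCells {
    // Parses naive comma-separated text (no quoting or escaping) into rows of T.
    // The caller supplies the String -> T conversion, so no enum switch or unchecked cast is needed.
    static <T> List<List<T>> parse(String csv, Function<String, T> cellParser) {
        List<List<T>> rows = new ArrayList<>();
        for (String line : csv.split("\\R")) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            List<T> row = new ArrayList<>();
            // limit -1 keeps trailing empty cells
            for (String cell : line.split(",", -1)) {
                String trimmed = cell.trim();
                // Empty cells become null, mirroring the null handling in toCSV
                row.add(trimmed.isEmpty() ? null : cellParser.apply(trimmed));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<List<Integer>> ints = parse("1,2,3\n4,5,6", Integer::valueOf);
        List<List<Double>> doubles = parse("1.5,,2.5", Double::valueOf);
        System.out.println(ints);    // [[1, 2, 3], [4, 5, 6]]
        System.out.println(doubles); // [[1.5, null, 2.5]]
    }
}
```

The same idea would apply to writing: a Function<T, String> (or simply String.valueOf) can replace the second switch.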
Upvotes: 2
Views: 1205
Reputation: 149037
XML Schema allows you to define a list type whose items are separated by whitespace.
<xs:list itemType="xs:int"/>
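To see what this list type accepts, here is a minimal sketch that validates a single element against an inline schema using the JDK's javax.xml.validation API. The class name ListTypeCheck and the one-element schema are assumptions made for illustration, not part of the model below.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class ListTypeCheck {
    // Hypothetical schema: a <row> element whose content is a whitespace-separated list of xs:int.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='row'>"
      + "<xs:simpleType><xs:list itemType='xs:int'/></xs:simpleType>"
      + "</xs:element>"
      + "</xs:schema>";

    // Returns true when the document validates against XSD, false on a validation error.
    static boolean isValid(String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // With no ErrorHandler set, validation errors surface as SAXException
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isValid("<row>1 2 3 4</row>")); // true
        System.out.println(isValid("<row>1 two 3</row>")); // false
    }
}
```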
Below is a full example of how you could leverage this in JAXB to represent a matrix.
Java Model (Root)
We will use a two-dimensional int array to represent our matrix. We will use an XmlAdapter to get a non-default array representation (see: JAXB & java.util.Map).
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.adapters.XmlJavaTypeAdapter;

@XmlRootElement
public class Root {

    private int[][] matrix;

    @XmlJavaTypeAdapter(MatrixAdapter.class)
    public int[][] getMatrix() {
        return matrix;
    }

    public void setMatrix(int[][] matrix) {
        this.matrix = matrix;
    }

}
XmlAdapter (MatrixAdapter)
When you annotate int[] with @XmlValue the XML representation will be space-separated text.
import java.util.*;
import javax.xml.bind.annotation.*;
import javax.xml.bind.annotation.adapters.XmlAdapter;

public class MatrixAdapter extends XmlAdapter<MatrixAdapter.AdaptedMatrix, int[][]>{

    public static class AdaptedMatrix {
        @XmlElement(name="row")
        public List<AdaptedRow> rows;
    }

    public static class AdaptedRow {
        @XmlValue
        public int[] row;
    }

    @Override
    public AdaptedMatrix marshal(int[][] matrix) throws Exception {
        AdaptedMatrix adaptedMatrix = new AdaptedMatrix();
        adaptedMatrix.rows = new ArrayList<AdaptedRow>(matrix.length);
        for(int[] row : matrix) {
            AdaptedRow adaptedRow = new AdaptedRow();
            adaptedRow.row = row;
            adaptedMatrix.rows.add(adaptedRow);
        }
        return adaptedMatrix;
    }

    @Override
    public int[][] unmarshal(AdaptedMatrix adaptedMatrix) throws Exception {
        List<AdaptedRow> adaptedRows = adaptedMatrix.rows;
        int[][] matrix = new int[adaptedRows.size()][];
        for(int x=0; x<adaptedRows.size(); x++) {
            matrix[x] = adaptedRows.get(x).row;
        }
        return matrix;
    }

}
Demo Code
Below is some demo code you can run to prove that everything works:
import java.io.File;
import javax.xml.bind.*;

public class Demo {

    public static void main(String[] args) throws Exception {
        JAXBContext jc = JAXBContext.newInstance(Root.class);

        Unmarshaller unmarshaller = jc.createUnmarshaller();
        File xml = new File("src/forum17119708/input.xml");
        Root root = (Root) unmarshaller.unmarshal(xml);

        Marshaller marshaller = jc.createMarshaller();
        marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
        marshaller.marshal(root, System.out);
    }

}
input.xml/Output
<?xml version="1.0" encoding="UTF-8"?>
<root>
    <matrix>
        <row>1 2 3 4</row>
        <row>5 6 7 8</row>
    </matrix>
</root>
input.xml/Output
<?xml version="1.0" encoding="UTF-8"?>
<root>
    <matrix>
        <row>1 2 3</row>
        <row>4 5 6</row>
        <row>7 8 9</row>
    </matrix>
</root>
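A consumer that does not have JAXB available (javax.xml.bind has not been bundled with the JDK since Java 11) can still read this row format with the DOM parser that ships with the JDK. The sketch below is not part of the JAXB solution; the class name RowReader is hypothetical, and it assumes every row element contains only whitespace-separated integers.

```java
import java.io.StringReader;
import java.util.Arrays;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RowReader {
    // Reads every <row> element and splits its text on whitespace, as the list format implies.
    static int[][] readMatrix(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList rows = doc.getElementsByTagName("row");
        int[][] matrix = new int[rows.getLength()][];
        for (int i = 0; i < rows.getLength(); i++) {
            String text = rows.item(i).getTextContent().trim();
            // An empty row element maps to an empty int array
            String[] cells = text.isEmpty() ? new String[0] : text.split("\\s+");
            matrix[i] = new int[cells.length];
            for (int j = 0; j < cells.length; j++) {
                matrix[i][j] = Integer.parseInt(cells[j]);
            }
        }
        return matrix;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<root><matrix><row>1 2 3 4</row><row>5 6 7 8</row></matrix></root>";
        System.out.println(Arrays.deepToString(readMatrix(xml))); // [[1, 2, 3, 4], [5, 6, 7, 8]]
    }
}
```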
Upvotes: 2
Reputation: 13374
XML is a good format for structuring some kinds of information but a pain for others, such as matrices. Beyond the technical limitations of the XML libraries, you don't want to clutter a clean tabular representation with angle brackets everywhere, and you often want quick parsing based on split.
In this case you should avoid the "if all you have is a hammer, everything looks like a nail" syndrome: you need another representation that handles tabular data naturally, namely CSV.
So your idea of combining the strengths of both formats is the right one: XML for data that needs structuring, CSV for tabular data.
As for the MIME type, if only your application will be dealing with the file you don't really need to specify it, but adding one doesn't cost much. I don't know of any standard attribute for this, except maybe something like xsi:type="CSV".
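To make the idea concrete, here is a minimal round-trip sketch using only the JDK's DOM and transformer APIs. The class name CsvInXml, the element name "matrix" and the attribute format="text/csv" are all assumptions chosen for illustration; no standard prescribes them. The CSV handling is deliberately naive (integers only, no quoting).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class CsvInXml {
    // Serializes the matrix as CSV text inside a single element.
    // Element and attribute names are illustrative, not a standard.
    static String toXml(int[][] matrix) throws Exception {
        StringBuilder csv = new StringBuilder();
        for (int[] row : matrix) {
            for (int j = 0; j < row.length; j++) {
                if (j > 0) csv.append(',');
                csv.append(row[j]);
            }
            csv.append('\n');
        }
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element element = doc.createElement("matrix");
        element.setAttribute("format", "text/csv");
        // For non-numeric cells, the serializer would escape & and < in the text
        element.setTextContent(csv.toString());
        doc.appendChild(element);
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Parses the element text back into a matrix with plain split calls.
    static int[][] fromXml(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        String text = doc.getDocumentElement().getTextContent().trim();
        if (text.isEmpty()) {
            return new int[0][];
        }
        String[] lines = text.split("\\R");
        int[][] matrix = new int[lines.length][];
        for (int i = 0; i < lines.length; i++) {
            String[] cells = lines[i].split(",");
            matrix[i] = new int[cells.length];
            for (int j = 0; j < cells.length; j++) {
                matrix[i][j] = Integer.parseInt(cells[j].trim());
            }
        }
        return matrix;
    }

    public static void main(String[] args) throws Exception {
        String xml = toXml(new int[][] {{1, 2, 3}, {4, 5, 6}});
        System.out.println(xml);
        System.out.println(java.util.Arrays.deepToString(fromXml(xml)));
    }
}
```

One trade-off this makes visible: XML tooling such as XPath or schema validation sees the whole matrix as a single string, so individual cells can only be reached after the CSV has been parsed.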
PS: I've written about the aforementioned syndrome in a different context: http://pragmateek.com/if-all-you-have-is-a-hammer/ :)
Upvotes: 1