Reputation: 14787
I have a client/server application where data is exchanged in XML format. The data comes to around 50MB, most of which consists of the XML tags themselves. Is there a way to take the generated XML and index the node names as follows:
<User><Assessments><Assessment ID="1" Name="some name" /></Assessments></User>
to:
<A><B><C ID="1" Name="some name" /></B></A>
This would save an incredible amount of bloat.
EDIT
This data is serialized from Entity Framework objects. The reason for choosing XML as the protocol was intrinsic support in .NET and smart code generation of FromXml and ToXml for entities to circumvent circular references.
Upvotes: 0
Views: 3828
Reputation: 14787
I ended up writing a small class that renames the nodes and creates a mapping element so the process can be reversed as well. That alone took the file size down from 50MB to 10MB.
Compressing the file would be the next step, but I wonder how much space I could save using binary serialization. I have not tried that before.
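The class itself is not shown in this answer, so the following is only a minimal Python sketch of the same idea, not the original .NET code. The function names `shrink` and `expand` are made up for illustration, the mapping is returned as a plain dictionary rather than stored as a mapping element inside the document, and XML namespaces are ignored.

```python
import xml.etree.ElementTree as ET
from itertools import count


def short_names():
    """Yield A, B, ..., Z, AA, AB, ... as replacement tag names."""
    for n in count(1):
        name = ""
        while n:
            n, rem = divmod(n - 1, 26)
            name = chr(ord("A") + rem) + name
        yield name


def shrink(xml_text):
    """Rename every element to a short name; return the new XML and the mapping."""
    root = ET.fromstring(xml_text)
    names = short_names()
    mapping = {}  # original tag -> short tag
    for el in root.iter():
        if el.tag not in mapping:
            mapping[el.tag] = next(names)
        el.tag = mapping[el.tag]
    return ET.tostring(root, encoding="unicode"), mapping


def expand(xml_text, mapping):
    """Reverse shrink() using the mapping it returned."""
    reverse = {short: original for original, short in mapping.items()}
    root = ET.fromstring(xml_text)
    for el in root.iter():
        el.tag = reverse[el.tag]
    return ET.tostring(root, encoding="unicode")


original = '<User><Assessments><Assessment ID="1" Name="some name" /></Assessments></User>'
small, mapping = shrink(original)
restored = expand(small, mapping)
```

Both sides need the same mapping to reverse the renaming, so it has to travel with the payload or be agreed on in advance.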
Upvotes: 0
Reputation: 13150
Alternatively, you could consider JSON instead of XML, which would take up less space than the equivalent XML.
Upvotes: 0
Reputation: 9752
The point of XML is that you shouldn't need to compress or minimise the data. If you need to minimise what's going down the wire, then there's a good chance you're using the wrong protocol.
Obviously you can pass this through a gzip stream, which will get you a massive advantage, but if you want to squeeze even more out of it than that then it may be worth looking at JSON or even a binary format.
XML was designed to be readable by humans, and by removing the readability you're essentially removing one of the major reasons to use XML in the first place.
Upvotes: 1
Reputation: 137128
You could look at using Attributes for your data rather than Elements. For example, if you have "gender" as an attribute you will get:
<person gender="female">
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>
whereas if it is an Element you will get:
<person>
  <gender>female</gender>
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>
It's not strictly correct but will achieve what you are after.
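To get a feel for the difference, here is a rough Python sketch that serializes the same record both ways and compares the lengths. The helper names `as_elements` and `as_attributes` are hypothetical, and moving every field into an attribute (not just gender) is an assumption made for the comparison.

```python
import xml.etree.ElementTree as ET


def as_elements(record):
    """Serialize each field as a child element of <person>."""
    person = ET.Element("person")
    for key, value in record.items():
        ET.SubElement(person, key).text = value
    return ET.tostring(person, encoding="unicode")


def as_attributes(record):
    """Serialize each field as an attribute on <person>."""
    return ET.tostring(ET.Element("person", record), encoding="unicode")


record = {"gender": "female", "firstname": "Anna", "lastname": "Smith"}
element_form = as_elements(record)
attribute_form = as_attributes(record)
print(len(element_form), len(attribute_form))
```

The saving comes from each field name being written once instead of twice (no closing tag), so it grows with the number of fields and the length of their names.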
Upvotes: 1
Reputation: 31192
What about just compressing/decompressing your data stream between the client and the server? This will be easier to implement and much less error-prone than applying a custom transformation to the XML data.
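As a rough illustration, here is a minimal Python sketch using the standard `gzip` module (in .NET the equivalent would be wrapping the stream in a `GZipStream`). The helper names `pack` and `unpack` and the synthetic sample document are assumptions for the example, and the real ratio depends on the actual data.

```python
import gzip


def pack(xml_text: str) -> bytes:
    """Compress the XML before sending it over the wire."""
    return gzip.compress(xml_text.encode("utf-8"))


def unpack(payload: bytes) -> str:
    """Decompress a received payload back into XML text."""
    return gzip.decompress(payload).decode("utf-8")


# Synthetic, highly repetitive sample that mimics tag-heavy XML.
sample = (
    "<User><Assessments>"
    + "".join(f'<Assessment ID="{i}" Name="some name" />' for i in range(10000))
    + "</Assessments></User>"
)
packed = pack(sample)
print(len(sample.encode("utf-8")), len(packed))
```

Repeated tag names are exactly the kind of redundancy gzip removes, so the XML stays readable at both ends while the bytes on the wire shrink.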
Upvotes: 4