Deukalion

Reputation: 2665

Alternative ways to load / save data - without serialization?

Ok. I know how to use Serialization and such, but since that only applies to Objects that's been marked with Serialization attribute - how can I for example load data and use it in an application without using Serialization? Say a data file.

Or, create a data container with serialization that holds files that are not themselves serialized.

The methods I've used are Binary Serialization and XML Serialization. Are there any other ways to load unknown data and somehow use it in C#?

Upvotes: 2

Views: 4454

Answers (3)

KeithS

Reputation: 71573

Maybe a definition of terms is in order; serialization is "the process of converting a data structure or object state into a format that can be stored and "resurrected" later in the same or another computer environment". Pretty much any method of converting "volatile" memory into persistent data and back is "serialization", so even if you roll your own scheme to do it, you're "serializing".

That said, it sounds like you simply don't want to use .NET binary serialization. That's actually the right idea; binary serialization is simple, but very code- and environment-dependent. Moving a serializable class to a different namespace, or serializing an object using the Microsoft CLR and then trying to deserialize it in Mono, can break binary serialization.

First and foremost, you MUST be able to determine what type of object you should try to create based on the file. You simply cannot open some "random" file and expect to be able to get anything meaningful out of it without knowing how the data is structured within the file. The easiest way is for the file to tell you, by specifying the type name of the object it was created from (which you will hopefully have available in your codebase). Most built-in serializers do it this way. Other ways the file can inform consumers of its format include file, row and/or field header codes (very common in older standards as they economize on file size) and extension/MIME type.
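As a rough illustration of letting the file announce its own type, here is a minimal sketch in which the first line of a text file names the kind of record it holds. The `#type=` header convention and the `FormatSniffer` class are invented for this example; real formats use their own header codes.

```csharp
using System;
using System.IO;

public static class FormatSniffer
{
    // Hypothetical convention: the first line reads "#type=<name>".
    public static string ReadTypeTag(TextReader reader)
    {
        const string prefix = "#type=";
        string header = reader.ReadLine();

        // Without a recognizable header there is no way to know the layout.
        if (header == null || !header.StartsWith(prefix, StringComparison.Ordinal))
            throw new InvalidDataException("Missing type header.");

        return header.Substring(prefix.Length).Trim();
    }
}
```

The returned tag can then be used to pick the parser or class that knows how to read the rest of the file.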

With that sorted out, deserialization can take place. If the file was serialized using a built-in serializer, simply use that, but if it's an older format (CSV, fixed-length) then you will have to parse the file, line by line, into objects representing lines, collected within a main object representing the file.
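Parsing an older format by hand might look something like the sketch below, which assumes a simple two-column CSV of name and age. `PersonLine` and `PersonFile` are hypothetical names, and the naive `Split` does not handle quoted fields or embedded commas.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

// Hypothetical record type: one object per line of the file.
public sealed class PersonLine
{
    public string Name { get; set; }
    public int Age { get; set; }
}

// Hypothetical container: one object representing the whole file.
public sealed class PersonFile
{
    public List<PersonLine> Lines { get; } = new List<PersonLine>();

    public static PersonFile Parse(TextReader reader)
    {
        var file = new PersonFile();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Trim().Length == 0) continue; // skip blank lines

            // Naive split: assumes no quoted fields or embedded commas.
            string[] fields = line.Split(',');
            if (fields.Length != 2)
                throw new FormatException("Expected 2 fields, got " + fields.Length);

            file.Lines.Add(new PersonLine
            {
                Name = fields[0].Trim(),
                Age = int.Parse(fields[1].Trim(), CultureInfo.InvariantCulture)
            });
        }
        return file;
    }
}
```

For a file on disk, something like `PersonFile.Parse(File.OpenText(path))` would do; a `StringReader` works the same way for in-memory text.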

Have a look at the ETL (Extract-Transform-Load) process pattern. This is a modular, scalable architecture pattern for taking files and turning them into data the program can work with:

  • Extract - This part of the system is pointed at the filesystem, or other incoming "pipe" for raw data, and its job is to open the file, extract the data into a very basic object format that can be further manipulated, and put those objects into an in-memory "queue" for the Transform step. The goal is to get data from the pipe as fast and efficiently as possible, but you are required at this point to have some knowledge of the data you are working with so that you can effectively encapsulate it for further processing; actually turning the data into the format you really want happens later.
  • Transform - This part of the system takes the extracted data, and performs the logic that will put that data into a hydrated object from your codebase. This is where, given information from the Extract step about the type of file the data was extracted from, you instantiate a domain object that represents the data model, slice the raw data up into the chunks that will be stored as data members, perform any type conversions (data you get from a file is usually either in string format or in raw bits and must be marshalled or otherwise converted into data types that better represent the concept of the data), and validate that the internal structure of the new object is consistent and meets known business rules. Hydrated, valid objects are placed in an output queue to be processed by the Load step.
  • Load - This step takes the hydrated, valid business objects from the Transform step and persists them into the data store that is used by your system (such as a SQL database or the program's native flat file format).
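The three steps above could be sketched roughly as follows. This is only an illustration under assumptions: the semicolon-separated input, the `Reading` class and the `EtlSketch` method names are invented, lazy `IEnumerable` sequences stand in for the in-memory queues, and a plain collection stands in for the data store.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

// Hypothetical domain object produced by the Transform step.
public sealed class Reading
{
    public string Sensor { get; set; }
    public double Value { get; set; }
}

public static class EtlSketch
{
    // Extract: pull raw fields out of the source with minimal interpretation.
    public static IEnumerable<string[]> Extract(TextReader source)
    {
        string line;
        while ((line = source.ReadLine()) != null)
        {
            if (line.Length > 0) yield return line.Split(';');
        }
    }

    // Transform: convert raw strings into typed, validated objects.
    public static IEnumerable<Reading> Transform(IEnumerable<string[]> rows)
    {
        foreach (string[] row in rows)
        {
            if (row.Length != 2) continue; // drop malformed rows; real code might log them

            double value;
            if (!double.TryParse(row[1], NumberStyles.Float, CultureInfo.InvariantCulture, out value))
                continue;

            yield return new Reading { Sensor = row[0].Trim(), Value = value };
        }
    }

    // Load: persist the objects; the collection stands in for a database.
    public static void Load(IEnumerable<Reading> readings, ICollection<Reading> store)
    {
        foreach (Reading r in readings) store.Add(r);
    }
}
```

Chaining the steps is then a matter of `EtlSketch.Load(EtlSketch.Transform(EtlSketch.Extract(reader)), store)`. Because each stage only sees the previous stage's output, any one of them can be swapped out without touching the others.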

Upvotes: 3

Asti

Reputation: 12687

Well, the old-fashioned way was to use stream access operations and read out the data you wanted. This way you could read/write to pretty much any file. Serialization simply automates this process based on some contract.

Based on your comment, I'm guessing that your requirement is to read any kind of file without having a contract in the first place.

Let's say you have a raw file in which the first byte specifies the length of a string and the following bytes hold the string itself.

For example: 5 | H | e | l | l | o

// Requires: using System.IO; using System.Text;
using (var reader = new BinaryReader(File.OpenRead(filename)))
{
    // First byte: the length of the string.
    int length = reader.ReadByte();

    // Next `length` bytes: the ASCII characters.
    byte[] b = reader.ReadBytes(length);

    var text = Encoding.ASCII.GetString(b);
}

Binary I/O is as raw as it gets. Check MSDN for more.
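For completeness, here is a minimal sketch of both directions for the length-prefixed layout described above, working on any `Stream`. The `LengthPrefixed` class is a made-up name, and the one-byte prefix limits the text to 255 ASCII bytes.

```csharp
using System;
using System.IO;
using System.Text;

public static class LengthPrefixed
{
    // Writes one length byte followed by the ASCII bytes of the text.
    public static void Write(Stream output, string text)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(text);
        if (bytes.Length > byte.MaxValue)
            throw new ArgumentException("Text is too long for a one-byte length prefix.");

        output.WriteByte((byte)bytes.Length);
        output.Write(bytes, 0, bytes.Length);
    }

    // Reads the length byte, then exactly that many bytes.
    public static string Read(Stream input)
    {
        int length = input.ReadByte();
        if (length < 0) throw new EndOfStreamException();

        byte[] buffer = new byte[length];
        int read = 0;
        while (read < length)
        {
            // Stream.Read may return fewer bytes than requested, so keep going.
            int n = input.Read(buffer, read, length - read);
            if (n == 0) throw new EndOfStreamException();
            read += n;
        }
        return Encoding.ASCII.GetString(buffer);
    }
}
```

Writing "Hello" this way produces the six bytes 5, 72, 101, 108, 108, 111, and reading them back yields the original string.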

Upvotes: 0

Matías Fidemraizer

Reputation: 64943

JSON serialization using JSON.NET

This eats everything! Including anonymous types.

Edit

I know you said "you don't want serialization", but based on your statement "[...]Objects that's been marked with Serialization attribute", I believe you didn't try JSON serialization using JSON.NET!
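A minimal sketch of what that looks like, assuming the Newtonsoft.Json (JSON.NET) NuGet package is referenced. The `JsonSketch` class and the sample values are invented for illustration; `JsonConvert.SerializeObject` and `JsonConvert.DeserializeAnonymousType` are the library calls.

```csharp
using Newtonsoft.Json;

public static class JsonSketch
{
    public static string Save()
    {
        // An anonymous type: no class declaration and no [Serializable] attribute.
        var settings = new { Name = "Player1", Level = 3 };
        return JsonConvert.SerializeObject(settings);
    }

    public static int LoadLevel(string json)
    {
        // The anonymous object only serves as a template for the shape of the result.
        var template = new { Name = "", Level = 0 };
        return JsonConvert.DeserializeAnonymousType(json, template).Level;
    }
}
```

Here `Save()` returns `{"Name":"Player1","Level":3}`, and passing that string to `LoadLevel` gives back 3.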

Upvotes: 3
