Reputation: 5003
I have a list of objects that I would like to store in a file as small as possible for later retrieval. I have been carefully reading this tutorial, and am beginning (I think) to understand, but have several questions. Here is the snippet I am working with:
static bool writeHistory(string fileName)
{
fstream historyFile;
historyFile.open(fileName.c_str(), ios::binary);
if (historyFile.good())
{
list<Referral>::iterator i;
for(i = AllReferrals.begin();
i != AllReferrals.end();
i++)
{
historyFile.write((char*)&(*i),sizeof(Referral));
}
return true;
} else return false;
}
Now, this is adapted from the snippet
file.write((char*)&object,sizeof(className));
taken from the tutorial. Now what I believe it is doing is converting the object to a pointer, taking the value and size and writing that to the file. But if it is doing this, why bother doing the conversions at all? Why not take the value from the beginning? And why does it need the size? Furthermore, from my understanding then, why does
historyFile.write((char*)i,sizeof(Referral));
not compile? i is an iterator (and isn't an iterator a pointer?). or simply
historyFile.write(i,sizeof(Referral));
Why do i need to be messing around with addresses anyway? Aren't I storing the data in the file? If the addresses/values are persisting on their own, why can't i just store the addresses deliminated in plain text and than take their values later?
And should I still be using the .txt extension? < edit> what should I use instead then? I tried .dtb and was not able to create the file. < /edit> I actually can't even seem to get file to open without errors with the ios::binary flag. I'm also having trouble passing the filename (as a string class string, converted back by c_str(), it compiles but gives an error).
Sorry for so many little questions, but it all basically sums up to how to efficiently store objects in a file?
Upvotes: 3
Views: 8935
Reputation: 8171
isn't an iterator a pointer?
An iterator is something that acts like a pointer from the outside. In most (perhaps all) cases, it is actually some form of object instead of a bare pointer. An iterator might contain a pointer as an internal member variable that it uses to perform its job, but it just as well might contain something else or additional variables if necessary.
Furthermore, even if an iterator has a simple pointer inside of it, it might not point directly at the object you're interested in. It might point to some kind of bookkeeping component used by the container class which it can then use to get the actual object of interest. Fortunately, we don't need to care what those internal details actually are.
So with that in mind, here's what's going on in (char*)&(*i)
.
*i
returns a reference to the object stored in the list.&
takes the address of that object, thus yielding a pointer to the object.(char*)
casts that object pointer into a char pointer.That snippet of code would be the short form of doing something like this:
Referral& r = *i;
Referral* pr = &r;
char* pc = (char*)pr;
Why do i need to be messing around with addresses anyway?
And why does it need the size?
fstream::write
is designed to write a series of bytes to a file. It doesn't know anything about what those bytes mean. You give it an address so that it can write the bytes that exist starting wherever that address points to. You give it a size so that it knows how many bytes to write.
So if I do:
MyClass ExampleObject;
file.write((char*)ExampleObject, sizeof(ExampleObject));
Then it writes all the bytes that exist directly within ExampleObject
to the file.
Note: As others have mentioned, if the object you want to write has members that dynamically allocate memory or otherwise make use of pointers, then the pointed to memory will not be written by a single simple fstream::write
call.
will serialization give a significant boost in storage efficiency?
In theory, binary data can often be both smaller than plain-text and faster to read and write. In practice, unless you're dealing with very large amounts of data, you'll probably never notice the difference. Hard drives are large and processors are fast these days.
And efficiency isn't the only thing to consider:
Of course, a good serialization library can help out with some of these issues. And so could a good plain-text format library (for instance, a library for XML). If you're still learning, then I'd suggest trying out both ways to get a feel for how they work and what might do best for your purposes.
Upvotes: 2
Reputation: 19928
Did you have already a look at boost::serialization, it is robust, has a good documentation, supports versioning and if you want to switch to an XML format instead of a binary one, it'll be easier.
Upvotes: 1
Reputation: 17007
What you are trying to do is called serialization. Boost has a very good library for doing this.
What you are trying to do can work, in some cases, with some very important conditions. It will only work for POD types. It is only guaranteed to work for code compiled with the same version of the compiler, and with the same arguments.
(char*)&(*i)
says to take the iterator i, dereference it to get your object, take the address of it and treat it as an array of characters. This is the start of what is being written to the file. sizeof(Referral)
is the number of bytes that will be written out.
An no, an iterator is not necessarily a pointer, although pointers meet all the requirements for an iterator.
Upvotes: 7
Reputation: 8972
Question #1 why does ... not compile? Answer: Because i is not a Referral* -- it's a list::iterator ;; an iterator is an abstraction over a pointer, but it's not a pointer.
Question #2 should I still be using the .txt extension? Answer: probably not. .txt is associated by many systems to the MIME type text/plain.
Unasked Question: does this work? Answer: if a Referral has any pointers on it, NO. When you try to read the Referrals from the file, the pointers will be pointing to the location on memory where something used to live, but there is no guarantee that there is anything valid there anymore, least of all the thing that the pointers were pointing to originally. Be careful.
Upvotes: 2
Reputation: 1635
What you are trying to do (reading and writing raw memory to/from file) will invoke undefined behaviour, will break for anything that isn't a plain-old-data type, and the files that are generated will be platform dependent, compiler dependent and probably even dependent on compiler settings.
C++ doesn't have any built-in way of serializing complex data. However, there are libraries that you might find useful. For example:
http://www.boost.org/doc/libs/1_40_0/libs/serialization/doc/index.html
Upvotes: 1
Reputation: 211
Fstream.write simply writes raw data to a file. The first parameter is a pointer to the starting address of the data. The second parameter is the length (in bytes) of the object, so write knows how many bytes to write.
file.write((char*)&object,sizeof(className));
^ This line is converting the address of object to a char pointer.
historyFile.write((char*)i,sizeof(Referral));
^ This line is trying to convert an object (i) into a char pointer (not valid)
historyFile.write(i,sizeof(Referral));
^ This line is passing write an object, when it expects a char pointer.
Upvotes: 0