Reputation: 1358
Consider the following FlatBuffers schema (from this Stack Overflow question):
table Foo {
  ...
}
table Bar {
  value:[Foo];
}
root_type Bar;
Assume the number of Foos in a typical object is significant, so we want to avoid modifying the schema to make Foo the root_type.
Scenario:
A C++ client serializes a proper FlatBuffers object and posts it to another component (a Node.js backend) that partially deserializes the object and stores the binary representing each Foo in a database as a separate document:
const buf = new flatbuffers.ByteBuffer(req.body)
const bar = fbs.Bar.getRootAsBar(buf)
for (let i = 0; i < bar.valueLength(); i++) {
  const foo = bar.value(i)
  let item = {
    'raw': foo.bb.bytes_ // <-- primary suspect
  }
  // ... store `item` as an individual entity (mongodb doc)
}
Later, a third component fetches the binary data stored in the "raw" key of the MongoDB documents and tries to deserialize it into a Foo object:
auto mongoCol = db.collection("results");
auto mongoResult = mongoCol.find_one(
    bsoncxx::builder::stream::document{}
    << "_id" << oid << bsoncxx::builder::stream::finalize);
// ...check that mongoResult is not null
const auto result = mongoResult->view();
const auto& binary = result["raw"].get_binary();
std::string content((const char*)binary.bytes, binary.size);
const auto foo = flatbuffers::GetRoot<fbs::Foo>(content.c_str());
The problem:
The pointer returned as foo does not point to the expected data, and any operation on foo potentially leads to a segfault or access violation.
Suspicions:
I speculate that the root cause is that the binary stored in the database uses offsets relative to the original message, so it is essentially invalid on its own, and the offsets would need to be readjusted before inserting it into the database. But I do not see any FlatBuffers API function to readjust the offsets?
A less likely root cause may be that the final deserialization code is incomplete and we have to readjust the offsets there?
The reason I suspect it is related to offsets is that this same code works just fine if we make a compromise and post smaller FlatBuffers objects with one Foo element in every Bar vector (and change the backend code to store bar.bb.bytes_ in raw instead).
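The offset suspicion can be illustrated with a toy sketch (plain Node.js Buffers, not real FlatBuffers internals): in any format where a record stores an offset pointing at data kept elsewhere in the same buffer, a byte-wise copy of just the record keeps the offset but loses the data it points to, so the copy is meaningless on its own.

```javascript
// Toy buffer layout (illustrative only, not the FlatBuffers wire format):
// bytes 0-4 hold a shared string payload, bytes 5-8 hold a record that
// stores the payload's offset within the full message.
const payloadBytes = Buffer.from('hello');   // shared payload at offset 0
const record = Buffer.alloc(4);
record.writeUInt32LE(0, 0);                  // record points at offset 0
const message = Buffer.concat([payloadBytes, record]);

// Reading inside the complete message works fine:
const off = message.readUInt32LE(5);         // the record starts at byte 5
const payload = message.subarray(off, off + 5).toString(); // 'hello'

// But a byte-wise copy of only the record - which is roughly what the
// backend stores in "raw" - is invalid by itself: the offset survives,
// the data it referred to does not.
const rawRecord = message.subarray(5);       // only 4 bytes, no payload
const broken = rawRecord.subarray(0, 5).toString(); // not 'hello'
```

The same failure mode applies to extracting a sub-table: its offsets only mean something inside the buffer it was built in.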
Question:
Is it even possible to grab part of a larger, properly constructed FlatBuffers binary that you know represents your desired table and deserialize it on its own?
Upvotes: 1
Views: 1327
Reputation: 6074
You can't simply copy a sub-table out of a larger FlatBuffer byte-wise, since this data is not necessarily contiguous. The best workaround is to instead make Bar store a [FooBuffer], where table FooBuffer { buf:[ubyte] (nested_flatbuffer: "Foo"); }. When you construct one of these, you construct each Foo into its own FlatBufferBuilder and then store the resulting bytes in the parent. Then, when you need to store Foos separately, this becomes an easy copy.
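The nested-buffer idea can be sketched with plain Node.js Buffers (an illustration of the principle, not the real FlatBuffers API; the length-prefix framing and JSON stand-in for an inner builder are assumptions for the demo): each child is serialized into its own self-contained blob first, and the parent stores those blobs as opaque byte arrays, so extracting one child is a plain copy.

```javascript
// Pack each child into an independent blob, then concatenate the blobs
// into the parent with a 4-byte length prefix per child.
function packChildren(children) {
  const blobs = children.map(c => {
    const body = Buffer.from(JSON.stringify(c)); // stand-in for an inner builder
    const len = Buffer.alloc(4);
    len.writeUInt32LE(body.length, 0);
    return Buffer.concat([len, body]);
  });
  return Buffer.concat(blobs);
}

// Walk the parent and yield each child's bytes. Because every blob is
// self-contained, this copy is all the backend needs to store.
function* unpackChildren(buf) {
  let pos = 0;
  while (pos < buf.length) {
    const len = buf.readUInt32LE(pos);
    yield buf.subarray(pos + 4, pos + 4 + len);
    pos += 4 + len;
  }
}

const parent = packChildren([{ id: 1 }, { id: 2 }]);
const raws = [...unpackChildren(parent)];
// Each raw blob deserializes on its own, unlike a slice of a shared buffer:
const first = JSON.parse(raws[0].toString()); // { id: 1 }
```

With the real nested_flatbuffer attribute, the blob in buf is itself a complete FlatBuffer (finished by its own FlatBufferBuilder), so the extracted bytes can later be handed directly to GetRoot<Foo>.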
Upvotes: 1