Reputation: 87
I have defined a proto file, for example:
option java_package = "proto.data";
message Data {
repeated string strs = 1;
repeated int32 ints = 2;
}
I receive this object from the network as an InputStream (or as bytes). Normally, I then parse it with Data.parseFrom(stream) or Data.parseFrom(bytes) to get the object.
This means I have to hold the entire Data object in memory, even though I only need to iterate over the string and integer values it contains. That is a problem when the object is large.
What can I do about this?
Upvotes: 2
Views: 4130
Reputation: 419
Hmm. It appears that this may already be implemented but not adequately documented. Have you tested it?
See for discussion: https://groups.google.com/forum/#!topic/protobuf/7vTGDHe0ZyM
See also, sample test code in google's github: https://github.com/google/protobuf/blob/4644f99d1af4250dec95339be6a13e149787ab33/java/src/test/java/com/google/protobuf/lazy_fields_lite.proto
Upvotes: 0
Reputation: 45316
Unfortunately, there is no way to parse just part of a protobuf. If you want to be sure that you've seen all of the strs or all of the ints, you have to parse the entire message, since the values could appear in any order or even interleaved.
If you only care about memory usage and not CPU time then you could, in theory, use a hand-written parser to parse the message and ignore fields that you don't care about. You still have to do the work of parsing every field, but you can discard the values immediately rather than keeping them in memory. However, to do this you'd need to study the Protobuf wire format and write your own parser. You can use Protobuf's CodedInputStream class, but a lot of work still needs to be done manually. The Protobuf library really isn't designed for this.
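To give a rough idea of what such a hand-written parser involves, here is a minimal sketch. It reads the wire format directly with plain java.io instead of CodedInputStream, so that it has no dependencies. The class name StreamingDataReader and its methods are hypothetical, and it assumes the Data message from the question, with ints declared as int32. It hands each value to a callback as soon as it is decoded, accepts both packed and unpacked encodings of ints, and skips any other field.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;
import java.util.function.IntConsumer;

// Hypothetical helper, not part of the Protobuf library.
final class StreamingDataReader {
    private final InputStream in;
    private long pos; // number of bytes consumed so far

    StreamingDataReader(InputStream in) { this.in = in; }

    /** Walks the message once, passing each value to a callback without storing it. */
    void scan(Consumer<String> onStr, IntConsumer onInt) throws IOException {
        int first;
        while ((first = in.read()) >= 0) {         // end of stream = end of message
            pos++;
            long tag = readVarint(first);
            int field = (int) (tag >>> 3);          // upper bits: field number
            int wire = (int) (tag & 7);             // lower 3 bits: wire type
            if (field == 1 && wire == 2) {          // strs: length-delimited UTF-8
                onStr.accept(readString(readVarint(readByte())));
            } else if (field == 2 && wire == 0) {   // ints: one unpacked varint
                onInt.accept((int) readVarint(readByte()));
            } else if (field == 2 && wire == 2) {   // ints: packed run of varints
                long len = readVarint(readByte());
                long end = pos + len;
                while (pos < end) onInt.accept((int) readVarint(readByte()));
            } else {
                skipField(wire);                    // a field we do not care about
            }
        }
    }

    private int readByte() throws IOException {
        int b = in.read();
        if (b < 0) throw new EOFException("truncated message");
        pos++;
        return b;
    }

    /** Decodes a base-128 varint whose first byte has already been read. */
    private long readVarint(int firstByte) throws IOException {
        long result = 0;
        int b = firstByte;
        for (int shift = 0; shift < 64; shift += 7) {
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
            b = readByte();
        }
        throw new IOException("malformed varint");
    }

    private String readString(long len) throws IOException {
        if (len < 0 || len > Integer.MAX_VALUE) throw new IOException("bad length");
        byte[] buf = new byte[(int) len];
        for (int off = 0; off < buf.length; ) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) throw new EOFException("truncated string");
            off += n;
        }
        pos += buf.length;
        return new String(buf, StandardCharsets.UTF_8);
    }

    private void skipField(int wire) throws IOException {
        long n;
        switch (wire) {
            case 0: readVarint(readByte()); return;       // varint
            case 1: n = 8; break;                         // fixed 64-bit
            case 2: n = readVarint(readByte()); break;    // length-delimited
            case 5: n = 4; break;                         // fixed 32-bit
            default: throw new IOException("unsupported wire type " + wire);
        }
        while (n-- > 0) readByte();
    }

    public static void main(String[] args) throws IOException {
        // strs = ["hi"], ints = [42], encoded by hand for the demo.
        byte[] msg = {0x0A, 2, 'h', 'i', 0x10, 42};
        new StreamingDataReader(new ByteArrayInputStream(msg))
            .scan(s -> System.out.println("str: " + s), i -> System.out.println("int: " + i));
    }
}
```

Only one string is held in memory at a time. In practice you would wrap the network stream in a BufferedInputStream, since this sketch reads byte by byte, and it treats end of stream as end of message, so it does not handle length-prefixed (delimited) framing.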
If you are willing to consider using a different protocol framework, Cap'n Proto is extremely similar in design to Protobuf but features the ability to read only the part of the message you care about. Cap'n Proto incurs no overhead for the fields you don't examine, other than obviously the bandwidth and memory to receive the raw message bytes. If you are reading from a file, and you use memory mapping (MappedByteBuffer in Java), then only the parts of the message you actually use will be read from disk.
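As a rough illustration of the memory-mapping part only, the sketch below maps a file read-only using the standard java.nio API and reads a single byte at a given offset. It does not use any Cap'n Proto API. The class MappedRead and its methods are hypothetical names, and the lazy loading of pages is done by the operating system, so the exact behavior depends on the platform.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration.
final class MappedRead {
    /** Maps the whole file read-only; a single mapping is limited to 2 GB. */
    static MappedByteBuffer map(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping remains valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    /** Absolute read: touches only the page containing this offset. */
    static byte byteAt(MappedByteBuffer buf, int offset) {
        return buf.get(offset);
    }
}
```

A zero-copy format can then be read directly out of the mapped buffer, so regions that are never accessed need not be loaded from disk.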
(Disclosure: I am the author of most of Google Protobufs v2 (the version you are probably using) as well as Cap'n Proto.)
Upvotes: 2