Reputation: 53
I'm trying to deserialize a message using protobuf in Java, and I'm getting the exception below.
Caused by: com.google.protobuf.InvalidProtocolBufferException: While parsing a protocol message, the input ended unexpectedly in the middle of a field. This could mean either that the input has been truncated or that an embedded message misreported its own length.
    at com.google.protobuf.InvalidProtocolBufferException.truncatedMessage(InvalidProtocolBufferException.java:86)
    at com.google.protobuf.CodedInputStream$ArrayDecoder.readRawLittleEndian64(CodedInputStream.java:1179)
    at com.google.protobuf.CodedInputStream$ArrayDecoder.readFixed64(CodedInputStream.java:791)
    at com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:534)
    at com.google.protobuf.GeneratedMessageV3.parseUnknownFieldProto3(GeneratedMessageV3.java:305)
Upvotes: 1
Views: 3153
Reputation: 1062925
I've manually decoded your string, and I agree with the library: your message is truncated. I'm guessing that this is because you're using string-based APIs, and there is a zero byte in the data. Many text APIs treat a zero byte (NUL in ASCII terms) as the end of the string.
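To illustrate the kind of truncation I mean, here is a minimal sketch. `readAsCString` is a hypothetical stand-in for any NUL-terminated text API (it is not part of protobuf or the JDK), and the payload is the tail of your message with the trailing zero byte that I suspect was lost.

```java
import java.util.Arrays;

final class NulTruncationDemo {
    // Hypothetical stand-in for a C-style text API:
    // it keeps everything before the first zero byte and drops the rest.
    static byte[] readAsCString(byte[] data) {
        for (int i = 0; i < data.length; i++) {
            if (data[i] == 0) {
                return Arrays.copyOf(data, i);
            }
        }
        return data.clone();
    }

    public static void main(String[] args) {
        // Tag 0x1a, length 8, then 8 payload bytes; the last one is an assumed 0x00.
        byte[] payload = {
            0x1a, 0x08,
            (byte) 0xf0, 'l', (byte) 0x8f, (byte) 0xde, 'p', (byte) 0x9f, (byte) 0xe4, 0x00
        };
        byte[] seen = readAsCString(payload);
        // prints "10 bytes sent, 9 bytes survive"
        System.out.println(payload.length + " bytes sent, " + seen.length + " bytes survive");
    }
}
```

The length prefix still says 8, but only 7 payload bytes arrive, which is exactly the shape of failure the parser reports.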
Here's the breakdown:
\n=10=field 1, length prefix - I'm assuming this is a string
\x14=20
"id:article:v1:964000"
(22 bytes used for field 1)
\x12=18=field 2, length prefix - I'm assuming this is a sub-message
$=36
\n=10=field 1, length prefix - I'm assuming this is a string
\x10=16
"predicted_topics"
(18 bytes used for field 2.1)
\x12=18=field 2, length prefix - I'm assuming this is a string
\x06=6
"IS/biz"
(8 bytes used for field 2.2)
\x1a=26=field 3, length prefix - I'm assuming this is "bytes"
\x08=8
\xf0
l
\x8f
\xde
p
\x9f
\xe4
(unexpected EOF)
At the end, we're trying to decode 8 bytes inside the innermost message, and we've only got 7 bytes left. I know this isn't a sub-message because that would result in an invalid tag, and it doesn't look like UTF-8, so I'm assuming that this is a bytes field (but frankly it doesn't matter: we need 8 bytes, and we only have 7).
My guess is that the last byte in the bytes field was zero; if we assume a missing \x00 at the end, then field 2.3 is 10 bytes (tag, length, and 8 bytes of payload), and we've accounted for 18+8+10=36 bytes, which would make the sub-message (field 2) complete. There may well be more missing data after the outer sub-message - I have no way of knowing.
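The byte counting above can be reproduced with a small hand-rolled walker. This is only a sketch under assumptions: `WireWalk` and its methods are names I made up, it uses no protobuf classes, it only looks at the top level of whatever buffer it is given, and the sample bytes are the ones from my breakdown (the appended zero is my guess, not something recovered from your data).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class WireWalk {
    private final byte[] buf;
    private int pos;

    private WireWalk(byte[] buf) { this.buf = buf; }

    // Reads one base-128 varint starting at the current position.
    private long readVarint() {
        long value = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            if (pos >= buf.length) throw new IllegalStateException("truncated varint");
            byte b = buf[pos++];
            value |= (long) (b & 0x7f) << shift;
            if ((b & 0x80) == 0) return value;
        }
        throw new IllegalStateException("varint too long");
    }

    // Returns how many bytes the first incomplete field is short by, or 0 if all fields fit.
    static int missingBytes(byte[] data) {
        WireWalk w = new WireWalk(data);
        while (w.pos < data.length) {
            int wireType = (int) (w.readVarint() & 7);
            long need;
            switch (wireType) {
                case 0: w.readVarint(); continue;      // varint
                case 1: need = 8; break;               // fixed64
                case 2: need = w.readVarint(); break;  // length-delimited
                case 5: need = 4; break;               // fixed32
                default: throw new IllegalStateException("unsupported wire type " + wireType);
            }
            long remaining = data.length - w.pos;
            if (need > remaining) return (int) (need - remaining);
            w.pos += (int) need;
        }
        return 0;
    }

    static byte[] concat(byte[]... parts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] p : parts) out.write(p, 0, p.length);
        return out.toByteArray();
    }

    private static byte[] ascii(String s) { return s.getBytes(StandardCharsets.US_ASCII); }

    // Content of field 2 as received: 18 + 8 + 9 = 35 bytes instead of the declared 36.
    static byte[] truncatedInner() {
        return concat(
            new byte[] {0x0a, 0x10}, ascii("predicted_topics"),
            new byte[] {0x12, 0x06}, ascii("IS/biz"),
            new byte[] {0x1a, 0x08, (byte) 0xf0, 'l', (byte) 0x8f, (byte) 0xde, 'p', (byte) 0x9f, (byte) 0xe4});
    }

    // The whole message as received.
    static byte[] truncatedOuter() {
        return concat(new byte[] {0x0a, 0x14}, ascii("id:article:v1:964000"),
                      new byte[] {0x12, 0x24}, truncatedInner());
    }

    public static void main(String[] args) {
        System.out.println(missingBytes(truncatedInner()));                            // 1
        System.out.println(missingBytes(truncatedOuter()));                            // 1
        System.out.println(missingBytes(concat(truncatedOuter(), new byte[] {0})));    // 0
    }
}
```

Both the inner and the outer view come up one byte short, and appending a single byte makes the lengths line up. The walker only checks lengths, so it cannot tell you whether that byte was really zero.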
So: make sure you're not using text-based APIs with binary data.
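As a rough illustration of why this matters even in Java, where a `String` can hold a NUL, here is a sketch comparing a naive `String` round trip with Base64. The class and method names are hypothetical, and the 8 bytes are the field 2.3 payload with my assumed trailing zero. I don't know how your data is actually transported, so treat Base64 as one possible option for a text-only channel, not as the fix for your specific setup.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

final class BinarySafeTransport {
    // Lossy: bytes that are not valid UTF-8 are replaced during decoding.
    static byte[] viaString(byte[] data) {
        return new String(data, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    // Lossless: Base64 maps arbitrary bytes to plain ASCII text and back.
    static byte[] viaBase64(byte[] data) {
        String text = Base64.getEncoder().encodeToString(data);
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xf0, 'l', (byte) 0x8f, (byte) 0xde, 'p', (byte) 0x9f, (byte) 0xe4, 0x00};
        System.out.println(Arrays.equals(viaString(data), data)); // false
        System.out.println(Arrays.equals(viaBase64(data), data)); // true
    }
}
```

If the bytes never need to pass through text at all, it is simpler to keep them as a `byte[]` end to end and hand that array to the generated class's `parseFrom(byte[])`.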
Upvotes: 2