Reputation: 222
I am looking at the output of the protoc --decode command and I cannot work out which encoding it uses when it encounters a bytes field:
data {
image: "\377\330\377\340\000\020JFIF\000\001[…]\242\2634G\377\331"
}
The […] was added by me to shorten the output.
What encoding is this?
Edit
So based on Bruce's answer I wrote my own utility in order to generate sample data from a shell script :
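package protobuf;
import com.google.protobuf.ByteString;
import com.google.protobuf.TextFormat;
import java.io.*;
public class BinarySerializer {
  public static void main(String[] parameters) throws IOException {
    File binaryInput = new File(parameters[0]);
    System.out.println("\""+TextFormat.escapeBytes(ByteString.readFrom(new FileInputStream(binaryInput)))+"\"");
  }
}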
That way I can serialize my binaries and insert them into the text serialization of a protobuf before calling protoc --encode on it:
IMAGE=$(mktemp)
OUTPUT=$(mktemp)
BIN_INSTANCE=$(mktemp)
echo -n 'capture: ' > $IMAGE
java -cp "$HOME/.m2/repository/com/google/protobuf/protobuf-java/3.0.0/protobuf-java-3.0.0.jar:target/protobuf-generator-1.0.0-SNAPSHOT.jar" protobuf.BinarySerializer image.jpg >> $IMAGE
sed -e 's/{UUID}/'$(uuidgen)'/' template.protobuf > $OUTPUT
sed -i '/{IMAGE}/ {
r '$IMAGE'
d
}' $OUTPUT
cat $OUTPUT | protoc --encode=prototypesEvent.proto> $BIN_INSTANCE
with template.protobuf being :
uuid: "{UUID}"
image {
capture: "{IMAGE}"
}
Upvotes: 3
Views: 2330
Reputation: 10543
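I presume it is the same as the escaping produced by the Java library.
Basically, printable ASCII bytes are written as-is, and every other byte is written as a backslash followed by three octal digits (a few control characters get short escapes such as \n and \t).
So in the above, \377 is 1 byte: 377 octal, or 255 in decimal.
\377\330\377\340 = 255 216 255 224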
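You should be able to copy the string into a Java/C program as a string literal and convert it to bytes. In Java each octal escape becomes a char in the range 0 to 255, so convert with the ISO-8859-1 charset to get one byte per char.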
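If the escaped text arrives at run time (for example, read from the protoc --decode output) rather than being pasted as a literal, it has to be decoded by hand. The following is a minimal sketch of such a decoder, assuming only the escapes listed in this answer (three-digit octal plus the short C-style escapes); the class and method names are made up for illustration. I believe protobuf-java also offers TextFormat.unescapeBytes for the same purpose, which would be the better choice if the library is already on the classpath.

```java
import java.io.ByteArrayOutputStream;

public class OctalEscapeDecoder {

    /** Decodes escaped text such as \377\330JFIF into raw bytes. */
    static byte[] unescape(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i++);
            if (c != '\\') {
                out.write(c); // printable ASCII is stored as-is
                continue;
            }
            if (i >= text.length()) {
                throw new IllegalArgumentException("trailing backslash");
            }
            char e = text.charAt(i++);
            if (e >= '0' && e <= '7') {
                // One to three octal digits form a single byte value.
                int value = e - '0';
                for (int n = 0; n < 2 && i < text.length()
                        && text.charAt(i) >= '0' && text.charAt(i) <= '7'; n++) {
                    value = value * 8 + (text.charAt(i++) - '0');
                }
                out.write(value); // write(int) keeps the low 8 bits
            } else {
                switch (e) {
                    case 'a':  out.write(0x07); break;
                    case 'b':  out.write('\b'); break;
                    case 'f':  out.write('\f'); break;
                    case 'n':  out.write('\n'); break;
                    case 'r':  out.write('\r'); break;
                    case 't':  out.write('\t'); break;
                    case 'v':  out.write(0x0b); break;
                    case '\\': out.write('\\'); break;
                    case '\'': out.write('\''); break;
                    case '"':  out.write('"');  break;
                    default:
                        throw new IllegalArgumentException("unknown escape \\" + e);
                }
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // The start of the string from the question (backslashes doubled for Java).
        byte[] bytes = unescape("\\377\\330\\377\\340\\000\\020JFIF");
        for (byte b : bytes) {
            System.out.print((b & 0xff) + " "); // print as unsigned values
        }
        System.out.println();
    }
}
```

Run on the start of the string from the question, the first four values printed are 255 216 255 224, which is the usual JPEG/JFIF header.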
The Java code looks to be:
static String escapeBytes(final ByteSequence input) {
  final StringBuilder builder = new StringBuilder(input.size());
  for (int i = 0; i < input.size(); i++) {
    final byte b = input.byteAt(i);
    switch (b) {
      // Java does not recognize \a or \v, apparently.
      case 0x07: builder.append("\\a"); break;
      case '\b': builder.append("\\b"); break;
      case '\f': builder.append("\\f"); break;
      case '\n': builder.append("\\n"); break;
      case '\r': builder.append("\\r"); break;
      case '\t': builder.append("\\t"); break;
      case 0x0b: builder.append("\\v"); break;
      case '\\': builder.append("\\\\"); break;
      case '\'': builder.append("\\\'"); break;
      case '"' : builder.append("\\\""); break;
      default:
        // Only ASCII characters between 0x20 (space) and 0x7e (tilde) are
        // printable. Other byte values must be escaped.
        if (b >= 0x20 && b <= 0x7e) {
          builder.append((char) b);
        } else {
          builder.append('\\');
          builder.append((char) ('0' + ((b >>> 6) & 3)));
          builder.append((char) ('0' + ((b >>> 3) & 7)));
          builder.append((char) ('0' + (b & 7)));
        }
        break;
    }
  }
  return builder.toString();
}
taken from com.google.protobuf.TextFormatEscaper
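To try the escaping without the protobuf jar, the same logic can be adapted to a plain byte[]. This is a sketch of mine, not the library's code: the class name OctalEscapeDemo is hypothetical, and the ByteSequence parameter is replaced by an array.

```java
public class OctalEscapeDemo {

    /** Escapes raw bytes the same way as the library method quoted above. */
    static String escape(byte[] input) {
        StringBuilder builder = new StringBuilder(input.length);
        for (byte b : input) {
            switch (b) {
                case 0x07: builder.append("\\a"); break;
                case '\b': builder.append("\\b"); break;
                case '\f': builder.append("\\f"); break;
                case '\n': builder.append("\\n"); break;
                case '\r': builder.append("\\r"); break;
                case '\t': builder.append("\\t"); break;
                case 0x0b: builder.append("\\v"); break;
                case '\\': builder.append("\\\\"); break;
                case '\'': builder.append("\\'"); break;
                case '"':  builder.append("\\\""); break;
                default:
                    if (b >= 0x20 && b <= 0x7e) {
                        builder.append((char) b); // printable ASCII
                    } else {
                        // Java bytes are signed, so 0xFF is -1 here. The masks
                        // keep only the wanted bits, which yields three octal
                        // digits: 2 bits, then 3 bits, then 3 bits.
                        builder.append('\\');
                        builder.append((char) ('0' + ((b >>> 6) & 3)));
                        builder.append((char) ('0' + ((b >>> 3) & 7)));
                        builder.append((char) ('0' + (b & 7)));
                    }
                    break;
            }
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        // First ten bytes of a typical JPEG/JFIF file.
        byte[] header = {
            (byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0,
            0x00, 0x10, 'J', 'F', 'I', 'F'
        };
        System.out.println(escape(header)); // prints \377\330\377\340\000\020JFIF
    }
}
```

The printed text matches the beginning of the image value shown in the question, which supports the reading that protoc --decode uses this octal escaping for bytes fields.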
Upvotes: 2