alone

Reputation: 179

Decompiling a bin file written with protobuf-net

I have a serialized protobuf bin file, written mainly with protobuf-net. I want to decompile it and see its structure.

I tried some tools, such as https://protogen.marcgravell.com/decode

I also used protoc:

protoc --decode_raw < ~/Downloads/file.bin

This is part of the result I get:

1 {
  1: "4f81b7bb-d8bd-e911-9c1f-06ec640006bb"
  2: 0x404105b1663ef93a
  3: 0x4049c6158c593f36
  4: 0x40400000
  5 {
    1: "53f8afde-04c6-e811-910e-4622e9d1766e"
    2 {
      1: "e993fba0-8fc9-e811-9c15-06ec640006bb"
    }
    2 {
      1: "9a7c7210-3aca-e811-9c15-06ec640006bb"
      2: 1
    }
    2 {
      1: "2d7d12f1-2bc9-e811-9c15-06ec640006bb"
    }
    3: 18446744073709551615
  }
  6: 46
  7: 1571059279000
}

How can I decompile it? I want to understand the structure, change some of the data, and write a new bin file.

Upvotes: 1

Views: 583

Answers (1)

Marc Gravell

Reputation: 1062820

Reverse engineering a .proto file is mostly a case of looking at the output of tools such as the ones you've mentioned, and trying to write a .proto that looks similar. Unfortunately, a number of concepts are ambiguous if you don't know the schema, as multiple different data types and shapes share the same encoding details, but... we can make guesses.
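To make that ambiguity concrete, here is a minimal Python sketch. The names `WIRE_TYPES` and `describe_key` are my own for illustration and not part of any tool. It splits an already-decoded field key into its field number and wire type, and lists the schema types that share each wire type:

```python
# Every protobuf field starts with a varint key: (field_number << 3) | wire_type.
# The wire type only says how the payload is laid out, not which schema type wrote it.
# Wire types 3 and 4 (the deprecated group markers) are left out of this sketch.
WIRE_TYPES = {
    0: ("varint", ["int32", "int64", "uint32", "uint64",
                   "sint32", "sint64", "bool", "enum"]),
    1: ("64-bit", ["fixed64", "sfixed64", "double"]),
    2: ("length-delimited", ["string", "bytes", "embedded message",
                             "packed repeated field"]),
    5: ("32-bit", ["fixed32", "sfixed32", "float"]),
}

def describe_key(key):
    """Return (field number, wire type name, candidate schema types) for a key."""
    field_number = key >> 3
    wire_type = key & 0x07
    name, candidates = WIRE_TYPES[wire_type]
    return field_number, name, candidates

# Keys for fields 1, 2, 4 and 6 with wire types 2, 1, 5 and 0 respectively.
for key in (0x0A, 0x11, 0x25, 0x30):
    print(describe_key(key))
```

All a schema-less decoder can report is the field number and the wire type; picking one candidate from each list is the guesswork.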

Looking at your output:

1 {
...
}

tells us that our root message probably has a sub-message at field 1; so:

message Root {
    repeated Foo Foos = 1;
}

(I'm guessing at the repeated here; if the 1 only appears once, it could be single)

with everything at the next level being our Foo.

  1: "4f81b7bb-d8bd-e911-9c1f-06ec640006bb"
  2: 0x404105b1663ef93a
  3: 0x4049c6158c593f36
  4: 0x40400000
  5 { ... }
  6: 46
  7: 1571059279000

this looks like it could be

message Foo {
  string A = 1;
  sfixed64 B = 2;
  sfixed64 C = 3;
  sfixed32 D = 4;
  repeated Bar E = 5; // again, might not be "repeated" - see how many times it occurs
  int64 F = 6;
  int64 G = 7;
}

However, each sfixed64 could instead be double or fixed64, and the sfixed32 could be fixed32 or float. Likewise, each int64 could be sint64 or uint64 - or int32, sint32, uint32 or bool - and I wouldn't be able to tell (they are all just "varint"). Each option gives a different meaning to the value!
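One way to narrow the guess is to reinterpret the raw bits each way and see which reading looks sensible. This is a sketch using Python's standard `struct` module; the helper names are mine:

```python
import struct

def reinterpret_fixed64(raw):
    """Read the same 8 little-endian bytes as fixed64, sfixed64 and double."""
    data = struct.pack("<Q", raw)
    return {
        "fixed64": raw,
        "sfixed64": struct.unpack("<q", data)[0],
        "double": struct.unpack("<d", data)[0],
    }

def reinterpret_fixed32(raw):
    """Read the same 4 little-endian bytes as fixed32, sfixed32 and float."""
    data = struct.pack("<I", raw)
    return {
        "fixed32": raw,
        "sfixed32": struct.unpack("<i", data)[0],
        "float": struct.unpack("<f", data)[0],
    }

# The three fixed-width values from fields 2, 3 and 4 of the sample output.
print(reinterpret_fixed64(0x404105B1663EF93A))
print(reinterpret_fixed64(0x4049C6158C593F36))
print(reinterpret_fixed32(0x40400000))
```

For this sample the floating-point readings come out as small numbers (the float reading of field 4 is exactly 3.0), while the integer readings are enormous. That hints that B and C may really be double and D may be float, but it is still only a guess.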

our Bar definitely has some kind of repeated, because of all the 2:

    1: "53f8afde-04c6-e811-910e-4622e9d1766e"
    2 { ... }
    2 { ... }
    2 { ... }
    3: 18446744073709551615

let's guess at:

message Bar {
  string A = 1;
  repeated Blap B = 2;
  int64 C = 3;
}
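The value 18446744073709551615 in field 3 is a good example of why the varint choice matters. A small sketch (helper names are mine) showing how the same decoded varint reads under three of the candidate types:

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def as_uint64(n):
    """Plain unsigned reading."""
    return n & MASK64

def as_int64(n):
    """Two's-complement reading, as used by int64."""
    n &= MASK64
    return n - (1 << 64) if n >= (1 << 63) else n

def as_sint64(n):
    """ZigZag reading, as used by sint64."""
    n &= MASK64
    return (n >> 1) ^ -(n & 1)

raw = 18446744073709551615
print(as_uint64(raw), as_int64(raw), as_sint64(raw))
```

Read as int64 the value is -1, which looks like a typical "not set" sentinel, so int64 seems a reasonable first guess here; the uint64 and sint64 readings are the extreme ends of their ranges.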

and finally, looking at the 2 from the previous bit, we have:

      1: "e993fba0-8fc9-e811-9c15-06ec640006bb"

and

      1: "9a7c7210-3aca-e811-9c15-06ec640006bb"
      2: 1

and

      1: "2d7d12f1-2bc9-e811-9c15-06ec640006bb"

so combining those, we might guess:

message Blap {
    string A = 1;
    int64 B = 2;
}

Depending on whether you have more data, there may be additional fields, or you may be able to infer more context. For example, if an int64 value such as Blap.B is always 1 or omitted, it might actually be a bool. If one of the repeated elements always has at most one value, it might not be repeated.
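As an example of inferring context: the value of Foo.G in the sample, 1571059279000, has the right magnitude for a Unix timestamp in milliseconds. That interpretation is my assumption, not something the data proves. A quick check in Python:

```python
from datetime import datetime, timezone

def as_unix_millis(value):
    """Interpret an integer as milliseconds since the Unix epoch (UTC)."""
    return datetime.fromtimestamp(value / 1000, tz=timezone.utc)

print(as_unix_millis(1571059279000).isoformat())
```

If the resulting date is plausible for your data, that is a hint that the field is a timestamp stored as a 64-bit integer.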

The trick is to play with it until you can deserialize the data, re-serialize it, and get the exact same payload (i.e. round-trip).

Once you have that: you'll want to deserialize it, mutate the thing you wanted to change, and serialize.
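The round-trip and mutate steps can be sketched without any schema at all. The following is a schema-less Python illustration, not protobuf-net and not how you would normally do it once you have a working .proto; every function name is mine, and the sample payload is hand-built to resemble the guessed Blap message:

```python
def read_varint(buf, pos):
    """Decode a base-128 varint at pos; return (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def write_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("encode negatives as their unsigned 64-bit form first")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def decode(buf):
    """Parse one message level into [field number, wire type, value] entries."""
    fields, pos = [], 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 7
        if wire_type == 0:
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append([number, wire_type, value])
    return fields

def encode(fields):
    """Re-serialize entries produced by decode, in the same order."""
    out = bytearray()
    for number, wire_type, value in fields:
        out += write_varint((number << 3) | wire_type)
        if wire_type == 0:
            out += write_varint(value)
        elif wire_type == 2:
            out += write_varint(len(value)) + value
        else:
            out += value  # fixed-width bytes are copied as-is
    return bytes(out)

# Hand-built sample: field 1 = "abc" (string), field 2 = 1 (varint).
payload = bytes([0x0A, 0x03]) + b"abc" + bytes([0x10, 0x01])
fields = decode(payload)
assert encode(fields) == payload   # round-trip: identical bytes
fields[1][2] = 0                   # mutate field 2
modified = encode(fields)
print(modified.hex())
```

Nested messages arrive here as raw bytes under wire type 2, so changing something inside one means decoding that value again, editing it, re-encoding it, and letting the outer `encode` recompute the length prefix.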

Upvotes: 3
