irahorecka

Reputation: 1807

Reading a protocol buffer file in Python

I am making requests to a public transportation API for data analysis. Several files are in JSON format, which is easy to deal with; however, some files are in .protobuf format.

How can I parse these files into a human-readable format? For example, if I open the .protobuf file in a text editor, this is what I see:

1"$
+(B���uC-(�̉�B
1462;
2"6

    741127020 *F
�ZB�����C-(�̉�B
1583<
3"7

    719255020 *10
K�B�8��C-FH@(�̉�B
1220<
4"7

Thanks!

Upvotes: 1

Views: 1285

Answers (1)

Mark

Reputation: 92440

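Protocol Buffers (protobuf) is a binary format, so it is not human-readable in its raw state. Assuming your feed is GTFS Realtime, which is the usual protobuf format for public transit data, get the Python bindings from Google and install them with: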

pip install --upgrade gtfs-realtime-bindings

Once you have those, you can download the pb file or read it locally very easily:

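import urllib.request

from google.transit import gtfs_realtime_pb2

feed = gtfs_realtime_pb2.FeedMessage()
pb_url = "http://someURL/someFile.pb"

# Download the raw bytes and decode them into the FeedMessage.
with urllib.request.urlopen(pb_url) as response:
    feed.ParseFromString(response.read())

# Printing a message shows its human-readable text format.
print(feed)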

This will give you something like:

header {
  gtfs_realtime_version: "1.0"
  incrementality: FULL_DATASET
  timestamp: 1579313685
}
entity {
  id: "10-abc-O-1"
  trip_update {
    trip {
      trip_id: "10-1622-O-1"
    }
...

Upvotes: 2
