Explorex

Reputation: 577

Dealing with returned Protocol Buffer Files

I'm sending a GET request to an API to retrieve real-time public transport positions. In their documentation, they state that it returns two protocol buffer files.

The problem is, I'm programming in Python and cannot find any resources on how to handle these returned files. Most of the information I've found online about protocol buffer files covers how to create one (serialization).

Is there any example code showing how to handle (deserialize?) the protocol buffer files returned by a GET request?

Apologies if I've misinterpreted anything or used incorrect terminology; please let me know if I have!

Here is the only document I can find that confirms it returns two protocol buffer files and describes the structure(?) of what is returned.

Upvotes: 1

Views: 292

Answers (2)

Jan-Gerd

Reputation: 1289

Chapter 3.1.1 of your linked document states that the data is compliant with a reference published by Google, including a link to the relevant page. That page contains the .proto definitions for your data.

3.1.1 GTFS Compliance

The GTFS bundle is compliant with the specification reference published by google on 3 February 2016. The GTFS real time feed is also compliant with the GTFS reference published by google on 26 February 2015. The references for both feed components specifications can be found at the following URLs:

Download the gtfs-realtime.proto file from that page and compile it with protoc --python_out=. gtfs-realtime.proto. This will generate a gtfs_realtime_pb2.py module that you can use like this:

import requests
import gtfs_realtime_pb2  # generated by protoc from gtfs-realtime.proto

api_key = "<Your API key>"

# Fetch the binary GTFS-realtime feed, authenticating with the API key header
session = requests.Session()
session.headers.update({"Authorization": "apikey " + api_key})
response = session.get("https://api.transport.nsw.gov.au/v1/gtfs/vehiclepos/nswtrains")

# Deserialize the response body into the FeedMessage root type and print it
message = gtfs_realtime_pb2.FeedMessage()
message.ParseFromString(response.content)
print(message)

This prints output like the following:

header {
  gtfs_realtime_version: "1.0"
  incrementality: FULL_DATASET
  timestamp: 1575847311
}
entity {
  id: "1"
  vehicle {
    trip {
      trip_id: "165.011219.127.0840"
      schedule_relationship: SCHEDULED
      route_id: "4T.C.165"
    }
    position {
      latitude: -28.636098861694336
      longitude: 153.54798889160156
      bearing: 273.8424072265625
    }
    timestamp: 1575847282
    congestion_level: UNKNOWN_CONGESTION_LEVEL
    stop_id: "24811"
    vehicle {
      id: "165"
      label: "08:40am (165)  Casino - Tweed Heads"
    }
  }
}
entity {
  id: "2"
...
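
If you would rather work with the feed programmatically than print it, you can walk the parsed message directly. This is a minimal sketch using only the field names visible in the output above (entity, vehicle, trip, position):

# Walk the parsed feed and pull out a few fields per vehicle entity
for entity in message.entity:
    if entity.HasField("vehicle"):
        vehicle = entity.vehicle
        print(vehicle.trip.trip_id,
              vehicle.position.latitude,
              vehicle.position.longitude)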

Upvotes: 1

Marc Gravell

Reputation: 1062502

Protobuf is not a self-describing protocol. Whoever publishes that API should also publish the schema definition (usually a .proto file), which you can run through readily available tooling. Ask them for the schema file (note: they also need to tell you what the root message type is).

Note that it is possible to reverse-engineer the schema from a raw payload, but it is time-consuming, requires an understanding of what the data should look like (many fields can be interpreted in multiple different ways, giving different results - you need to know which is "right"), and loses all of the semantic meaning (names, etc.). If you can get the .proto, all of this is avoided. If you can't... I might be able to help some.
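
If you do end up inspecting a payload without a schema, protoc's --decode_raw mode is a readily available starting point: it dumps field numbers and wire-format values with no .proto at all. A minimal sketch in Python (assuming protoc is on your PATH and the payload has been saved to payload.bin - both hypothetical here):

import subprocess

# Dump field numbers and raw wire values from a schema-less payload via protoc
with open("payload.bin", "rb") as f:
    result = subprocess.run(["protoc", "--decode_raw"],
                            input=f.read(), capture_output=True)
print(result.stdout.decode())

The output is just numbers and untyped values, which is exactly why getting the real .proto (and the root message type) is the better path.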

Upvotes: 0
