Reputation: 509
I am intersted in replacing my current data format that I use with GTFS, but I hear and read from here and there that there are flaws in GTFS file format.
Most of the time I see that you can't somehow predict some things such as delays or some real-time stuff. They say you can't get the "whole picture" with them.
So what I am asking is there anyone more experienced with GTFS , since I am seeing them only for first time, that could have possibly used GTFS in some kind of application and could tell the problems they have faced while developing?
Maybe someone has a suggestion about a better kind of file format? Or a combination of some formats?
Upvotes: 2
Views: 1033
Reputation: 11
It depends on your needs for importing existing feeds. If yes, then you need to be able to handle it anyhow. In my case, import was required, so I use the same for data that stems from other formats like PDF timetables. Otherwise you need to supoprt two formats. If you do not need it for import (or export) you may consider your own format : I find GTFS does not reveal the actual network.
GTFS needs quite some interpretation and digesting in order to end up with the whole picture that you can answer planning questions on.
I merge stops together if they are close, like a few meters apart, and assume 'trivial walk' if 10-50 meters. That automatically handles combining multipe feeds.
Apart from that, I turn the stop_times roughly inside-out to create a 'link' table'. The end result is that for each stop you have a list of departures and their destinations.
Biggest problem until now is that GTFS feeds can record the trips from an operator point of view. Passengers can remain sitting in the bus if it flips the headsign from 351 to 285, takes a new driver onboard and continues. That means you need to know what trips actually need to be seen as joined in passenger terms.
I solved minor problem for manual feed entry by having my GTFS parser accept a handful of constructs that ease editing, such as leaving out the sequence numbers to have it generated incrementally, and recognising 02.13+1 as 26.13.
Upvotes: 1
Reputation:
It's hard to say whether GTFS is a good fit or not for your application without knowing what your application's requirements are, but I can offer a few remarks.
If your goal is to provide real-time data to users you should take a look at GTFS-realtime, a complementary data format designed specifically for issuing real-time updates. For most public-transit applications, using a GTFS and a GTFS-realtime feed together does indeed give the "whole picture" about a transit network, or near enough.
In terms of GTFS itself, my main complaint is that it seems designed specifically for route-planning applications and using data in this format for any other purpose can be difficult. For example, while a GTFS feed records information about transit stops and routes, there is no requirement that each of these have a single, canonical entry—if the data spans multiple board periods, there will almost always be (seemingly) duplicate entries for each.
This doesn't matter if you're plotting a route based on where and when a person is travelling, since the links between objects ensure you'll always generate the right result. If you're starting with only a person's location and want to know, "What transit resources are available nearby?", reliably producing an accurate answer requires some contortions.
Upvotes: 2