What Happens When Data Is Available

by @jehiah on 2009-02-02 16:09
Filed under: All, Trains, transit

It's been a few weeks, and I want to say that the Google Maps 'Transit Layer' is a cool feature, and the team that put it together deserves a pat on the back. It's a great way to see the extent of mass transit, and that's useful for planning things like where to live or where to work, which has historically been difficult to do. I should also mention that it seems somewhat inspired by this transit time map (by Neil Kandalgaonkar) from back in August.

Another cool thing I came across recently is a simulation by Yuriy Yakimenko of several commuter rail systems (LIRR, MBTA, NJ Transit, SEPTA, Chicago METRA). It shows the complete traffic on each agency's system as it changes through the day, and is a great way to convey how frequently some stops are visited compared to others (aside from being way cool). It illustrates why I can't ride the LIRR back into New York City at 2:56am.

The simulator Yuriy put together for MBTA hits closer to home because it was made possible by an unofficial GTFS data set I published. Each of the simulators Yuriy put together uses data which he also uses to publish schedule applications for Java-based mobile devices.
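To give a rough idea of how little code it takes to ask "how often is each stop visited?" once schedule data is available, here is a minimal Python sketch. It is not Yuriy's simulator, and the sample rows, trip IDs, and stop IDs are made up for illustration. It only assumes the column layout of a GTFS stop_times.txt file (trip_id, arrival_time, departure_time, stop_id, stop_sequence).

```python
import csv
import io
from collections import Counter

# Made-up sample rows in the layout of a GTFS stop_times.txt file.
# The trip and stop IDs are hypothetical, not taken from any real feed.
SAMPLE_STOP_TIMES = """\
trip_id,arrival_time,departure_time,stop_id,stop_sequence
t1,06:00:00,06:00:00,central,1
t1,06:10:00,06:11:00,midtown,2
t1,06:25:00,06:25:00,suburb,3
t2,07:00:00,07:00:00,central,1
t2,07:10:00,07:11:00,midtown,2
t3,25:30:00,25:30:00,central,1
"""


def visits_per_stop(stop_times_file):
    """Count how many scheduled stop events each stop_id has."""
    reader = csv.DictReader(stop_times_file)
    return Counter(row["stop_id"] for row in reader)


def visits_per_hour(stop_times_file, stop_id):
    """Count scheduled departures from one stop, bucketed by hour.

    GTFS allows hours of 24 and above for trips that run past midnight
    of the service day, so hour 25 here means 1am the next morning.
    """
    counts = Counter()
    for row in csv.DictReader(stop_times_file):
        if row["stop_id"] == stop_id:
            hour = int(row["departure_time"].split(":")[0])
            counts[hour] += 1
    return counts


counts = visits_per_stop(io.StringIO(SAMPLE_STOP_TIMES))
for stop_id, n in counts.most_common():
    print(stop_id, n)
```

With a real feed you would open the agency's stop_times.txt instead of the sample string. Note that this counts every row in the schedule and ignores which days each trip runs, so it is a rough comparison between stops rather than a per-day count.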

The point I want to make here is that in all three of these cases, the innovation in new ways to display schedule information (and in turn better services/capabilities for riders) could only happen because of access to data.

I am really happy that this innovation is taking place, and that it's not limited to Google, but I do want to highlight the disparity in what innovation is currently possible, by pointing out how little schedule data is available publicly.

Out of all the agencies participating in Google Transit (and I think it's over a hundred now, but I don't have a good method of counting), only 19, to my knowledge, publish data openly (see footnote 1).

I think it's clear that when data is available, there are benefits. As I pointed out above, data spurs innovation that benefits transit riders in their local market. It spurs a wide range of applications and integration (just look at the iTunes App Store Navigation category, to say nothing of my theNextTrain.com). I also know from personal experience that when an agency releases data, it helps ensure up-to-date information on third-party websites and applications (to say nothing of the fact that it's often more complete, and has fewer errors introduced via conversion, scraping, or manual re-entry).

Those are all upsides. What are the downsides for a transit agency that publishes data in an open format (e.g., GTFS)? There are none! (see footnote 2)

So, THIS IS MY PLEA to others who appreciate mass transit and want to see it flourish (and especially those involved in transit development): if your transit agency does not publish GTFS data publicly (and especially if it is already giving that data to Google), write to your local transit agency and request that it publish data publicly.

I've tried to develop a spot online to make publishing data easier, and I would be happy to help work with any agency to get their data published.

Footnotes:
1: To my knowledge, all the publicly available GTFS data is accessible on gtfs-data-exchange.com, but many of the data sets there are unofficial or were acquired through FOIA requests. Only 19 of those data sets are official.
2: Yes, it may take some effort to convert data into an open format, but I know there are many developers like myself who would be happy to help agencies convert data from any raw format to an open one. This means that publishing any raw data has almost the same effect.

© 2014 - Jehiah Czebotar