thoughtbot is hosting the Great American Hackathon at our Boston office today and tomorrow. The focus of our group is on transportation data from the Massachusetts Bay Transportation Authority (MBTA).
The public loves to trash the MBTA, like many other government agencies, for its actual and perceived failures while simultaneously taking advantage of its services to the tune of about 400 million riders a year. Historically, the agency’s knee-jerk reaction has been to become less transparent.
I believe it is important for us as application developers to support an effort underway at the agency to reverse that trend. We have a window of about a year (until November 2010) to show government officials what is possible when they provide open data to the public.
Here’s my current understanding of the lay of the land and some ideas of things to hack on.
MassDOT was created in November 2009. It combines almost all transportation departments in Massachusetts, such as:
- Registry of Motor Vehicles
The T (MBTA) still has a complicated legal relationship with MassDOT. They are kind of semi-private. I imagine them as similar to the Federal Reserve or The Pentavirate.
The main real-time data provider is a vendor the government has hired, NextBus.
This was the winning visualization from a contest last month.
For our purposes, the main thing we care about is the real-time XML feed they are providing as part of a one-year pilot program. It is possible that the feed will be turned off in November 2010 if developers don’t create anything interesting or Those Fatcats don’t see value in transparency.
Legally, they are required to follow a particular procurement system.
We cannot change the length of the procurement process, but developers and the press can push for more lines to open up.
There’s static data and real-time data.
The static data is available on the MassDOT Developers page. The current zip file is from December 2:
The data is in Google Transit Feed Spec (GTFS) format, for interesting reasons. The Federal Transit Administration had been holding meetings about standard data formats for a long time, then Google came along and just did it. Now pretty much every agency (the MBTA in Boston, the MTA in New York, etc.) can export into GTFS.
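To give a feel for what working with GTFS looks like, here is a minimal sketch that reads stop records. It assumes the usual GTFS `stops.txt` columns (`stop_id`, `stop_name`, `stop_lat`, `stop_lon`). The sample rows, IDs, and coordinates are made up for illustration and are not taken from the MBTA zip file, and `load_stops` is a hypothetical helper name.

```python
import csv
import io

# Made-up sample rows in the shape of a GTFS stops.txt file.
# The IDs and coordinates are illustrative, not real MBTA data.
SAMPLE_STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
s1,Alewife Station,42.3954,-71.1425
s2,Broadway Station,42.3426,-71.0570
"""


def load_stops(text):
    """Return a dict of stop_id -> (name, lat, lon) from stops.txt content."""
    stops = {}
    for row in csv.DictReader(io.StringIO(text)):
        stops[row["stop_id"]] = (
            row["stop_name"],
            float(row["stop_lat"]),
            float(row["stop_lon"]),
        )
    return stops


stops = load_stops(SAMPLE_STOPS_TXT)
for stop_id, (name, lat, lon) in stops.items():
    print(stop_id, name, lat, lon)
```

With the real zip file, you would presumably open `stops.txt` from the archive and pass its contents to the same function.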
Some of the other data such as station locations and routes is in Keyhole Markup Language (KML) files.
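KML is plain XML, so the standard library is enough to pull names and coordinates out of it. The sketch below assumes the files use the KML 2.2 namespace and store each station as a `Placemark` with a `Point`; the MBTA files may be laid out differently, and the sample document and `placemarks` helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the file declares the KML 2.2 namespace.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Made-up sample document, not taken from the MBTA files.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Example Station</name>
      <Point><coordinates>-71.06,42.35,0</coordinates></Point>
    </Placemark>
  </Document>
</kml>"""


def placemarks(kml_text):
    """Return a list of (name, lat, lon) for every Placemark with a Point."""
    root = ET.fromstring(kml_text)
    results = []
    for pm in root.iter("{http://www.opengis.net/kml/2.2}Placemark"):
        name = pm.find("kml:name", KML_NS)
        coords = pm.find("kml:Point/kml:coordinates", KML_NS)
        if name is None or coords is None:
            continue  # skip lines, polygons, and unnamed placemarks
        # KML stores coordinates as "lon,lat[,altitude]".
        lon, lat = coords.text.strip().split(",")[:2]
        results.append((name.text, float(lat), float(lon)))
    return results


stations = placemarks(SAMPLE_KML)
```

Note the longitude-first ordering in KML, which is easy to get backwards when mixing this data with GTFS.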
Perhaps the best way to explore this data is through danchoi/openmbta, an open source Rails app and iPhone app. Some of the things it contains are:
- stops for vehicles (“Alewife Station”, “Broadway Station”, “Cleveland Circle”, etc.)
- data for subways, buses, ferries, commuter rails
The app is live at OpenMBTA.org.
Real-time data is what everyone is interested in.
MassDOT Highways has a daily XML feed for planned construction events, answering the question, “which roads are closed?”
The Registry of Motor Vehicles has a branch wait-time XML feed.
The MBTA has a services advisory and updates RSS feed.
The Granddaddy feed everyone is excited about, however, is the MBTA real-time XML feed.
The trial feed includes data for bus route 39 which serves Jamaica Plain, the Longwood Medical Area, and Back Bay in Boston; and bus routes 111, 114, 116, and 117 which serve Haymarket Station, East Boston, Chelsea, and Revere.
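As a starting point for hacking on the feed, here is a rough sketch of filtering vehicle locations by route. The element and attribute names (`vehicle`, `routeTag`, `lat`, `lon`, `secsSinceReport`) are my assumption of what a NextBus-style vehicle location response looks like, and the sample XML is made up; check the actual trial feed before relying on any of it.

```python
import xml.etree.ElementTree as ET

# Made-up sample. Element and attribute names are assumptions modeled on
# NextBus-style vehicle location responses, not copied from the MBTA feed.
SAMPLE_FEED = """<body>
  <vehicle id="1001" routeTag="39" lat="42.3300" lon="-71.1000" secsSinceReport="12"/>
  <vehicle id="1002" routeTag="111" lat="42.3800" lon="-71.0400" secsSinceReport="40"/>
</body>"""


def vehicles_on_route(feed_xml, route_tag):
    """Return (vehicle id, lat, lon, seconds since report) for one route."""
    root = ET.fromstring(feed_xml)
    found = []
    for v in root.iter("vehicle"):
        if v.get("routeTag") == route_tag:
            found.append(
                (
                    v.get("id"),
                    float(v.get("lat")),
                    float(v.get("lon")),
                    int(v.get("secsSinceReport")),
                )
            )
    return found


route_39 = vehicles_on_route(SAMPLE_FEED, "39")
```

The seconds-since-report value matters for any prediction work: a position that is a minute old is a very different input from one that is five seconds old.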
As I see it, a major problem with this kind of app right now is that NextBus is a vendor with a patented, proprietary system for calculating the wait time. If the MBTA switches vendors or this genius system turns out to suck, customers of a wait time app will be pissed.
I’d like to see this data supplemented with user-generated content (was the train late or on time?) or see people write their own, open source algorithms based on lat/long, stops, times, rush hour traffic, whatever.
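To make the "write your own algorithm" idea concrete, here is about the most naive open estimate I can think of: the straight-line (haversine) distance from the vehicle to the stop, divided by an assumed average speed. This is not how NextBus does it (their method is proprietary); the function names and the speed value are placeholders, and a real estimate would need to follow the route shape and account for traffic and dwell time.

```python
import math

EARTH_RADIUS_M = 6371000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def naive_wait_seconds(vehicle, stop, speed_mps):
    """Straight-line distance divided by an assumed average speed.

    vehicle and stop are (lat, lon) tuples; speed_mps is a guess supplied
    by the caller, not something the feed provides.
    """
    if speed_mps <= 0:
        raise ValueError("speed_mps must be positive")
    return haversine_m(vehicle[0], vehicle[1], stop[0], stop[1]) / speed_mps


# Hypothetical positions and a guessed average bus speed of 5 m/s.
wait = naive_wait_seconds((42.3300, -71.1000), (42.3400, -71.0900), 5.0)
print(round(wait), "seconds")
```

Even something this crude gives a baseline to compare against the vendor's predictions and against rider reports of when the bus actually showed up.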
Tap out. Unlike, say, the London Tube, the T is a “tap in” system. Riders don’t have to tap out to leave the system. Therefore, developers and trend analysts are limited in their data sets of how people are using the T. A mobile app could be built that allows riders to “tap out”, simply there to build up the data sets.
Between stops API. A very simple API could be layered on the real-time NextBus data that says which stops the next vehicle is between. This could turn out to be a more accurate prediction for certain riders, particularly commuters who are familiar with the line.
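Here is one way the core of a between-stops lookup might work, as a sketch only. Given the stops of a route in travel order and a vehicle position, it picks the pair of consecutive stops for which going via the vehicle adds the least extra distance. The stop names and coordinates are made up, `between_stops` and `approx_m` are hypothetical helpers, and the heuristic ignores direction of travel and the actual shape of the road.

```python
import math


def approx_m(a, b):
    """Approximate distance in meters between two (lat, lon) points.

    Uses an equirectangular projection, which is adequate over the few
    kilometers that separate neighboring stops.
    """
    mean_lat = math.radians((a[0] + b[0]) / 2)
    dx = math.radians(b[1] - a[1]) * math.cos(mean_lat)
    dy = math.radians(b[0] - a[0])
    return 6371000 * math.hypot(dx, dy)


def between_stops(ordered_stops, vehicle):
    """Return the names of the two consecutive stops the vehicle is between.

    ordered_stops is a list of (name, lat, lon) in travel order;
    vehicle is a (lat, lon) tuple.
    """
    if len(ordered_stops) < 2:
        raise ValueError("need at least two stops")
    best_pair, best_detour = None, float("inf")
    for prev, nxt in zip(ordered_stops, ordered_stops[1:]):
        p, n = prev[1:], nxt[1:]
        # Extra distance incurred by going prev -> vehicle -> next
        # instead of straight from prev to next.
        detour = approx_m(p, vehicle) + approx_m(vehicle, n) - approx_m(p, n)
        if detour < best_detour:
            best_pair, best_detour = (prev[0], nxt[0]), detour
    return best_pair


# Made-up stops along a straight east-west line.
example_stops = [
    ("Stop A", 42.35, -71.00),
    ("Stop B", 42.35, -71.01),
    ("Stop C", 42.35, -71.02),
]
pair = between_stops(example_stops, (42.35, -71.015))
```

An API endpoint would then just return that pair for the next vehicle on a given route, and a commuter who knows the line can do the rest of the prediction in their head.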
Demographics. One thing that isn’t clear for application developers is: who is the customer? For something as broad as public transportation, there is a wide variety of users, such as:
- college students
- the elderly
- the handicapped
- daily commuters
They all could have very different needs. A simple user-generated application to build demographics associated with each line could be very useful for developers of future applications. For example, I ride the Red Line and the 1 bus. I know when commuters to downtown Boston ride it, when MIT and Harvard students ride it, etc.
Sign up for the MassDOTDevelopers Google Group.
Talk to local news stations like WBZ, NECN, WBUR, Fox 25, and boston.com. They already add value to the government data by flying helicopters, watching traffic cameras, and passing that along to their customers. There’s potentially more data at those businesses that could be added to the mix. A company called Smart Route Systems is often the point company for those news stations.
Join us today and tomorrow at our Boston office as we hack together an app or two!