Recognising the importance of opening up transport data, Ideas in Transit has been supporting the addition of information about proposed UK transport schemes to Wikipedia and to Freebase, allowing both the public and policymakers to access core information about transport schemes much more quickly.
The initial work has covered the East of England, where we have identified 93 schemes relating to airports, buses, roads, rail, shipping, cycling and walking, with a total cost in excess of £20 billion.
For each scheme we have tried to provide a summary in a standard format identifying what is being proposed, why it is being proposed, what work is required, who is promoting it, what it would cost and when it might be built; we have also included details of any opposition to the scheme and the basis of this opposition. Information about schemes is also being presented as maps and in computer-readable form (allowing it to be exported from Wikipedia and searched using SPARQL).
We estimate that the inclusion of transport information within Wikipedia has reduced the time needed to identify and understand a typical transport scheme from about 2 hours to 3 minutes. When we are able to search for schemes using SPARQL it will become even easier to find schemes of relevance to one's enquiry.
Here is a map giving details of 5 different transport schemes, as detailed in the article about transport in the Luton area.
And here is a map showing all the recent and proposed developments, as detailed in the M25 motorway article.
As well as providing textual information and maps for schemes (which we are now publishing in svg format to allow others to edit them), we have also started publishing information as KML using Umapper such as this cycling/walking/public transport scheme in Ipswich and the East West Rail Link. Use of OpenStreetMap base mapping on Umapper allows us to publish this information using an open data license. Using KML allows the route of schemes to be exported into other applications for a variety of purposes.
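As an illustration of what exporting a route from KML can look like, here is a minimal Python sketch that reads the line coordinates out of a KML document using only the standard library. The sample document, the scheme name and the coordinates are made up for the example and do not describe a real scheme; the function name read_routes is also our own.

```python
import xml.etree.ElementTree as ET

# Namespace used by KML 2.2 documents.
KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

# Hypothetical sample; the name and coordinates are illustrative only.
SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Placemark>
      <name>Example scheme</name>
      <LineString>
        <coordinates>
          1.15,52.05,0 1.16,52.06,0 1.17,52.07,0
        </coordinates>
      </LineString>
    </Placemark>
  </Document>
</kml>"""


def read_routes(kml_text):
    """Return {placemark name: [(lat, lon), ...]} for each LineString placemark."""
    root = ET.fromstring(kml_text)
    routes = {}
    for placemark in root.iter("{http://www.opengis.net/kml/2.2}Placemark"):
        name = placemark.findtext("kml:name", default="(unnamed)", namespaces=KML_NS)
        coords = placemark.findtext("kml:LineString/kml:coordinates", namespaces=KML_NS)
        if coords is None:
            continue  # placemark without a line geometry (e.g. a point)
        points = []
        for token in coords.split():
            # KML stores each vertex as "lon,lat[,alt]".
            lon, lat = token.split(",")[:2]
            points.append((float(lat), float(lon)))
        routes[name] = points
    return routes


routes = read_routes(SAMPLE_KML)
```

Note that KML stores longitude before latitude, so the sketch swaps the order when building the list of points.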
For each scheme we have also created an Infobox; there was not already a suitable Infobox available within Wikipedia, so after discussion with other members of the Wikipedia community we created a new future infrastructure project Infobox which we have now deployed in a number of articles. The main advantage of using Infoboxes is that the information can automatically be ‘scraped’ from Wikipedia into projects such as DBpedia and Freebase. Information about a number of schemes has already been imported into these databases, for example on the Norwich Northern Distributor Road.
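To give a feel for why Infoboxes are easy to scrape, the following Python sketch pulls the name/value pairs out of an Infobox in article wikitext. It is a deliberately naive illustration and not how DBpedia or Freebase actually extract data: it assumes one parameter per line and no nested templates. The parameter names and values in the sample are hypothetical and are not taken from the real template.

```python
import re

# Hypothetical wikitext; the parameter names and values are illustrative only.
SAMPLE_WIKITEXT = """Some introductory text.
{{Infobox future infrastructure project
| name   = Example Distributor Road
| type   = Road
| cost   = £100 million
| status = Proposed
}}
More article text."""


def parse_infobox(wikitext):
    """Return the parameters of the first Infobox as a dict (naive parser)."""
    # Capture everything between the "{{Infobox ..." line and the closing "}}" line.
    match = re.search(r"\{\{Infobox[^\n|]*\n(.*?)\n\}\}", wikitext, re.DOTALL)
    if not match:
        return {}
    fields = {}
    for line in match.group(1).splitlines():
        line = line.strip()
        if line.startswith("|") and "=" in line:
            key, value = line[1:].split("=", 1)
            fields[key.strip()] = value.strip()
    return fields


fields = parse_infobox(SAMPLE_WIKITEXT)
```

Because every scheme uses the same set of parameters, the resulting dictionaries can be loaded into a database and compared across schemes.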
The integration of this content into Freebase is still ‘work in progress’, with issues that need to be resolved both about having multiple Infoboxes on a single page and about importing KML into the databases. When these issues are resolved it will be possible to run SPARQL queries to find schemes that meet certain criteria, which will also be able to include spatial information. For example, one would be able to search for all schemes passing within x miles of a point, or all schemes within a defined area that relate to cycling. The availability of this information in a form where it can be searched using SPARQL will allow it to be used through the UK Government's open data site, as promoted by Gordon Brown and Sir Tim Berners-Lee.
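The kind of spatial search described above can be sketched in a few lines of Python. This is an in-memory stand-in for the eventual SPARQL query, not the query itself. It assumes each scheme is held as a name, a mode and a list of route vertices, and it measures distance to the vertices only (not to the line segments between them), so it is an approximation for sparsely drawn routes. The scheme names and coordinates are invented for the example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def schemes_near(schemes, point, max_miles, mode=None):
    """Names of schemes with a route vertex within max_miles of point.

    If mode is given (e.g. "cycling"), only schemes of that mode are considered.
    """
    hits = []
    for scheme in schemes:
        if mode is not None and scheme["mode"] != mode:
            continue
        if any(haversine_miles(point, vertex) <= max_miles for vertex in scheme["route"]):
            hits.append(scheme["name"])
    return hits


# Hypothetical schemes; names and coordinates are illustrative only.
SCHEMES = [
    {"name": "Example cycle route", "mode": "cycling", "route": [(52.05, 1.15), (52.06, 1.16)]},
    {"name": "Example bypass", "mode": "road", "route": [(52.70, 1.30), (52.72, 1.35)]},
    {"name": "Example rail link", "mode": "rail", "route": [(52.06, 1.14), (52.20, 0.12)]},
]

centre = (52.056, 1.148)
nearby = schemes_near(SCHEMES, centre, max_miles=2)
nearby_cycling = schemes_near(SCHEMES, centre, max_miles=2, mode="cycling")
```

Here nearby contains the cycle route and the rail link, which both have a vertex within 2 miles of the chosen point, and nearby_cycling narrows that down to the cycle route alone.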
Our experience to date has been that the time taken to understand a transport scheme varies from 15 minutes to many hours, with an average of about 2 hours to research a scheme prior to writing about it. Researching schemes takes so long because the information is often buried in multiple large PDF documents, and what is available is often out of date, undated or incorrect. Maps are only sometimes provided, and even where they are provided they can be of very poor quality or unusable due to excessive compression.
We also found that it was often hard to discover what schemes exist because the information about schemes is published on many different websites. Rail schemes are detailed within large PDF documents available from the Network Rail website; Highways Agency schemes are covered on their site, organised by regions (which are different from the Network Rail regions and the regional assembly regions); other road schemes are normally on Transport Authority websites in a variety of formats; cycling schemes are likely to be on the Sustrans website; and further schemes may be included in Local Development Frameworks produced by Borough and District councils, with information about ports, airports and ‘eco-towns’ to consider as well. In all we have identified over 25 websites which may contain scheme information in the East of England. A further difficulty is that schemes have multiple names which can change over time. For example, the Norwich Northern Distributor Road is also known as the Norwich Northern Distributor Route and the NDR.
We have received support from the Wikipedia community, with others contributing to the articles, adding information, correcting mistakes and providing updates on occasion. It is, however, our belief that it will be advantageous to continue to populate information for the rest of the country as part of Ideas in Transit and not rely completely on voluntary effort.
To see this for yourself, spend 20 minutes trying to find out about the ‘Lower Thames Crossing’ without using the Wikipedia article or any derivatives of Wikipedia (which include Freebase, AbsoluteAstronomy and others). Try to find out what the current situation is, what it would cost and where it would be built. Then read the Wikipedia article we contributed to see if you are correct. You could also try selecting 3 schemes at random from the East of England scheme list and try the same test. Do add comments to the bottom of this post with your experiences.
This work has been carried out by ITO World Ltd with support from Ideas in Transit and their sponsors who consist of the Technology Strategy Board, the Department for Transport, and the EPSRC.