Using Open Street Map data for Australia

I’ve been using Open Street Map (OSM) data for Australia for generating matrices of travel times and distances that are suitable for input to truck routing optimisation problems. OSM is a world-wide map data set that anyone can edit and is impressively comprehensive.

Some of the resources I’ve used are www.osmaustralia.org, which does a regular batch job to extract country by country OSM data, and http://keepright.x10hosting.com/ which keeps an up to date list of map data errors. These types of errors are important to know about when producing routable data from OSM, in particular dead end one way streets and “almost junctions”.

I’ve been using Quantum GIS (QGIS) to look at the data after converting it to MID/MIF format. I can get it to label the lines that are one way streets, but unfortunately there is no way to show the line direction in QGIS, which is a major pain. (Actually there is a way but you have to get a source code branch from their repo and build it all etc etc). [Now that I think of it, a silly cheat way to do it would also be to produce another field in the map data, say a letter which represents whether the arc is “D” for down or “L” for left etc (based on the difference in the lat/longs of the start and end nodes of each arc), and label that way.]

I also found that the performance of QGIS is quite different depending on whether the data is opened as TAB or MIDMIF format. TAB format is fairly fast (just zooming, panning etc) but MIDMIF is quite noticeably slower. You’d think it wouldn’t make a difference as it would be using the same internal data representation, but obviously not.

I’m using some extra layers to show some of the processed data (picture below). For example, I have a layer which just shows the arcs involved in restricted turns, which I can layer on top of the street network and use a thicker line style for. I also use dots in another layer which I have produced based on my “island” processing. The dots represent “orphaned” nodes which are not on the main “island” of map data (which is connected in the sense that all nodes can reach all other nodes). These orphaned nodes will be ignored by the travel time calculation. There are around 9600 of these nodes in the entire set of Australia OSM routable data I’m using (which has 620042 nodes and 1503668 arcs in total). This filtered subset of OSM data I am using includes only street segments with a “highway” tag – this will exclude cycleways, pathways, hiking trails, ferry routes, ski slopes, building outlines, administrative boundaries, waterways etc.

Some observations on the OSM data (for Australia):
• One way information seems quite complete.
• Only 12% of the streets have road speed information (“max_speed” tag). This is an issue as vehicle routing needs highways to have a faster speed otherwise they won’t be used (and there will be lots of rat-running for example). In longer routes (e.g. interstate) the travel time will also be grossly overestimated. A couple of things we could do here: search for all segments with “highway”, “motorway”, or “freeway” in their names and assume some sort of speed like 80km/h on these segments. Or, with a bit more coding effort, if there are chains of segments where one has a speed and the others with the same street name don’t, assume that speed on all of those segments.
• About 70% of the road segments have street names. Some of the unnamed segments are bits of roundabout, service roads, motorway on/off-ramps etc. But, there are also a lot of streets which are classified (by their “highway” tag) as “secondary”, “tertiary”, “unclassified” and even “residential” which are not named. This is an issue when vehicle routing needs to produce driver directions or verbose path information.
• There’s only several hundred instances of streets which have restricted manoeuvre information (banned right hand turns being the prime example). Most of these instances are in Sydney. In reality this number should be in the thousands or tens of thousands. This will likely be the biggest issue from a routing quality point of view – it will mean you can do many illegal right turns.
• I found a weird character or two at the end of the file, which causes both Python and C++ file reading functions to get confused and read blanks forever. Weird, but easily avoided.

0 replies

Leave a Reply

Want to join the discussion?
Feel free to contribute!

Leave a Reply

Your email address will not be published.