5 This page describes the development of the algorithm that is used in
6 Routino for finding routes.
12 The algorithm to find a route is fundamentally simple: Start at the
13 beginning, follow all possible routes and keep going until you reach
16 While this method does work, it isn't fast. To be able to find a route
17 quickly needs a different algorithm, one that can find the correct
18 answer without wasting time on routes that lead nowhere.
24 The simplest way to do this is to follow all possible segments from the
25 starting node to the next nearest node (an intermediate node in the
26 complete journey). For each node that is reached store the shortest
27 route from the starting node and the length of that route. The list of
28 intermediate nodes needs to be maintained in order of shortest overall
29 route on the assumption that there is a straight line route from here
31 At each point the intermediate node that has the shortest potential
32 overall journey time is processed before any other node. From the first
33 node in the list follow all possible segments and place the newly
34 discovered nodes into the same list ordered in the same way. This will
35 tend to constrain the list of nodes examined to be the ones that are
36 between the start and end nodes. If at any point you reach a node that
37 has already been reached by a longer route then you can discard that
38 route since the newly discovered route is shorter. Conversely if the
39 previously discovered route is shorter then discard the new route.
40 At some point the end node will be reached and then any routes with
41 potential lengths longer than this actual route can be immediately
42 discarded. The few remaining potential routes must be continued until
43 they are found to be shorter or have no possibility of being shorter.
44 The shortest possible route is then found.
46 At all times when looking at a node only those segments that are
47 possible by the chosen means of transport are followed. This allows the
48 type of transport to be handled easily. When finding the quickest route
49 the same rules apply except that criterion for sorting is the shortest
50 potential route (assuming that from each node to the end is the fastest
51 possible type of highway).
53 This method also works, but again it isn't very fast. The problem is
54 that the complexity is proportional to the number of nodes or segments
55 in all routes examined between the start and end nodes. Maintaining the
56 list of intermediate nodes in order is the most complex part.
62 The final algorithm that is implemented in the router is basically the
63 one above but with an important difference. Instead of finding a long
64 route among a data set of 8,000,000 nodes (number of highway nodes in
65 UK at beginning of 2010) it finds one long route in a data set of
66 1,000,000 nodes and a few hundred very short routes in the full data
67 set. Since the time taken to find a route is proportional to the number
68 of nodes the main route takes 1/10th of the time and the very short
69 routes take almost no time at all.
71 The solution to making the algorithm fast is therefore to discard most
72 of the nodes and only keep the interesting ones. In this case a node is
73 deemed to be interesting if it is the junction of two segments with
74 different properties. In the algorithm these are classed as
75 super-nodes. Starting at each super-node a super-segment is generated
76 that finishes on another super-node and contains the shortest path
77 along segments with identical properties (and these properties are
78 inherited by the super-segment). The point of choosing the shortest
79 route is that since all segments considered have identical properties
80 they will be treated identically when properties are taken into
81 account. This decision making process can be repeated until the only
82 the most important and interesting nodes remain.
84 To find a route between a start and finish point now comprises the
85 following steps (assuming a shortest route is required):
87 1. Find all shortest routes from the start point along normal segments
88 and stopping when super-nodes are reached.
89 2. Find all shortest routes from the end point backwards along normal
90 segments and stopping when super-nodes are reached.
91 3. Find the shortest route along super-segments from the set of
92 super-nodes in step 1 to the set of super-nodes in step 2 (taking
93 into account the lengths found in steps 1 and 2 between the
94 start/finish super-nodes and the ultimate start/finish point).
95 4. For each super-segment in step 3 find the shortest route between
96 the two end-point super-nodes.
98 This multi-step process is considerably quicker than using all nodes
99 but gives a result that still contains the full list of nodes that are
100 visited. There are some special cases though, for example very short
101 routes that do not pass through any super-nodes, or routes that start
102 or finish on a super-node. In these cases one or more of the steps
103 listed can be removed or simplified.
107 One of the important features of Routino is the ability to select a
108 route that is optimum for a set of criteria such as preferences for
109 each type of highway, speed limits and other restrictions and highway
112 All of these features are handled by assigning a score to each segment
113 while calculating the route and trying to minimise the score rather
114 than simply minimising the length.
117 When calculating the shortest route the length of the segment is
118 the starting point for the score.
121 When calculating the quickest route the time taken calculated
122 from the length of the segment and the lower of the highway's
123 own speed limit and the user's speed preference for the type of
124 highway is the starting point for the score.
127 If a highway has the oneway property in the opposite direction
128 to the desired travel and the user's preference is to obey
129 oneway restrictions then the segment is ignored.
131 Weight, height, width & length limits
132 If a highway has one of these limits and its value is less than
133 the user's specified requirement then the segment is ignored.
136 The highway preference specified by the user is a percentage,
137 these are scaled so that the most preferred highway type has a
138 weighted preference of 1.0 (0% always has a weighted preference
139 of 0.0). The calculated score for a segment is divided by this
143 The other highway properties are specified by the user as a
144 percentage and each highway either has that property or not. The
145 user's property preference is scaled into the range 0.0 (for 0%)
146 to 2.0 (for 100%) to give a weighted preference, a second
147 "non-property" weighted preference is calcuated in the same way
148 after subtracting the user's preference from 100%. If a segment
149 has this property then the calculated score is divided by the
150 weighted preference, if the segment does not have this property
151 then it is divided by the non-property weighted preference.
156 The hardest part of implementing this router is the data organisation.
157 The arrangement of the data to minimise the number of operations
158 required to follow a route from one node to another is much harder than
159 designing the algorithm itself.
161 The final implementation uses a separate table for nodes, segments and
162 ways. Each table individually is implemented as a C-language data
163 structure that is written to disk by a program which parses the
164 OpenStreetMap XML data file. In the router these data structures are
165 memory mapped so that the operating system handles the problems of
166 loading the needed data blocks from disk.
168 Each node contains a latitude and longitude and they are sorted
169 geographically so that converting a latitude and longitude coordinate
170 to a node is fast as well as looking up the coordinate of a node. The
171 node also contains the location in the array of segments for the first
172 segment that uses that node.
173 Each segment contains the location of the two nodes as well as the way
174 that the segment came from. The location of the next segment that uses
175 one of the two nodes is also stored; the next segment for the other
176 node is the following one in the array. The length of the segment is
177 also pre-computed and stored.
178 Each way has a name, a highway type, a list of allowed types of
179 traffic, a speed limit, any weight, height, width or length
180 restrictions and the highway properties.
182 The super-nodes are mixed in with the nodes and the super-segments are
183 mixed in with the segments. For the nodes they are the same as the
184 normal nodes, so just a flag is needed to indicate that they are super.
185 The super-segments are in addition to the normal segments so they
186 increase the database size (by about 10%) and are also marked with a
193 At the time of writing (April 2010) the OpenStreetMap data for Great
194 Britain (taken from GeoFabrik) contains:
196 + 8,767,521 are highway nodes
197 + 1,120,297 are super-nodes
199 + 1,412,898 are highways
200 o 9,316,328 highway segments
201 o 1,641,009 are super-segments
204 The database files when generated are 41.5 MB for nodes, 121.6 MB for
205 segments and 12.6 MB for ways and are stored uncompressed. By having at
206 least 200 MB or RAM available the routing can be performed with no disk
207 accesses (once the data has been read once).
212 Copyright 2008-2010 Andrew M. Bishop.