git.maemo.org Git - routino/blob - doc/html/algorithm.html

   1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
   2 <HTML>
   3
   4 <!--
   5  Routino documentation - algorithm
   6
   7  Part of the Routino routing software.
   8
   9  This file Copyright 2008-2010 Andrew M. Bishop
  10
  11  This program is free software: you can redistribute it and/or modify
  12  it under the terms of the GNU Affero General Public License as published by
  13  the Free Software Foundation, either version 3 of the License, or
  14  (at your option) any later version.
  15
  16  This program is distributed in the hope that it will be useful,
  17  but WITHOUT ANY WARRANTY; without even the implied warranty of
  18  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  19  GNU Affero General Public License for more details.
  20
  21  You should have received a copy of the GNU Affero General Public License
  22  along with this program.  If not, see http://www.gnu.org/licenses/.
  23 -->
  24
  25 <HEAD>
  26 <TITLE>Routino : Algorithm</TITLE>
  27 <META http-equiv="Content-Type" content="text/html; charset=UTF-8">
  28 <LINK href="style.css" type="text/css" rel="stylesheet">
  29 </HEAD>
  30
  31 <BODY>
  32
  33 <!-- Header Start -->
  34
  35 <div class="header" align="center">
  36
  37 <h1>Routino : Algorithm</h1>
  38
  39 <hr>
  40 </div>
  41
  42 <!-- Header End -->
  43
  44 <!-- Content Start -->
  45
  46 <div class="content">
  47
  48 <h2><a name="H_1_1"></a>Algorithms</h2>
  49
  50 This page describes the development of the algorithm that is used in Routino for
  51 finding routes.
  52
  53 <h3><a name="H_1_1_1"></a>Simplest Algorithm</h3>
  54
  55 The algorithm to find a route is fundamentally simple: Start at the beginning,
  56 follow all possible routes and keep going until you reach the end.
  57 <p>
  58 While this method does work, it isn't fast.  To be able to find a route quickly
  59 needs a different algorithm, one that can find the correct answer without
  60 wasting time on routes that lead nowhere.
  61
  62 <h3><a name="H_1_1_2"></a>Improved Algorithm</h3>
  63
  64 The simplest way to do this is to follow all possible segments from the starting
  65 node to the next nearest node (an intermediate node in the complete journey).
  66 For each node that is reached store the shortest route from the starting node
  67 and the length of that route.  The list of intermediate nodes needs to be
  68 maintained in order of shortest overall route on the assumption that there is a
  69 straight line route from here to the end node.
  70 <br>
  71 At each point the intermediate node that has the shortest potential overall
  72 journey time is processed before any other node.  From the first node in the
  73 list follow all possible segments and place the newly discovered nodes into the
  74 same list ordered in the same way.  This will tend to constrain the list of
  75 nodes examined to be the ones that are between the start and end nodes.  If at
  76 any point you reach a node that has already been reached by a longer route then
  77 you can discard that route since the newly discovered route is shorter.
  78 Conversely if the previously discovered route is shorter then discard the new
  79 route.
  80 <br>
  81 At some point the end node will be reached and then any routes with potential
  82 lengths longer than this actual route can be immediately discarded.  The few
  83 remaining potential routes must be continued until they are found to be shorter
  84 or have no possibility of being shorter.  The shortest possible route is then
  85 found.
  86 <p>
  87 At all times when looking at a node only those segments that are possible by the
  88 chosen means of transport are followed.  This allows the type of transport to be
  89 handled easily.  When finding the quickest route the same rules apply except
  90 that criterion for sorting is the shortest potential route (assuming that from
  91 each node to the end is the fastest possible type of highway).
  92 <p>
  93 This method also works, but again it isn't very fast.  The problem is that the
  94 complexity is proportional to the number of nodes or segments in all routes
  95 examined between the start and end nodes.  Maintaining the list of intermediate
  96 nodes in order is the most complex part.
  97
  98 <h3><a name="H_1_1_3"></a>Final Algorithm</h3>
  99
 100 The final algorithm that is implemented in the router is basically the one above
 101 but with an important difference.  Instead of finding a long route among a data
 102 set of 8,000,000 nodes (number of highway nodes in UK at beginning of 2010) it
 103 finds one long route in a data set of 1,000,000 nodes and a few hundred very
 104 short routes in the full data set.  Since the time taken to find a route is
 105 proportional to the number of nodes the main route takes 1/10th of the time and
 106 the very short routes take almost no time at all.
 107 <p>
 108 The solution to making the algorithm fast is therefore to discard most of the
 109 nodes and only keep the interesting ones.  In this case a node is deemed to be
 110 interesting if it is the junction of two segments with different properties.  In
 111 the algorithm these are classed as <em>super-nodes</em>.  Starting at each
 112 super-node a <em>super-segment</em> is generated that finishes on another
 113 super-node and contains the <em>shortest</em> path along segments with identical
 114 properties (and these properties are inherited by the super-segment).  The point
 115 of choosing the shortest route is that since all segments considered have
 116 identical properties they will be treated identically when properties are taken
 117 into account.  This decision making process can be repeated until the only the
 118 most important and interesting nodes remain.
 119 <p>
 120 <img alt="Original data" src="example0.png"><br>
 121 <img alt="Iteration 1" src="example1.png"><br>
 122 <img alt="Iteration 2" src="example2.png"><br>
 123 <p>
 124 To find a route between a start and finish point now comprises the following
 125 steps (assuming a shortest route is required):
 126 <ol>
 127   <li>Find all shortest routes from the start point along normal segments and
 128   stopping when super-nodes are reached.
 129   <li>Find all shortest routes from the end point backwards along normal
 130   segments and stopping when super-nodes are reached.
 131   <li>Find the shortest route along super-segments from the set of super-nodes
 132   in step 1 to the set of super-nodes in step 2 (taking into account the lengths
 133   found in steps 1 and 2 between the start/finish super-nodes and the ultimate
 134   start/finish point).
 135   <li>For each super-segment in step 3 find the shortest route between the two
 136   end-point super-nodes.
 137 </ol>
 138 This multi-step process is considerably quicker than using all nodes but gives a
 139 result that still contains the full list of nodes that are visited.  There are
 140 some special cases though, for example very short routes that do not pass
 141 through any super-nodes, or routes that start or finish on a super-node.  In
 142 these cases one or more of the steps listed can be removed or simplified.
 143
 144 <h3><a name="H_1_1_4"></a>Routing Preferences</h3>
 145
 146 One of the important features of Routino is the ability to select a route that
 147 is optimum for a set of criteria such as preferences for each type of highway,
 148 speed limits and other restrictions and highway properties.
 149 <p>
 150 All of these features are handled by assigning a score to each segment while
 151 calculating the route and trying to minimise the score rather than simply
 152 minimising the length.
 153 <dl>
 154   <dt>Segment length
 155   <dd>When calculating the shortest route the length of the segment is the
 156   starting point for the score.
 157   <dt>Speed preference
 158   <dd>When calculating the quickest route the time taken calculated from the
 159   length of the segment and the lower of the highway's own speed limit and the
 160   user's speed preference for the type of highway is the starting point for the
 161   score.
 162   <dt>Oneway restriction
 163   <dd>If a highway has the oneway property in the opposite direction to the
 164   desired travel and the user's preference is to obey oneway restrictions then
 165   the segment is ignored.
 166   <dt>Weight, height, width &amp; length limits
 167   <dd>If a highway has one of these limits and its value is less than the user's
 168   specified requirement then the segment is ignored.
 169   <dt>Highway preference
 170   <dd>The highway preference specified by the user is a percentage, these are
 171   scaled so that the most preferred highway type has a weighted preference of
 172   1.0 (0% always has a weighted preference of 0.0).  The calculated score for a
 173   segment is divided by this weighted preference.
 174   <dt>Highway properties
 175   <dd>The other highway properties are specified by the user as a percentage and
 176   each highway either has that property or not.  The user's property preference
 177   is scaled into the range 0.0 (for 0%) to 2.0 (for 100%) to give a weighted
 178   preference, a second "non-property" weighted preference is calcuated in the
 179   same way after subtracting the user's preference from 100%.  If a segment has
 180   this property then the calculated score is divided by the weighted preference,
 181   if the segment does not have this property then it is divided by the
 182   non-property weighted preference.
 183 </dl>
 184
 185 <h3><a name="H_1_1_5"></a>Implementation</h3>
 186
 187 The hardest part of implementing this router is the data organisation.  The
 188 arrangement of the data to minimise the number of operations required to follow
 189 a route from one node to another is much harder than designing the algorithm
 190 itself.
 191 <p>
 192 The final implementation uses a separate table for nodes, segments and ways.
 193 Each table individually is implemented as a C-language data structure that is
 194 written to disk by a program which parses the OpenStreetMap XML data file.  In
 195 the router these data structures are memory mapped so that the operating system
 196 handles the problems of loading the needed data blocks from disk.
 197 <p>
 198 Each node contains a latitude and longitude and they are sorted geographically
 199 so that converting a latitude and longitude coordinate to a node is fast as well
 200 as looking up the coordinate of a node.  The node also contains the location in
 201 the array of segments for the first segment that uses that node.
 202 <br>
 203 Each segment contains the location of the two nodes as well as the way that the
 204 segment came from.  The location of the next segment that uses one of the two
 205 nodes is also stored; the next segment for the other node is the following one
 206 in the array.  The length of the segment is also pre-computed and stored.
 207 <br>
 208 Each way has a name, a highway type, a list of allowed types of traffic, a speed
 209 limit, any weight, height, width or length restrictions and the highway
 210 properties.
 211 <p>
 212 The super-nodes are mixed in with the nodes and the super-segments are mixed in
 213 with the segments.  For the nodes they are the same as the normal nodes, so just
 214 a flag is needed to indicate that they are super.  The super-segments are in
 215 addition to the normal segments so they increase the database size (by about
 216 10%) and are also marked with a flag.
 217
 218 <h3><a name="H_1_1_6"></a>Practicalities</h3>
 219
 220 At the time of writing (April 2010) the OpenStreetMap data for Great Britain
 221 (taken from
 222 <a class="ext" href="http://download.geofabrik.de/osm/europe/" title="GeoFabrik Mirror of OpenStreetMap Data">GeoFabrik</a>
 223 ) contains:
 224 <ul>
 225   <li>14,675,098 nodes
 226   <ul>
 227     <li>8,767,521 are highway nodes
 228     <li>1,120,297 are super-nodes
 229   </ul>
 230   <li>1,876,822 ways
 231   <ul>
 232     <li>1,412,898 are highways
 233     <ul>
 234       <li>9,316,328 highway segments
 235       <li>1,641,009 are super-segments
 236     </ul>
 237   </ul>
 238   <li>60,572 relations
 239 </ul>
 240
 241 The database files when generated are 41.5 MB for nodes, 121.6 MB for segments
 242 and 12.6 MB for ways and are stored uncompressed.  By having at least 200 MB or
 243 RAM available the routing can be performed with no disk accesses (once the data
 244 has been read once).
 245
 246
 247 </div>
 248
 249 <!-- Content End -->
 250
 251 <!-- Footer Start -->
 252
 253 <div class="footer" align="center">
 254 <hr>
 255
 256 <address>
 257 &copy; Andrew M. Bishop = &lt;amb "at" gedanken.demon.co.uk&gt;
 258 </address>
 259
 260 </div>
 261
 262 <!-- Footer End -->
 263
 264 </BODY>
 265
 266 </HTML>