Discussion of Public Participation Online Trail Databases

New - The details of the network algorithm and other issues related to trail databases are published in the 2004 JCDL conference proceedings. The paper is also available from TopoFusion and is entitled Digital Trail Libraries (5 MB, pdf format).

There is at present a growing need for current and accurate trail information. Those who visit natural areas know that good trail information is frequently scarce. Maps grow quickly out of date as trails are closed or re-routed. Often the land managers themselves have little or no trail data. There is also no standard for trail data.

With the advent of inexpensive consumer level GPS technology, a new possibility for collection of current and accurate trail data has presented itself. What better source of data than those who are actually using the trails? There are several advantages to this approach, foremost of which is the accuracy of the GPS data itself.

We consider a central database of user submitted GPS data. GPS data, in this case, consists of GPS track logs. A track log is a single trip recorded in the field by a trail user. The GPS device maintains a position lock throughout the trip and records position information (a bread crumb trail) at intervals. The GPS track is then downloaded to a computer, optionally cleaned up by hand, and submitted to the database.

Allowing user submissions of GPS data gives rise to a few major problems. First, different trail users can submit the same trails. This could be avoided by simply not allowing duplicate submissions, requiring the user or the database manager (a person) to enforce this. However, we will see how duplicate submissions can actually be of benefit.

User tracks can also overlap other tracks only partially. Imagine that a user travels ten miles on trails that are already found in the database but then branches off to a trail not in the database. This again could be taken care of by not allowing duplicates. But users are interested in logging their entire trip, not just a new portion. Requiring them to hand edit out portions is also undesirable. Instead, we would like an automatic method for combining GPS tracks which detects and eliminates duplicate tracks. The problem, then, goes from representing single tracks (a single trip) to representing whole networks of trails.
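As a rough illustration of the partial overlap problem (not the algorithm used in TopoFusion), the sketch below splits a submitted track into runs that lie near points already in the database and runs that are new. The function name, the 15 meter tolerance, and the use of planar coordinates in meters are all assumptions made for the example.

```python
import math

def split_known_and_new(track, known_points, tolerance=15.0):
    """Split `track` into consecutive runs of covered and new points.

    `track` and `known_points` are lists of (x, y) positions in meters
    (a hypothetical planar projection). A point counts as covered when it
    lies within `tolerance` meters of any known point. Returns a list of
    (covered, points) pairs in travel order.
    """
    runs = []
    for point in track:
        covered = any(math.dist(point, q) <= tolerance for q in known_points)
        if runs and runs[-1][0] == covered:
            runs[-1][1].append(point)   # extend the current run
        else:
            runs.append((covered, [point]))  # start a new run
    return runs

# A known trail along the x axis, sampled every 10 m.
known_points = [(float(x), 0.0) for x in range(0, 101, 10)]
# A trip that follows the known trail, then branches north.
trip = [(0.0, 3.0), (50.0, -4.0), (100.0, 2.0), (100.0, 60.0), (100.0, 120.0)]
runs = split_known_and_new(trip, known_points)
```

Here the first three points fall in a covered run and the last two in a new run. A brute-force scan like this compares every point against every known point, so a real database would need a spatial index.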

Another issue is one of data reliability. If anyone is allowed to submit data there is nothing to prevent submission of bogus tracks. Since it is impossible to verify whether submitted trails actually exist, some sort of reliability measure must be taken into account. This is one of the reasons that multiple submissions of the same trail are desirable. Each trail segment can be annotated with the number of submitted tracks that cover it. Segments with a high number of (unique user) submissions can most likely be relied upon. Further, a user ranking system could be employed, so that submitters of erroneous data can be penalized.
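The unique user count described above can be sketched in a few lines. The input shape (pairs of a user id and the segment ids a track covers) and the segment names are hypothetical, chosen only to show why repeat trips by the same user should not inflate a segment's support.

```python
from collections import defaultdict

def segment_support(submissions):
    """Count distinct submitters per trail segment.

    `submissions` is an iterable of (user_id, segment_ids) pairs, where
    segment_ids lists the network segments one submitted track covers.
    Both the input shape and the ids are assumptions for illustration.
    """
    users_by_segment = defaultdict(set)
    for user_id, segment_ids in submissions:
        for seg in segment_ids:
            users_by_segment[seg].add(user_id)  # a set ignores repeat users
    return {seg: len(users) for seg, users in users_by_segment.items()}

submissions = [
    ("alice", ["A", "B"]),
    ("bob", ["A"]),
    ("alice", ["A"]),        # second trip by alice adds no new support
    ("carol", ["A", "C"]),
]
support = segment_support(submissions)
```

Segment A ends up with a support of 3 even though four tracks cover it, while B and C each rest on a single submitter.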

The other advantage of obtaining multiple submissions for identical trail segments lies in the accuracy of the data. GPS errors can be reduced by averaging the segments together. The result is a better representation of the trails. Our hope is to obtain accuracy comparable to expensive DGPS (differential GPS) by averaging consumer level GPS tracks together.
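One simple way to average tracks, offered here as a minimal sketch and not as the method implemented in TopoFusion, is to resample each track to the same number of points spaced evenly by distance and then average point by point. The sketch assumes the tracks have already been matched to the same segment, run in the same direction, and use planar coordinates in meters; the function names are invented for the example.

```python
import math

def resample(track, n):
    """Return n points spaced evenly by distance along `track`.

    `track` is a list of at least two (x, y) points; n must be >= 2.
    """
    dists = [0.0]  # cumulative distance at each recorded point
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        dists.append(dists[-1] + math.hypot(x1 - x0, y1 - y0))
    total = dists[-1]
    out = []
    i = 0
    for k in range(n):
        target = total * k / (n - 1)
        # Advance to the recorded leg that contains the target distance.
        while i < len(track) - 2 and dists[i + 1] < target:
            i += 1
        span = dists[i + 1] - dists[i]
        t = 0.0 if span == 0 else (target - dists[i]) / span
        (x0, y0), (x1, y1) = track[i], track[i + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

def average_tracks(tracks, n=50):
    """Average several tracks of one segment into a single n-point line."""
    resampled = [resample(t, n) for t in tracks]
    m = len(resampled)
    return [
        (sum(r[k][0] for r in resampled) / m,
         sum(r[k][1] for r in resampled) / m)
        for k in range(n)
    ]

# Two recordings of a straight east-west trail, offset 2 m to either side
# and logged at different intervals.
track_a = [(0.0, 2.0), (50.0, 2.0), (100.0, 2.0)]
track_b = [(0.0, -2.0), (25.0, -2.0), (100.0, -2.0)]
averaged = average_tracks([track_a, track_b], n=5)
```

In this toy case the two offsets cancel and the averaged line lies on the trail. If the errors in N tracks were independent and unbiased, the standard error of the average would shrink by roughly a factor of the square root of N; real GPS errors are correlated, so the actual gain is smaller.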

Therefore, it is our view that developing automatic methods for collecting and classifying user submitted trail data is essential. One of the primary reasons for creating TopoFusion is to serve as a testbed for network GPS algorithms.

We are working on solving these and other GPS data related problems. Specifically, we have designed and implemented an algorithm that, given multiple GPS tracks as input, will produce a network (a graph) that represents each unique trail as a single trail segment. It also averages tracks together to produce a more accurate description of the trail(s).

To see and play with our GPS network algorithm, check out the demonstration version of TopoFusion. It's available free.

For example networks produced by the algorithm, see our GPX Networks. The networks are available in GPX format, an open standard for exchange of GPS data.

Check out, a user-submitted database of hiking trails. Geoff has done some great work on a web interface to detailed trail data, all in GPX format! He uses some heuristics to separate trails at their intersection points so that new trips can be planned, similar to TopoFusion's networks.

Also check out a nice site incorporating user submission of trails for Connecticut. is one of the best examples of user submitted, manually edited trail databases. It's going to be great to see this collection of data flourish.

And, Water Trails - User-edited water trails for kayaks and canoes in the San Francisco Bay Area

Application to Recreation Simulation

Besides being of benefit to trail users and land managers, we hope such a database of recreational trails will be useful for trail user simulation, conflict management and resource allocation. Agent based modeling approaches to recreation management are becoming increasingly popular. A key obstacle to applying trail user simulation in a widespread manner is the lack of accurate and current trail data. Often this data must be collected and classified by hand. A database of submitted trails would remove this obstacle. Modelers would also have the advantage of usage statistics on trails. Further, precise speed and elevation data can be extracted for use in the simulation.

Herein lies another advantage of the trail network algorithms. Agent based simulations require a network of trails--not unconnected trail segments. The topology of the network must be present, which is exactly what a GPS network algorithm will produce.
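To illustrate why topology matters, the sketch below stores a small, made-up trail network as a graph and lets a simulated trail user pick the shortest route between two junctions with Dijkstra's algorithm. The junction names and segment lengths are hypothetical, and this is not the simulation code in TopoFusion.

```python
import heapq

def shortest_route(network, start, goal):
    """Dijkstra's algorithm over a {node: {neighbor: length}} graph.

    Returns (total_length, path) or None if the goal is unreachable.
    """
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for nbr, length in network.get(node, {}).items():
            if nbr not in visited:
                heapq.heappush(queue, (dist + length, nbr, path + [nbr]))
    return None

# Hypothetical network: junctions as nodes, segment lengths in miles.
network = {
    "trailhead": {"junction": 1.5},
    "junction": {"trailhead": 1.5, "summit": 2.0, "lake": 0.8},
    "lake": {"junction": 0.8, "summit": 0.9},
    "summit": {"junction": 2.0, "lake": 0.9},
}
route = shortest_route(network, "trailhead", "summit")
```

The agent reaches the summit by way of the lake (about 3.2 miles) instead of the direct segment from the junction (3.5 miles). With unconnected segments there would be no junctions to route through, so a choice like this could not be modeled.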

We are currently implementing agent based modeling of trail users in TopoFusion. For more information on this part of the project, please see Trail Simulation.

Contact Us

We are interested in hearing from anyone working on these or related problems. Please contact us at: