The GeoBoundaries team at AidData has now made over 600 sets of administrative boundaries freely available for 196 countries across the globe.
Feeling brave and want to jump straight to the data? Check out our early alpha soft launch at http://geoquery.org/geoboundaries/.
Every Sunday, nine students pile into a tiny lab in Williamsburg, Virginia, and begin working. One may be emailing the Venezuelan government, another taking a phone call with a data scientist in Estonia, and others analyzing the data licensing for an organization in Madagascar. These students are part of AidData’s GeoBoundaries team, a group dedicated to collecting accurate, open source data on administrative boundaries around the globe.
Students working on GeoBoundaries come from a variety of majors—including international relations, data science, and biology—and have experience ranging from field work in developing nations to internships working with the United Nations. What they all have in common is a desire to make spatial data more accessible and easier to use for everyone: whether the user is a data scientist using machine learning to analyze terabytes of spatial data, a project manager mapping new health clinics, or a researcher exploring conflict trends across Africa.
A key lesson these students are learning, and what drives the need for GeoBoundaries, is the importance of open source data. To date, few collections of global administrative boundary data exist. Those that do are either outdated and infrequently updated, or not freely redistributable (or both). Even finding accurate and open source boundary data for individual countries can be difficult. Some countries have well-maintained, open source data portals with the latest data easily accessible, while others require extensive searching only to find that the best options are years old and cannot be redistributed.
As spatial data continues to expand and plays a growing role in how we understand and make decisions about the world around us, the need for reliable and open source data will be paramount. By removing the burden of finding and making sense of critical administrative boundary data, AidData and the GeoBoundaries team hope to empower a broad range of data users across disciplines to produce new and meaningful research and insights.
As of this week, the GeoBoundaries team has collected over 600 sets of administrative boundaries from around the world, with nearly complete coverage at the ADM0 and ADM1 levels, and substantial coverage at the ADM2 level. Efforts have focused primarily on compiling as complete a collection as possible up to the ADM2 level before turning fully to finer-scale data (ADM3+). Finer-scale work is already well underway for Africa, which has been a considerable focus of ongoing research at AidData and of work by many other researchers.
Having formed initially out of a need to support work at AidData, GeoBoundaries focuses heavily on developing nations. Finding boundary data for these countries can be far harder than finding it for developed nations. After exhausting all avenues of searching for data online using existing and open sources, the team typically expands its search by reaching out to governments, NGOs, and other international organizations that may have collected or have access to boundary data. After finding and assessing a new set of boundary data, and securing permission to make the data freely available, the team repackages it into a standardized GeoBoundaries format.
The result of hundreds of phone calls, thousands of emails, and countless hours scouring the web for files is a collection of boundary data covering 53 countries in Africa: 53 ADM0, 52 ADM1, 45 ADM2, and 16 ADM4 boundaries.
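For readers who want to see what those counts mean as coverage, here is a minimal Python sketch that turns the reported figures into percentages of the 53 countries. It assumes each boundary set corresponds to one distinct country, which the post does not state explicitly. The level labels are copied as reported, and the names `reported_sets` and `coverage_share` are purely illustrative.

```python
# Counts as reported in the post; level labels are copied as written.
# Assumption: each boundary set covers exactly one distinct country.
AFRICA_COUNTRIES = 53
reported_sets = {"ADM0": 53, "ADM1": 52, "ADM2": 45, "ADM4": 16}


def coverage_share(count, total):
    """Return the percentage of countries covered, rounded to one decimal."""
    return round(100 * count / total, 1)


# Map each administrative level to its share of the 53 countries.
coverage = {
    level: coverage_share(count, AFRICA_COUNTRIES)
    for level, count in reported_sets.items()
}

for level, pct in coverage.items():
    print(f"{level}: {pct}% of countries")
```

Under that assumption, coverage is complete at ADM0, close to complete at ADM1, and falls off at the finer levels, which matches the team's stated priority of finishing the coarser levels first.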
This collection of open boundary data is a critical component of advancing research and development efforts related to low- and middle-income countries, where reliable and open data can be hard to come by. Just as importantly, users can reallocate the time they would have spent searching for boundary data to actually doing their research.
Editor’s note: This post was originally published on GeoQuery’s Revolutions Blog, and is republished here with the authors’ permission. Statistics for this article were computed by Rachel Oberman. Maps are by Leigh Seitz, John Napoli, Josh Panganiban, Graham Melville, Grace Grimsley, and Lauren Hobbs.
Rachel Oberman is the GeoQuery student lab director and leads all undergraduates on the lab teams, including the GeoBoundaries, GeoData, and GeoDev teams. Rachel is interested in pursuing a career in Computer Science and Data Science, and plans to graduate from William & Mary in May 2020.
Seth Goodman is a Data Engineer at AidData, and a PhD Candidate in Applied Science at William & Mary.
The views expressed here are those of the authors alone, and do not necessarily reflect the views of the institutions to which the authors belong.