FAQs About Our Data
Each year, billions of dollars are spent to improve the lives of citizens in developing countries. With accessible and relevant data at their fingertips, governments can make better decisions to plan for their country’s future, citizens can hold their leaders to account for providing public goods, and donors can invest aid dollars to maximize development results. Aiddata.org is a one-stop shop for anyone to download, visualize, and analyze data on $40 trillion in financing for development.
AidData integrates detailed project-level data on $6 trillion in aid from over 90 donor agencies with information on remittances and foreign direct investment. We are currently undertaking an initiative to geocode these projects – applying precise geographic coordinates to development activities – so that the data can be transformed into intuitive maps and analyzed at the subnational level.
Frequently Asked Questions (FAQs)
1. What types of data are available via aiddata.org?
AidData's core 3.0 database includes geocoded data from a variety of country aid information management systems (AIMS), donor IATI feeds, and open data initiatives like the World Bank’s Mapping for Results. Access these data through the AidData GIS Portal, where they can be combined, filtered, and overlaid with other layers of geographic data. Geocoded data can also be accessed through the AidData API, for use in external applications or visualizations, or can be exported to .csv or IATI .xml for use in research or analysis. AidData will be adding new geocoded data for many more countries and donors in the coming years.
In addition to providing a searchable database of more than 1 million aid activities from the 1940s to present, AidData provides access to several supplemental datasets containing useful information that has not yet been included in our core database. The sources of these supplemental datasets vary and include: replication datasets, donor datasets and results monitoring datasets. Some datasets provide a combination of information retrieved by AidData from the donor itself (e.g. financial details or descriptive information) and additional sector, activity, or geographical codes applied by AidData staff.
Supplemental datasets are not necessarily comparable to those in the main AidData database or to each other and may not be formatted per accepted international standards, such as that of the CRS or IATI. Users should be aware that these datasets are intended for isolated use and should not be combined with each other, nor should they be used as definitive sources of information. If you have a question about the appropriate use of a particular supplemental dataset, contact us at email@example.com or on our #aiddata Freenode channel.
2. How does AidData import data from the OECD CRS?
The Organisation for Economic Co-operation and Development (OECD) Creditor Reporting System (CRS) publishes two forms of donor-reported data: 1) project/transaction level data; and 2) aggregate level data. AidData imports the project/transaction level data and not the aggregate level data. AidData first imported project/transaction level OECD data with the 2002 OECD data release. At that time, the OECD published its aid information on CDs, and the information was uploaded directly from the CDs into AidData’s database. The OECD no longer provides physical CDs, and AidData now imports data using the OECD CRS txt files from the Bulk Download page. Information obtained from this page is equivalent to the data previously available on the physical CDs.
When the OECD publishes its yearly data, corrections are made to past records such that historical data may not be comparable across different OECD releases. These corrections happen for two main reasons. First, a donor can provide updated information about a project in any past reporting year. Second, the OECD may remove historical information for past aid recipient countries that become OECD donors. For example, past OECD releases document projects from Germany to Poland. Once Poland became a member of the OECD, these past recipient projects were expunged from the official OECD CRS data.
Rather than removing all past OECD data and importing the full new release every year, AidData takes an alternative approach to uploading the OECD CRS data. Using the 2002 OECD data release, AidData imported all of the historical data up to that year. For subsequent years, only the new year’s worth of data is imported into the database. For example, data from 1946-2002 comes from the 2002 OECD data CD; when the 2003 data was released, only that year’s data was imported into AidData’s database and added to the 1946-2002 data already available. Although the OECD may make changes to the historical data, these changes are not reflected in AidData’s database. This “accretion” model leads to differences between the OECD’s data and AidData’s reporting of the OECD data.
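The accretion model described above can be sketched in a few lines of Python. This is an illustration only, not AidData's import code: the record layout (dictionaries with `year`, `project_id`, and `amount`) and the function name `accrete` are hypothetical.

```python
def accrete(database, release, data_year):
    """Add only the newest data year from an OECD release to the database.

    Rows in `release` for earlier years are ignored, so later OECD
    corrections to historical records never reach `database`.
    """
    new_rows = [row for row in release if row["year"] == data_year]
    return database + new_rows


# Baseline: the 2002 release supplies all history up to 2002.
release_2002 = [
    {"year": 2001, "project_id": "A", "amount": 100},
    {"year": 2002, "project_id": "B", "amount": 200},
]
database = list(release_2002)

# The 2003 release revises project A and adds a new 2003 project.
release_2003 = [
    {"year": 2001, "project_id": "A", "amount": 150},  # OECD correction
    {"year": 2002, "project_id": "B", "amount": 200},
    {"year": 2003, "project_id": "C", "amount": 300},
]
database = accrete(database, release_2003, 2003)
```

In this sketch project A keeps its originally imported amount of 100 even though the later release reports 150, which is the kind of difference a user may see when comparing AidData with the current OECD CRS.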
3. When does the OECD CRS release their data?
Each year of Organisation for Economic Co-operation and Development (OECD) Creditor Reporting System (CRS) data is released with a two-year lag and in three different iterations. For example, 2012 OECD CRS data is released in January, April, and June of 2014. AidData has always imported the January release. For example, for data year 2011, AidData imported the OECD CRS data released in January 2013. For 2012, AidData imported the data released in January 2014.
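The release timing above reduces to simple arithmetic. The sketch below assumes the two-year lag and the January/April/June pattern hold for every data year, which the text only illustrates for 2011 and 2012; the function names are hypothetical.

```python
RELEASE_MONTHS = ("January", "April", "June")  # the three yearly iterations
RELEASE_LAG_YEARS = 2


def release_schedule(data_year):
    """Return the (month, year) pairs in which a data year is released."""
    release_year = data_year + RELEASE_LAG_YEARS
    return [(month, release_year) for month in RELEASE_MONTHS]


def imported_release(data_year):
    """Return the iteration AidData imports: the January release."""
    return release_schedule(data_year)[0]
```

For instance, `imported_release(2012)` gives `("January", 2014)`, matching the example in the text.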
4. What supplementary information does AidData provide for data from the OECD CRS?
The AidData purpose and activity coding scheme provides added granularity for project-level data from any source, including the Organisation for Economic Co-operation and Development (OECD) Creditor Reporting System (CRS), using reported descriptive information. AidData coders use the title and descriptive fields to assign one purpose code for each project (from a list nearly identical to the OECD CRS “main codes”) and as many activity codes as necessary to capture the project’s individual activities. Activity coding provides a standardized mechanism to identify the full range of activities undertaken in a given project as well as specific activities undertaken in projects across sectors. For a more thorough description of the activity coding methodology and to view the codebook, see http://aiddata.org/user-guide.
Note: Activity codes are not currently available for all projects in all years. A search by activity code on the dashboard will not provide a comprehensive picture of aid activities.
5. Does AidData assign financial amounts to individual activity codes?
AidData does not make assumptions about the division of financial amounts when a transaction has more than one activity code. Individual users determine the best approach for estimating financial allocation across codes. Many researchers choose to divide the total transaction amount equally across all activities. Examples using that approach and others can be found on the publications page.
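As a minimal sketch of the equal-division approach mentioned above (one of several a researcher might choose, and not something AidData applies itself), the function below splits a transaction amount evenly across its activity codes. The code labels are placeholders, not real AidData activity codes.

```python
def split_equally(amount, activity_codes):
    """Divide a transaction amount evenly across its distinct activity codes."""
    codes = list(dict.fromkeys(activity_codes))  # drop duplicates, keep order
    if not codes:
        return {}
    share = amount / len(codes)
    return {code: share for code in codes}


# A transaction of 900 with three (placeholder) activity codes.
allocation = split_equally(900.0, ["code_a", "code_b", "code_c"])
```

Each code receives 300.0 here. Other weighting schemes would replace the `share` calculation; the choice is left to the user, as noted above.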
6. Can AidData’s keyword search feature be used to find projects on a specific topic for which AidData does not currently have an activity code or to supplement projects for which there is little data?
AidData’s keyword search feature should not be used as a replacement for activity codes. Project descriptions can include extraneous contextual information that does not correspond with a funded activity. While AidData activity coders are trained to ignore this extraneous information, a keyword search will not reflect this nuance and will likely produce false positives. If your work requires a high degree of coding accuracy, AidData recommends that you do not rely on the keyword search feature. A broad set of purpose codes and/or activity codes can serve as an initial filter, though please note that activity codes are not universally available and will not produce a comprehensive dataset. If it is necessary to supplement a purpose or activity code search, a single-keyword search for each relevant keyword, or a search across titles only using exported data, is less likely to produce false positives.
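A title-only, single-keyword search over exported data might look like the sketch below. The column names (`project_id`, `title`, `description`) and the sample rows are assumptions for illustration; check the header of your own export before adapting it.

```python
import csv
import io
import re


def title_matches(rows, keyword):
    """Return rows whose title contains `keyword` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return [row for row in rows if pattern.search(row["title"])]


# Stand-in for an exported .csv file (hypothetical columns and content).
sample_csv = io.StringIO(
    "project_id,title,description\n"
    "1,Rural irrigation scheme,Canal rehabilitation in two districts\n"
    "2,Primary school construction,Region affected by irrigation shortages\n"
)
rows = list(csv.DictReader(sample_csv))
hits = title_matches(rows, "irrigation")
```

Only project 1 is returned. Project 2 mentions irrigation as background in its description, which is the kind of false positive a search across all text fields would pick up.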
7. What is the difference between a commitment and a disbursement? Which flow types does AidData’s data include for donors that report to the OECD CRS?
Aid flows are reported as both commitments and disbursements. As defined by the Organisation for Economic Co-operation and Development (OECD) Creditor Reporting System (CRS):
Total commitments per year comprise new undertakings entered in the year in question (regardless of when disbursements are expected) and additions to agreements made in earlier years.
A disbursement is the placement of resources at the disposal of a recipient country or agency, or in the case of internal development-related expenditures, the outlay of funds by the official sector. It can take several years to disburse a commitment. (Source: http://www.oecd.org/dac/stats/crsguide.htm)
In a given year of data, the OECD CRS data includes all reported commitment and disbursement transactions. Some projects may have both a commitment and disbursement. While commitments are generally tied to projects originating in that year, disbursements can be tied to projects originating in any year. For example, 2012 data can include disbursements of projects originating in 2012 or any prior year. For this reason, the full value of a project can be difficult to track through disbursements. Users should also be careful when aggregating project-level financial amounts to include only one flow type to avoid double counting.
AidData currently only imports transactions with commitments -- either commitment only or commitment + disbursement -- from the OECD. Disbursement only data must be sourced directly from the OECD. This difference may explain discrepancies in the number of transactions or total dollar amounts found when comparing yearly data between AidData and the OECD CRS. Beginning with 2013 data (which will be available in 2015), AidData will import all transactions from the OECD CRS.
Non-DAC and geocoded data include all financial flows obtained from the original data source -- commitment, disbursement, or commitment + disbursement. Again, users should be careful when aggregating project-level financial amounts to include only one flow type to avoid double counting.
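The double-counting caution can be illustrated with a short sketch. It assumes each exported record carries separate `commitment` and `disbursement` amounts, with `None` where a flow was not reported; these field names and the sample values are hypothetical.

```python
def total_flow(transactions, flow_type):
    """Sum a single flow type, skipping records where it was not reported."""
    if flow_type not in ("commitment", "disbursement"):
        raise ValueError("flow_type must be 'commitment' or 'disbursement'")
    return sum(t[flow_type] for t in transactions if t[flow_type] is not None)


# One project: committed in 2012, disbursed over two years.
transactions = [
    {"project_id": "P1", "year": 2012, "commitment": 1000, "disbursement": 400},
    {"project_id": "P1", "year": 2013, "commitment": None, "disbursement": 600},
]

committed = total_flow(transactions, "commitment")
disbursed = total_flow(transactions, "disbursement")
```

Both totals describe the same 1000 of financing. Adding the two columns together would report 2000, double the project's actual value.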
8. What is included in the AidData default option on the dashboard? How does this differ from the aid information drawn directly from the OECD CRS?
AidData’s default selection on the dashboard (a customizable data query interface) draws from a curated dataset that we believe to be the richest set of information for a given donor in a given year. For donors that belong to the Organisation for Economic Co-operation and Development (OECD) Development Assistance Committee (DAC), more detailed information may be available on donor or third party websites (“web-scraping”) or from print publications, such as donor annual reports. The richest data source is determined individually for each donor and for each year and is identified in any exported dataset in the column titled “Source.”
The AidData default selection also includes data from non-DAC donors. Non-DAC donors that do not report to the OECD’s Creditor Reporting System (CRS) are changing the development finance landscape, though estimates of the share of all Official Development Assistance (ODA) vary widely. Please visit this webpage for more information on methodologies for gathering data on non-DAC development finance. Please visit the donor datasets webpage for more information on specific non-DAC donors included in the dashboard and the research release. The results of AidData’s efforts to track Chinese aid are not available through the dashboard or the research release but can be explored separately at china.aiddata.org.
For both DAC and non-DAC donors, AidData has gathered some aid information that cannot be standardized and included in the core database. These stand-alone datasets can be found here.
AidData does not geocode data imported from the OECD CRS. To learn more about the geospatial information included in the dashboard, please see below.
9. What is the AidData Research Release? How does it differ from the Advanced Search dashboard?
AidData’s Research Release is a static snapshot of AidData’s project-level database at a specific point in time. It does not include any geospatial data. The 2.1 release is a snapshot from February 2012. OECD data, web-scraped data for the World Bank, and some non-DAC data are included in the 2.1 release. A list of other donor datasets, including non-DAC data, and whether or not they are included in the dashboard can be found here. Transaction amounts are deflated to constant 2009 US dollars in the static release, whereas amounts in the AidData dynamic dashboard are deflated to constant 2011 US dollars. This dataset has not undergone AidData’s rigorous QA process but is a reflection of the publicly accessible OECD data and other donor data. Only basic revisions have been made to the data to ensure total project numbers match what has been reported. In an effort to bring more data into the database, the portal is updated monthly. The static Research Release, by contrast, reflects only the data held within the database at a particular point in time, which makes it useful for scholars and other users who need stable and replicable results.
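Because the static release and the dashboard use different base years, amounts from the two are not directly comparable. The sketch below shows the general arithmetic for rebasing a constant-dollar amount from one base year to another using a price index. It is not AidData's deflation procedure, which is not described here, and the index values are invented for illustration.

```python
# Hypothetical price index values on a common base (NOT real deflators).
PRICE_INDEX = {2009: 100.0, 2011: 104.0}


def rebase(amount, from_year, to_year, index=PRICE_INDEX):
    """Convert an amount in constant `from_year` dollars to `to_year` dollars."""
    return amount * index[to_year] / index[from_year]


# 250 in constant 2009 dollars, expressed in constant 2011 dollars.
amount_2011 = rebase(250.0, 2009, 2011)
```

With the made-up index above, 250 constant 2009 dollars becomes 260 constant 2011 dollars. A real comparison would need the deflator series actually used for each donor.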
10. What geospatial data does AidData offer?
AidData produces geospatial datasets that provide subnational information about development finance projects. For country-specific geocoded datasets (e.g., Uganda, Nepal), the information is obtained from a given country’s Aid Information Management System (AIMS). More information about the AIMS and the Aid Management Fellows (AMF) that help collect this information can be found here. Donors typically report development finance information to partner governments through an AIMS. Projects with geographical information are then coded by AidData Research Assistants and a static version of the dataset is made available on the AidData website. AidData also produces subnationally geocoded donor datasets (e.g., World Bank, African Development Bank). The underlying information used to generate these datasets is usually obtained directly from that particular donor organization. These geocoded datasets are also available through the Advanced Search dashboard when users select the “Aid Management Systems” source. AidData’s geocoding methodology can be found here.
11. Why is the geospatial AIMS data in the Advanced Search dashboard different than what is available in the static releases?
Geospatial data obtained from the dashboard may be different from the data in the static releases. Data obtained from the Aid Information Management System (AIMS) is first geocoded and then undergoes a rigorous Quality Assurance (QA) process to ensure accurate project location information, logical dates, consistent transaction information, and deflation to constant 2011 US dollars. Once the process is complete, this static version of the AIMS data is made available on the geocoded datasets page. However, the geospatial data in the dashboard is a reflection of what was reported directly to the AIMS without undergoing the QA process. Thus, the static dataset and the data available in the dynamic dashboard will not necessarily match. In the future, efforts will be made to import the static release into the Advanced Search dashboard, which will result in the two platforms yielding the same data.
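Users working with dashboard data that has not been through QA may want to run their own basic checks. The sketch below illustrates one such check, for logical dates. It is not AidData's QA process; the field names `start_date` and `end_date` are hypothetical, and treating missing dates as unflagged is an assumption of this example.

```python
from datetime import date


def has_logical_dates(project):
    """Return False when a project ends before it starts."""
    start = project.get("start_date")
    end = project.get("end_date")
    if start is None or end is None:
        return True  # assumption: missing dates are not flagged by this check
    return start <= end


projects = [
    {"project_id": "P1", "start_date": date(2010, 1, 1), "end_date": date(2012, 6, 30)},
    {"project_id": "P2", "start_date": date(2011, 5, 1), "end_date": date(2009, 5, 1)},
    {"project_id": "P3", "start_date": date(2011, 5, 1), "end_date": None},
]
flagged = [p["project_id"] for p in projects if not has_logical_dates(p)]
```

Here only P2 is flagged, since its end date precedes its start date.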
AidData's Data Management Plan
AidData's Data Management Plan (DMP) details the full range of data products that AidData generates and the underlying processes used for each product, and provides a framework for upcoming data projects and releases.