About Our Data Products
AidData carries out a wide range of data collection and value-adding activities to publish an expanding suite of data products.
For a searchable visualization of AidData's core data, please see the Dashboard.
For AidData's latest core Research Release, which contains a downloadable snapshot of our entire database, please visit our Research Release Datasets.
For datasets containing aid and project data geocoded at the local level, please visit our Sub-National, Geospatial Research Datasets.
For datasets on GCC and Chinese development finance, please visit our Donor Datasets.
How We Collect and Categorize Data
Data collection takes place along two distinct product lines:
- Aggregate Data: data represented as a single, composite value
- Project-Level Data: data consisting of distinct project activities with accompanying project information
AidData's data is sourced from various locations, including but not limited to:
- the OECD-DAC's Creditor Reporting System (CRS)
- Donor systems, such as official data from individual donor governments
- Recipient systems, such as Aid Information Management Systems
We add value to this data through our:
- Activity and purpose coding
- Geocoding, to produce our expanding repository of sub-national, geospatial research datasets
- Data curating, by linking, de-duplicating and presenting project records gathered from different sources or data collection activities
- Data quality assurance, by standardizing and verifying data according to standard AidData practice
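The linking and de-duplication step in the list above can be illustrated with a minimal sketch. It is not AidData's actual procedure: the field names (`donor`, `title`, `year`, `source`) and the matching rule (same donor, same year, same title after normalizing case and whitespace) are assumptions chosen only to show the idea of merging records gathered from different sources.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivially different titles match."""
    return " ".join(text.lower().split())


def link_and_deduplicate(records):
    """Merge records that share (donor, normalized title, year).

    The matching key is a hypothetical rule for illustration only.
    Each merged entry keeps the list of sources it was seen in.
    """
    merged = {}
    for rec in records:
        key = (rec["donor"], normalize(rec["title"]), rec["year"])
        if key not in merged:
            # First sighting: copy the record and start a list of sources.
            entry = {k: v for k, v in rec.items() if k != "source"}
            entry["sources"] = [rec["source"]]
            merged[key] = entry
        elif rec["source"] not in merged[key]["sources"]:
            # Duplicate from another source: link it to the existing entry.
            merged[key]["sources"].append(rec["source"])
    return list(merged.values())


# Illustrative records, not real data.
records = [
    {"donor": "Donor A", "title": "Rural Roads  Upgrade", "year": 2010, "source": "CRS"},
    {"donor": "Donor A", "title": "rural roads upgrade", "year": 2010, "source": "AIMS"},
    {"donor": "Donor B", "title": "School Construction", "year": 2011, "source": "CRS"},
]
projects = link_and_deduplicate(records)
```

Here the two "Donor A" records collapse into one project entry that lists both sources, while the "Donor B" record is kept as is. A real pipeline would likely need fuzzier matching and manual review.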
AidData also collects official project-level data on the development finance activities of non-OECD bilateral and multilateral donors. In addition, it collects data on Chinese development finance using its Tracking Under-reported Financial Flows (TUFF) methodology.
Please read AidData's Data Management Plan (DMP) for fuller descriptions of each of these activities and products. The DMP details the full range of data products that AidData generates and the underlying processes used for each product, and it provides a framework for upcoming data projects and releases.
For any questions or concerns about AidData's data, please email email@example.com.
Data Product Processing Levels
AidData's data products are built using hierarchical data processing levels, numbering 1-4 (with sub-processing levels such as Level 1a). Each processing level is built upon the previous one: the Level 1 product is derived from the Level 0 product, and so on. Table 1 below lists the processing levels used by AidData in data production, with their notional meanings and descriptions.
AidData Data Level
Raw data minimally processed to Level 0 field names and table structure for the product line, and converted to UTF-8. This product is for internal use only and will never be part of a public release.
Data post-processed to Level 1 field names and decomposed to the proper table structure per the product definition. Financials are deflated to base years where possible. Sectors are crosswalked to AidData sectors. Geocoded data is spatially scrubbed (verified to be in the correct boundary). Fields are checked for intra- and inter-field consistency. Level 1 statistics are generated and quality assurance (QA) flags are added. This level may include ancillary data in the release that is not quality assured by AidData. This is the first data product available as a public release.
A single-table, denormalized version of the Level 1 product, with the assumptions used provided. This is a ‘joined’ product of the Level 1 constituent tables or data.
Level 1 plus any quality-assured ancillary data (e.g. evaluation data); aggregates and rollups of data by a data dimension (e.g. by Sector or Donor).
Rasterizations - Continuous surface representations of our aid information. For geocoded data, a geospatial representation of (a) the estimated total dollars of aid at a given location, and (b) the number of projects. This product additionally provides at least one surface of the uncertainty in our continuous estimates.
Simulated Products - These products are further refined versions of the Level 2 product line. For example, a product might produce a continuous surface estimate of aid projects that is weighted to account for slope, population, or road networks. These products will also provide a surface of uncertainty.
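The relationship between the joined, rolled-up, and rasterized products described in the table can be sketched as follows. This is an illustration under stated assumptions, not AidData's production method: the table layouts, the field names, the even split of a project's dollars across its locations, and the 1-degree grid are all hypothetical, and the sketch omits the uncertainty surface that the real raster products provide.

```python
from collections import defaultdict

# Hypothetical normalized tables: projects, and their geocoded locations.
projects = {
    "P1": {"donor": "Donor A", "sector": "Transport", "usd": 100.0},
    "P2": {"donor": "Donor B", "sector": "Education", "usd": 60.0},
}
locations = [
    {"project_id": "P1", "lat": 0.2, "lon": 0.3},
    {"project_id": "P1", "lat": 1.6, "lon": 0.4},
    {"project_id": "P2", "lat": 0.7, "lon": 0.9},
]


def denormalize(projects, locations):
    """Join both tables into one flat table, one row per location.

    Assumption: a project's dollars are split evenly across its locations.
    """
    n_locations = defaultdict(int)
    for loc in locations:
        n_locations[loc["project_id"]] += 1
    rows = []
    for loc in locations:
        proj = projects[loc["project_id"]]
        share = proj["usd"] / n_locations[loc["project_id"]]
        rows.append({**loc, **proj, "usd": share})
    return rows


def rollup(rows, dimension):
    """Sum dollars by one data dimension, such as 'sector' or 'donor'."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row["usd"]
    return dict(totals)


def rasterize(rows, cell_size=1.0):
    """Bin rows into square grid cells: dollars and project count per cell."""
    dollars = defaultdict(float)
    project_ids = defaultdict(set)
    for row in rows:
        cell = (int(row["lat"] // cell_size), int(row["lon"] // cell_size))
        dollars[cell] += row["usd"]
        project_ids[cell].add(row["project_id"])
    counts = {cell: len(ids) for cell, ids in project_ids.items()}
    return dict(dollars), counts


flat = denormalize(projects, locations)
by_sector = rollup(flat, "sector")
dollars, counts = rasterize(flat)
```

With these illustrative inputs, `flat` has three rows, `by_sector` totals dollars per sector, and `dollars` and `counts` hold the two gridded surfaces (estimated dollars and number of projects per cell). A weighted surface of the kind mentioned for simulated products would replace the even split with weights derived from covariates such as population.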