How to Improve our Global Chinese Official Finance Data

Improving The Data - Get Involved

If you have additional information for a project, or believe our current information is incorrect, please let us know! Email us your comment or suggestion at china@aiddata.org along with any additional sources of information you would like to submit for our consideration.

While AidData’s Chinese Official Finance database was built by our own staff and researchers, it has since benefited from the input of dozens of independent contributors.

We’d like you to help us improve the data by identifying errors and omissions, and by suggesting alternative sources of information.

This public resource was created in anticipation of the fact that others who are knowledgeable about specific Chinese official finance activities would help improve the accuracy, scope and depth of the database over time.

AidData staff follow a specific set of procedures to review and approve suggested content (see Appendix A of the TUFF 1.3 Methodology for more details). If your contribution is approved, you will receive an email notification from our team to let you know that your comment has been integrated into the online record.

If you have any questions, you can email us at china@aiddata.org.

Measuring Data Improvement - Health of Record Scores

How we Measure the “Health” of a Project Record

The purpose of AidData’s “Health of Record” methodology is to rate the completeness and verifiability of each project record. The methodology produces two scores: a source triangulation score and a field completeness score. Our team uses these scores to prioritize project records that require further investigation and validation; external users can also use them to isolate and analyze project records with varying levels of data quality.

The public disclosure of these data quality scores is part of a larger effort at AidData to be as transparent as possible about the data it produces through the Tracking Under-reported Financial Flows (TUFF) methodology. For more information on how you can help improve the “health” of a particular project record, please see the section titled "Improving The Data - Get Involved".

Source Triangulation Score: This score, which varies from 0 to 20 (with higher scores representing better-sourced project records), is designed to capture the diversity and quality of the sources and source types used to construct individual project records. These sources include not only those codified in the TUFF methodology (e.g. media reports, government documents, and scholarly articles), but also sources gained via ground-truthing efforts.

Base Score: The base score is determined by the number of media reports used to source a project. It is informed by the actual distribution of sources in the database.

  • Projects receive 1 point for each additional media report beyond the first (i.e. the second report and above).
  • Points will be capped at 4 because of the diminishing value of additional media sources (due to repetition of information).
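As a rough illustration of the base score rule, here is a minimal Python sketch. It assumes that "(2 and above)" means the second and each subsequent media report earns 1 point, so a project with a single media report scores 0. The function name `base_score` is hypothetical and is not part of the TUFF methodology.

```python
def base_score(media_report_count: int) -> int:
    """Return the base score for a project record.

    Assumption: the first media report earns no points; each additional
    report earns 1 point, and the total is capped at 4.
    """
    if media_report_count < 0:
        raise ValueError("media_report_count must be non-negative")
    additional_reports = max(media_report_count - 1, 0)
    return min(additional_reports, 4)  # cap reflects diminishing value
```

Under this reading, a record with three media reports would have a base score of 2, and any record with five or more media reports would reach the cap of 4.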

Value Added Score: This score awards extra points to project records that are sourced from other, more credible sources. Extra points are awarded for each source type that informs a project record. Project records do not receive additional points for more than one source within each category; rather, this score is used to assess the diversity of source types attached to a project record.

  • Official Government Sources (Donor/Recipient): 3
  • Other Official Sources (non-Donor/non-Recipient): 3
  • Implementing Agency Source: 2
  • Academic Journal Articles/Other Academic Sources: 2
  • NGO/Civil Society/Advocacy: 1
  • Social Media, including unofficial Blogs: 1

Bonus Points: Additional points are awarded for ground-truthed or sky-truthed projects. Evidence of such a procedure is found in multimedia content uploaded to the page of a project record.

  • Successfully ground-truthed: 4 points
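The three components above can be combined as in the following sketch. It assumes the source triangulation score is the simple sum of the base score, the value added score, and the bonus points, which is consistent with the stated 0 to 20 range (4 + 12 + 4). The function name and the source-type keys are hypothetical labels chosen for illustration, and the base score assumes that only reports beyond the first earn points.

```python
# Points per source type, taken from the value added list above.
# The dictionary keys are illustrative labels, not official field names.
VALUE_ADDED_POINTS = {
    "official_government": 3,   # Donor/Recipient
    "other_official": 3,        # non-Donor/non-Recipient
    "implementing_agency": 2,
    "academic": 2,
    "ngo_civil_society": 1,
    "social_media": 1,
}
GROUND_TRUTH_BONUS = 4


def source_triangulation_score(media_reports, source_types, ground_truthed=False):
    """Sum the base score, value added score, and bonus points (assumed additive)."""
    # Base score: 1 point per media report beyond the first, capped at 4.
    base = min(max(media_reports - 1, 0), 4)
    # Value added: each source type counts once, however many sources it has.
    value_added = sum(VALUE_ADDED_POINTS[t] for t in set(source_types))
    bonus = GROUND_TRUTH_BONUS if ground_truthed else 0
    return base + value_added + bonus


example = source_triangulation_score(
    media_reports=3,
    source_types=["official_government", "official_government", "academic"],
)
print(example)  # 2 (base) + 3 + 2 (value added) = 7
```

Passing the source types through `set()` reflects the rule that a record earns no additional points for more than one source within the same category.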

Field Completeness Score: This score assesses a project record’s level of completeness (i.e. having all of its fields populated). It varies from 0 to 9; higher values represent project records with more populated fields. We prioritize the presence of 7 ‘key’ fields (defined below); for each of these fields that is missing information, a project record’s completeness score is reduced by 1 point. Additionally, a project record earns an extra point when any ‘high-value’ field (defined below) is populated. To ensure that the field completeness score only assumes positive values, all project records start with a base value of 8 before deductions begin. The theoretical maximum of this score is therefore 9.

High value fields:

  • Transaction Amount: Projects with missing financial amounts will receive a 1 point deduction
  • Commitment Year: Projects without a commitment year or tagged “year uncertain” will receive a 1 point deduction
  • Flow Class: “vague” records will receive a 1 point deduction
  • Flow Type: Vague-TBD/Unset records will receive a 1 point deduction
  • Sector: Unallocated/Unspecific projects will receive a 1 point deduction

Status: To identify records that merit an additional round of searches to see if new information is available, the completeness score will take status into account. It is reasonable to assume that completed or cancelled projects will not receive additional media coverage, whereas pipeline, implementing, or suspended projects could receive additional coverage.

  • Projects that are marked as completed or cancelled will receive 1 point since we can be confident that additional information will not be forthcoming.
  • Projects that are marked as pipeline or implementation receive 0 points.

Other fields:

  • Implementing/Accountable Agency: Projects without an implementing or accountable agency also lose a point.
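The deductions and the status point itemized above could be tallied roughly as follows. This is only a sketch: it covers the six deductions listed in this section, starts from the base value of 8, and adds 1 point for completed or cancelled projects. The dictionary keys, the label sets for "vague" values, and the choice to deduct only when both the implementing and accountable agency are missing are all assumptions made for illustration.

```python
BASE_VALUE = 8

# Assumed labels that count as missing or vague (compared case-insensitively).
VAGUE_FLOW_CLASS = {"", "vague"}
VAGUE_FLOW_TYPE = {"", "vague-tbd", "unset"}
VAGUE_SECTOR = {"", "unallocated", "unspecific", "unallocated/unspecific"}
CLOSED_STATUSES = {"completed", "cancelled"}


def _is_vague(value, vague_labels):
    """Treat None and any label in vague_labels as an unpopulated field."""
    return value is None or value.strip().lower() in vague_labels


def field_completeness_score(record: dict) -> int:
    """Base value minus one point per itemized deduction, plus the status point."""
    deductions = 0
    if record.get("transaction_amount") is None:
        deductions += 1
    if record.get("commitment_year") is None or record.get("year_uncertain", False):
        deductions += 1
    if _is_vague(record.get("flow_class"), VAGUE_FLOW_CLASS):
        deductions += 1
    if _is_vague(record.get("flow_type"), VAGUE_FLOW_TYPE):
        deductions += 1
    if _is_vague(record.get("sector"), VAGUE_SECTOR):
        deductions += 1
    # Assumption: deduct only if neither agency field is populated.
    if not record.get("implementing_agency") and not record.get("accountable_agency"):
        deductions += 1

    status = (record.get("status") or "").strip().lower()
    status_point = 1 if status in CLOSED_STATUSES else 0
    return BASE_VALUE - deductions + status_point


complete_record = {
    "transaction_amount": 1_000_000.0,
    "commitment_year": 2012,
    "flow_class": "ODA-like",
    "flow_type": "Loan",
    "sector": "Health",
    "status": "Completed",
    "implementing_agency": "Example Agency",
}
print(field_completeness_score(complete_record))  # 9
```

A fully populated, completed record reaches the theoretical maximum of 9 described above; the values in `complete_record` are invented purely to exercise the function.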