Retraction Watch: Analysing the 69,911-Record Database

The Retraction Watch database has grown from a few thousand records in 2010 to 69,911 in April 2026. This analysis breaks down what is in it, how the AMI uses it, and what the growth pattern indicates.

TL;DR

Retraction Watch database analysis as used in AMI v1.5: 69,911 total records (April 2026), of which 5,390 are misconduct-linked. By absolute count, China leads attribution, followed by India and the US; per-publication normalisation shifts the rankings. The database grew roughly 14x, from about 5,000 records in 2010 to 69,911 in April 2026, reflecting both improved detection and the 2023 Crossref partnership.

The database

Retraction Watch maintains the world's largest systematic catalogue of scientific retractions. The database contains structured records of:

  • Authors and institutional affiliations
  • Country attribution (proportional for multi-country papers)
  • Journal and publisher
  • Original publication and retraction dates
  • Retraction reason codes
  • Notice text

The data has been openly available since 2023 through Retraction Watch's partnership with Crossref, which distributes it via GitLab.

Total volume — 69,911 records (April 2026)

The full database includes all retractions Retraction Watch has catalogued, across all reasons:

Reason category | Approximate share
Misconduct (fabrication, falsification, fraud, manipulation) | ~7.7% (5,390)
Plagiarism (in research context) | ~10–15%
Honest errors | ~25–30%
Duplicate publication | ~15–20%
Ethics issues (consent, approvals) | ~5–10%
Withdrawal at author request | ~5–10%
Other / multiple / unclear | ~15–20%

The AMI's D6 dimension uses only the misconduct-linked subset (the first row above). The other categories are excluded.
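As a quick check, the 7.7% in the first row follows directly from the two counts quoted above. The sketch below also shows one way a misconduct-linked subset could be filtered from reason strings. The keyword list, the semicolon-separated reason format, and the sample reasons are illustrative assumptions, not the AMI's or Retraction Watch's actual coding scheme.

```python
TOTAL_RECORDS = 69_911       # total records, from the article (April 2026)
MISCONDUCT_RECORDS = 5_390   # misconduct-linked records, from the article

share = MISCONDUCT_RECORDS / TOTAL_RECORDS
print(f"Misconduct-linked share: {share:.1%}")

# Hypothetical keyword list, mirroring the category label in the table above.
MISCONDUCT_KEYWORDS = ("fabrication", "falsification", "fraud", "manipulation")


def is_misconduct_linked(reasons: str) -> bool:
    """Return True if any semicolon-separated reason contains a keyword."""
    parts = [r.strip().lower() for r in reasons.split(";") if r.strip()]
    return any(keyword in part for part in parts for keyword in MISCONDUCT_KEYWORDS)


# Made-up reason strings, for illustration only.
sample_reasons = [
    "Falsification/Fabrication of Data; Investigation by Institution",
    "Error in Analyses",
    "Duplication of Article",
    "Manipulation of Images",
]
subset = [r for r in sample_reasons if is_misconduct_linked(r)]
print(subset)
```

Keyword matching of this kind is crude; in practice the classification depends on how the reason codes are assigned, which is discussed under Limitations.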

Growth pattern

The database has grown substantially:

Year | Approx. records
2010 | ~5,000
2015 | ~10,000
2020 | ~30,000
2022 | ~45,000
2024 | ~65,000
April 2026 | 69,911
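The growth multiples implied by the table can be checked with a few lines of arithmetic. This is a minimal sketch using the approximate figures above; the April 2026 snapshot is keyed as 2026 for convenience.

```python
# Approximate record counts from the table above.
records_by_year = {
    2010: 5_000,
    2015: 10_000,
    2020: 30_000,
    2022: 45_000,
    2024: 65_000,
    2026: 69_911,  # April 2026 snapshot
}


def growth_factor(start_year: int, end_year: int) -> float:
    """Ratio of record counts between two years in the table."""
    return records_by_year[end_year] / records_by_year[start_year]


for start, end in [(2010, 2024), (2015, 2024), (2010, 2026)]:
    print(f"{start} to {end}: {growth_factor(start, end):.1f}x")
```

On these figures the roughly 14x multiple corresponds to the full span from 2010 to April 2026; from 2015 to 2024 the multiple is about 6.5x.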

What drove the growth

Pre-2020: gradual growth as Retraction Watch's coverage expanded and retractions accumulated over time. The underlying number of retractions was also rising as journals improved post-publication review.

2020–2022: paper mill detection efforts. Several publishers (notably PLOS, Wiley, and Hindawi) ran systematic retraction campaigns targeting paper mill-produced content. Each campaign added thousands of records.

2023 Crossref partnership: the database was made openly available through Crossref. Coverage gaps were filled; record completeness improved.

2024–2025: continued paper mill batch retractions. Wiley/Hindawi retractions added several thousand records.

Country attribution

Absolute counts

By raw count, China leads — it produces the most papers globally, so high absolute count is partly a scale effect.

Country | Rank by absolute count
China | First (largest)
India | Second
US | Third
Iran | Fourth
Italy | Fifth
Russia | Sixth

These rough rankings reflect approximate database counts; exact figures shift as the database updates.

Per-publication rate (the AMI methodology)

Dividing by total publications from OpenAlex normalises for scale. The AMI's D6 dimension uses per-publication rates rescaled to 0–100:

Country | D6 | Position
China | 100 | Top
Russia | 78 | High
India | 70 | High
Iran | 65 | High
Pakistan | 65 | High

China still leads after normalisation, but the gap narrows. Russia and Iran score disproportionately high once their smaller publication volumes are taken into account.
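The article does not state which rescaling formula produces the 0–100 scores. Two common candidates are sketched below: dividing by the highest rate (the top country scores 100 and a zero rate scores 0) and min-max scaling (the lowest country scores 0). Both functions and the example rates are hypothetical, shown only to illustrate how the choice affects mid-table scores.

```python
def rescale_max(rates: dict[str, float]) -> dict[str, float]:
    """Scale so the highest rate maps to 100 and a zero rate maps to 0."""
    top = max(rates.values(), default=0.0)
    if top == 0:
        return {country: 0.0 for country in rates}
    return {country: round(100 * rate / top, 1) for country, rate in rates.items()}


def rescale_minmax(rates: dict[str, float]) -> dict[str, float]:
    """Scale so the lowest rate maps to 0 and the highest to 100."""
    lo = min(rates.values(), default=0.0)
    hi = max(rates.values(), default=0.0)
    if hi == lo:
        return {country: 0.0 for country in rates}
    return {
        country: round(100 * (rate - lo) / (hi - lo), 1)
        for country, rate in rates.items()
    }


# Hypothetical retraction rates per 10,000 publications.
example_rates = {"A": 8.0, "B": 6.0, "C": 2.0}
print(rescale_max(example_rates))
print(rescale_minmax(example_rates))
```

Under either option the top country scores 100, which is consistent with the table above, but the spacing of the remaining countries differs.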

What the 5,390 misconduct-linked records show

Fabrication and falsification

The clearest cases. Identifiable through statistical inconsistency, replication failure, or whistleblower reports. Approximately half of the 5,390 misconduct records are in this category.

Image manipulation

Particularly common in biomedical research. Specialised image forensics tools (Imagetwin, PaperWatcher) drove a wave of image manipulation detection between 2020 and 2024.

Plagiarism in research

Direct copying in research papers. Less common in modern publications than in older retracted papers; detection is now stronger pre-publication.

Manipulation of peer review

A growing category. Authors recommending fake reviewers, then those "reviewers" recommending acceptance. Several major journals have run retraction campaigns for these cases.

Paper mills

The largest single-cause category in the 2020–2024 growth. Paper mills sell ready-made papers to authors needing publications for career progression. Batch retractions are characteristic.

Specific famous cases in the database

The database includes individual records for the major cases discussed in academic integrity literature:

  • Diederik Stapel (Netherlands, 2011): dozens of records, mostly social psychology
  • Hwang Woo-suk (South Korea, 2005–2006): stem cell research retractions
  • STAP cells / Haruko Obokata (Japan, 2014): Nature retractions
  • Marc Hauser (US, 2010): cognitive psychology
  • Paolo Macchiarini (Sweden, 2014–2016): trachea transplant research
  • Brian Wansink (US, retractions from 2017 onward): food behaviour research

Each case is searchable in the Crossref/GitLab interface.

How the AMI uses the data

The AMI methodology:

  1. Filter to misconduct-linked retractions (the 5,390 subset)
  2. Country-attribute via author affiliation (proportional for multi-country)
  3. Divide by OpenAlex publication counts for matching country and time period
  4. Calculate retractions per 10,000 publications
  5. Rescale to 0–100 across the 39-country set

The result is each country's D6 dimension score.
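The five steps above can be sketched end to end on toy data. This is an illustration under stated assumptions, not the AMI's actual implementation: the `Retraction` record type, the equal split across distinct countries, the max-based rescaling, and all numbers are hypothetical, and the publication counts stand in for figures that would come from OpenAlex.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Retraction:
    record_id: int
    countries: list[str]      # countries of the author affiliations
    misconduct_linked: bool


def fractional_counts(records: list[Retraction]) -> dict[str, float]:
    """Steps 1-2: keep misconduct-linked records, split each across its countries."""
    counts: dict[str, float] = defaultdict(float)
    for rec in records:
        if not rec.misconduct_linked:
            continue
        unique = sorted(set(rec.countries))
        if not unique:
            continue
        weight = 1.0 / len(unique)  # assumed: equal share per distinct country
        for country in unique:
            counts[country] += weight
    return dict(counts)


def rates_per_10k(counts: dict[str, float], publications: dict[str, int]) -> dict[str, float]:
    """Steps 3-4: retractions per 10,000 publications for each country."""
    return {
        country: 10_000 * counts.get(country, 0.0) / total
        for country, total in publications.items()
        if total > 0
    }


def d6_scores(rates: dict[str, float]) -> dict[str, float]:
    """Step 5: rescale to 0-100 (assumed here: divide by the highest rate)."""
    top = max(rates.values(), default=0.0)
    if top == 0:
        return {country: 0.0 for country in rates}
    return {country: round(100 * rate / top, 1) for country, rate in rates.items()}


# Toy inputs; real publication counts would cover the same period as the retractions.
records = [
    Retraction(1, ["X"], True),
    Retraction(2, ["X", "Y"], True),
    Retraction(3, ["Y"], False),   # excluded: not misconduct-linked
    Retraction(4, ["Z"], True),
]
publications = {"X": 30_000, "Y": 10_000, "Z": 40_000}

counts = fractional_counts(records)
rates = rates_per_10k(counts, publications)
scores = d6_scores(rates)
print(counts, rates, scores, sep="\n")
```

In this toy example country Y has a third of X's fractional count but the same rate, because its publication volume is a third of X's. The same mechanism explains why smaller producers can rank high after normalisation.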

Limitations

Detection-incidence confound

Retractions measure what gets caught. Undetected fabrication, by definition, does not appear in the database, so a high rate can reflect stronger detection as well as higher incidence.

Country attribution complexity

Multi-country papers require proportional attribution; methodology choices affect specific scores.
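To make the effect of this choice concrete, the sketch below compares three common counting conventions on a single hypothetical paper. The function names are illustrative, and the article does not say which convention the AMI applies beyond describing it as proportional.

```python
from collections import Counter


def whole_counting(author_countries: list[str]) -> dict[str, float]:
    """Every country on the paper gets full credit for the retraction."""
    return {country: 1.0 for country in set(author_countries)}


def country_fractional(author_countries: list[str]) -> dict[str, float]:
    """One retraction split equally across the distinct countries."""
    unique = set(author_countries)
    if not unique:
        return {}
    return {country: 1.0 / len(unique) for country in unique}


def author_fractional(author_countries: list[str]) -> dict[str, float]:
    """One retraction split in proportion to each country's share of authors."""
    tally = Counter(author_countries)
    total = sum(tally.values())
    if total == 0:
        return {}
    return {country: n / total for country, n in tally.items()}


# Hypothetical paper: three authors affiliated in country A, one in country B.
paper = ["A", "A", "A", "B"]
print(whole_counting(paper))
print(country_fractional(paper))
print(author_fractional(paper))
```

Country B is credited with 1.0, 0.5, or 0.25 of the same retraction depending on the convention, which is why attribution choices can move individual scores.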

Reason coding inconsistency

Retraction notices use widely varying language; classification into misconduct categories requires interpretation by Retraction Watch staff.

Lag

Retractions often happen years after publication. Recent fabrication is under-represented.
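A rough way to see the lag effect is to measure the time between publication and retraction, then ask what share of those lags would fit inside a short observation window. The dates below are invented for illustration, and the helper names are hypothetical.

```python
from datetime import date
from statistics import median


def lag_years(published: date, retracted: date) -> float:
    """Years between original publication and retraction."""
    return (retracted - published).days / 365.25


def share_observable(lags: list[float], window_years: float) -> float:
    """Fraction of lags short enough to be seen within the window."""
    if not lags:
        return 0.0
    return sum(1 for lag in lags if lag <= window_years) / len(lags)


# Hypothetical (publication date, retraction date) pairs.
pairs = [
    (date(2012, 3, 1), date(2015, 3, 1)),
    (date(2016, 6, 15), date(2017, 6, 15)),
    (date(2010, 1, 10), date(2018, 1, 10)),
    (date(2019, 9, 1), date(2021, 9, 1)),
    (date(2014, 5, 20), date(2019, 5, 20)),
]
lags = [lag_years(published, retracted) for published, retracted in pairs]

print(f"Median lag: {median(lags):.1f} years")
print(f"Share visible within 2.5 years: {share_observable(lags, 2.5):.0%}")
```

If the lag distribution for recent papers resembles the historical one, a cohort published 2.5 years before the snapshot would show only the fraction of its eventual retractions given by `share_observable`, so recent years look cleaner than they are.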

Frequently asked questions

How many retraction records are in the Retraction Watch database?

As of April 2026, the Retraction Watch database contains 69,911 retraction records. Of these, 5,390 are classified as misconduct-linked (fabrication, falsification, image manipulation, fraud). The remainder are honest-error retractions, duplicate publications, ethics issues, and other non-misconduct categories. The AMI's D6 dimension uses only the misconduct-linked subset.

Which countries have the most retractions?

In absolute count, China leads, followed by India and the US. After normalising by publication volume (retractions per 10,000 papers), the ranking shifts — China still leads but Russia, Iran, Egypt, and Pakistan move up the per-publication rankings. The AMI's D6 dimension uses the normalised rates, not absolute counts.

Why has the Retraction Watch database grown so rapidly?

Three factors. First, misconduct detection has improved through better tools, changes in peer review, and post-publication review platforms such as PubPeer. Second, the 2023 Crossref partnership made the database substantially more accessible and complete. Third, systematic paper mill detection has accelerated retraction processing, with large clusters of paper mill papers retracted in batches.

How to cite this article

APA: Booth, F. (2026). Retraction Watch: Analysing the 69,911-Record Database. Academic Misconduct Index. https://academicmisconductindex.com/blog/retraction-watch-69911-records

BibTeX: @misc{booth2026retraction, author={Booth, Francisco}, title={Retraction Watch: Analysing the 69,911-Record Database}, year={2026}, url={https://academicmisconductindex.com/blog/retraction-watch-69911-records}}

Francisco Booth

Independent researcher, founder of the Academic Misconduct Index