Data | 20 May 2026

Retraction Watch: Analysing the 69,911-Record Database

The Retraction Watch database has grown from a few thousand records in 2010 to 69,911 in April 2026. This analysis breaks down what is in it, how the AMI uses it, and what the growth pattern indicates.

Tags: Retraction Watch, research misconduct, database, D6, data

TL;DR

Retraction Watch database analysis as used in AMI v1.5: 69,911 total records (April 2026), 5,390 misconduct-linked. Growth from ~5,000 records in 2010 to 69,911 in April 2026, roughly 14x, reflects both detection improvement and the 2023 Crossref partnership. China leads in absolute count, followed by India and the US; per-publication normalisation shifts the rankings.

The database

Retraction Watch maintains the world's largest systematic catalogue of scientific retractions. The database contains structured records of:

Authors and institutional affiliations
Country attribution (proportional for multi-country papers)
Journal and publisher
Original publication and retraction dates
Retraction reason codes
Notice text

The data has been publicly available through the Crossref/GitLab partnership since 2023.

Total volume — 69,911 records (April 2026)

The full database includes all retractions Retraction Watch has catalogued, across all reasons:

Reason category	Approximate share
Misconduct (fabrication, falsification, fraud, manipulation)	~7.7% (5,390)
Plagiarism (in research context)	~10–15%
Honest errors	~25–30%
Duplicate publication	~15–20%
Ethics issues (consent, approvals)	~5–10%
Withdrawal at author request	~5–10%
Other / multiple / unclear	~15–20%

The AMI's D6 dimension uses only the misconduct-linked subset (the first row above). The other categories are excluded.
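As a quick check on the first row of the table, the misconduct share can be recomputed from the two counts quoted in this article. This is a minimal sketch; the constant and function names are illustrative, not part of the AMI codebase.

```python
TOTAL_RECORDS = 69_911       # full database, April 2026 (from this article)
MISCONDUCT_RECORDS = 5_390   # misconduct-linked subset used by D6


def share_percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


misconduct_share = share_percent(MISCONDUCT_RECORDS, TOTAL_RECORDS)
excluded_records = TOTAL_RECORDS - MISCONDUCT_RECORDS

print(f"Misconduct-linked share: {misconduct_share:.1f}%")   # 7.7%
print(f"Records excluded from D6: {excluded_records:,}")     # 64,521
```

Roughly 92% of the database therefore plays no part in the D6 score.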

Growth pattern

The database has grown substantially:

Year	Approx. records
2010	~5,000
2015	~10,000
2020	~30,000
2022	~45,000
2024	~65,000
April 2026	69,911
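The growth multiples implied by the table can be worked out directly. The sketch below uses the approximate counts above and treats April 2026 as the year 2026, which is a simplification; the function names are illustrative.

```python
# Approximate record counts from the table above; only the 2026 figure is exact.
records_by_year = {
    2010: 5_000,
    2015: 10_000,
    2020: 30_000,
    2022: 45_000,
    2024: 65_000,
    2026: 69_911,  # April 2026, treated here as a full year for simplicity
}


def growth_multiple(start_year: int, end_year: int) -> float:
    """How many times larger the database is at end_year than at start_year."""
    return records_by_year[end_year] / records_by_year[start_year]


def annual_growth_rate(start_year: int, end_year: int) -> float:
    """Compound annual growth rate between two years, as a fraction."""
    years = end_year - start_year
    return growth_multiple(start_year, end_year) ** (1 / years) - 1


print(f"2010 to 2026: {growth_multiple(2010, 2026):.1f}x")  # 14.0x
print(f"2015 to 2024: {growth_multiple(2015, 2024):.1f}x")  # 6.5x
print(f"2015 to 2020 annual rate: {annual_growth_rate(2015, 2020):.1%}")
```

On these figures the roughly 14x multiple applies to the full 2010 to 2026 span; 2015 to 2024 alone is about 6.5x.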

What drove the growth

Pre-2020: gradual growth as Retraction Watch's coverage expanded and more retractions accumulated over time. Real underlying retraction events were growing too as journals improved post-publication review.

2020–2022: paper mill detection efforts. Several publishers (notably PLOS, Wiley, and Hindawi) ran systematic retraction campaigns identifying paper mill-produced content. Each campaign added thousands of records.

2023 Crossref partnership: the database was made openly available through Crossref. Coverage gaps were filled; record completeness improved.

2024–2025: continued paper mill batch retractions. Wiley/Hindawi retractions added several thousand records.

Country attribution

Absolute counts

By raw count, China leads — it produces the most papers globally, so high absolute count is partly a scale effect.

Country	Rank by absolute count
China	Largest
India	Second
US	Third
Iran	Fourth
Italy	Fifth
Russia	Sixth

These rough rankings reflect approximate database counts; exact figures shift as the database updates.

Per-publication rate (the AMI methodology)

Dividing by total publications from OpenAlex normalises for scale. The AMI's D6 dimension uses per-publication rates rescaled to 0–100:

Country	D6	Position
China	100	Top
Russia	78	High
India	70	High
Iran	65	High
Pakistan	65	High

China still leads after normalisation, but the gap narrows. Russia and Iran rank disproportionately high once their smaller publication volumes are taken into account.
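The way normalisation reorders a ranking can be illustrated with invented numbers. The three countries and their figures below are hypothetical and are not taken from the Retraction Watch or OpenAlex data.

```python
# Hypothetical figures for illustration only; not real database values.
countries = {
    "Country A": {"retractions": 900, "publications": 1_500_000},
    "Country B": {"retractions": 200, "publications": 400_000},
    "Country C": {"retractions": 300, "publications": 2_000_000},
}


def rate_per_10k(retractions: int, publications: int) -> float:
    """Misconduct-linked retractions per 10,000 publications."""
    return 10_000 * retractions / publications


by_count = sorted(
    countries, key=lambda c: countries[c]["retractions"], reverse=True
)
by_rate = sorted(
    countries, key=lambda c: rate_per_10k(**countries[c]), reverse=True
)

print("By absolute count:", by_count)
print("By rate per 10,000:", by_rate)
```

Country A stays first under both orderings, while Country B, with the fewest retractions but also the fewest publications, moves ahead of Country C once volume is accounted for.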

What the 5,390 misconduct-linked records show

Fabrication and falsification

The clearest cases. Identifiable through statistical inconsistency, replication failure, or whistleblower reports. Approximately half of the 5,390 misconduct records are in this category.

Image manipulation

Particularly common in biomedical research. Specialised image forensics tools (Imagetwin, PaperWatcher) drove a wave of image manipulation detection between 2020 and 2024.

Plagiarism in research

Direct copying in research papers. Less common in modern publications than in older retracted papers; detection is now stronger pre-publication.

Manipulation of peer review

A growing category. Authors recommending fake reviewers, then those "reviewers" recommending acceptance. Several major journals have run retraction campaigns for these cases.

Paper mills

The largest single-cause category in the 2020–2024 growth. Paper mills sell ready-made papers to authors needing publications for career progression. Batch retractions are characteristic.

Specific famous cases in the database

The database includes individual records for the major cases discussed in academic integrity literature:

Diederik Stapel (Netherlands, 2011) — dozens of records, mostly social psychology
Hwang Woo-suk (South Korea, 2005–2006) — stem cell research retractions
STAP cells / Haruko Obokata (Japan, 2014) — Nature retractions
Marc Hauser (US, 2010) — cognitive psychology
Paolo Macchiarini (Sweden, 2014–2016) — trachea transplant research
Brian Wansink (US, 2018–2020) — food behaviour research

Each case is searchable in the Crossref/GitLab interface.

How the AMI uses the data

The AMI methodology:

Filter to misconduct-linked retractions (the 5,390 subset)
Country-attribute via author affiliation (proportional for multi-country)
Divide by OpenAlex publication counts for the matching country and time period
Calculate retractions per 10,000 publications
Rescale to 0–100 across the 39-country set

The result is each country's D6 dimension score.
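The five steps above can be sketched end to end as follows. This is an illustrative outline and not the AMI implementation: the record layout, the reason labels, the equal-split attribution and the min-max rescaling are all assumptions, since the article does not specify them.

```python
from collections import defaultdict

# Assumed reason labels that mark a record as misconduct-linked.
MISCONDUCT_REASONS = {"fabrication", "falsification", "fraud", "manipulation"}


def misconduct_rates(records, publications):
    """Steps 1-4: filter, attribute, divide, express per 10,000 papers."""
    counts = defaultdict(float)
    for rec in records:
        if not MISCONDUCT_REASONS & set(rec["reasons"]):
            continue  # step 1: keep misconduct-linked records only
        share = 1.0 / len(rec["countries"])
        for country in rec["countries"]:
            counts[country] += share  # step 2: equal split (assumed scheme)
    # steps 3-4: normalise by publication volume, per 10,000 publications
    return {c: 10_000 * counts[c] / publications[c] for c in publications}


def rescale_0_100(rates):
    """Step 5: min-max rescaling to 0-100 (an assumed choice of rescaling)."""
    lo, hi = min(rates.values()), max(rates.values())
    if hi == lo:
        return {c: 0.0 for c in rates}
    return {c: 100.0 * (r - lo) / (hi - lo) for c, r in rates.items()}


# Hypothetical input: four records and three countries.
records = [
    {"reasons": ["fabrication"], "countries": ["X", "Y"]},
    {"reasons": ["honest error"], "countries": ["X"]},
    {"reasons": ["fraud", "plagiarism"], "countries": ["X"]},
    {"reasons": ["falsification"], "countries": ["Z"]},
]
publications = {"X": 10_000, "Y": 10_000, "Z": 40_000}

rates = misconduct_rates(records, publications)
scores = rescale_0_100(rates)
print(rates)
print(scores)
```

Under min-max rescaling the lowest-rate country scores 0 and the highest scores 100. A different rescaling, such as dividing by the maximum, would keep the ranking but change the individual scores.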

Limitations

Detection-incidence confound

Retractions measure what gets caught. Fabrication that goes undetected never enters the database, so a high count can reflect stronger detection as much as higher incidence.

Country attribution complexity

Multi-country papers require proportional attribution; methodology choices affect specific scores.
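To show how the attribution choice changes the counts, the sketch below compares two common schemes on a few made-up papers: fractional counting (each paper split equally across its countries) and whole counting (each country gets full credit). The function name and the sample data are hypothetical, and the article does not say which weights the AMI uses.

```python
from collections import Counter


def attribute(papers, scheme="fractional"):
    """Sum per-country credit for a list of papers.

    Each paper is a list of country codes taken from author affiliations.
    """
    if scheme not in ("fractional", "whole"):
        raise ValueError(f"unknown scheme: {scheme}")
    totals = Counter()
    for paper_countries in papers:
        unique = sorted(set(paper_countries))  # count each country once per paper
        weight = 1.0 / len(unique) if scheme == "fractional" else 1.0
        for country in unique:
            totals[country] += weight
    return dict(totals)


# Hypothetical retracted papers and their affiliation countries.
papers = [["CN", "US"], ["CN"], ["IN", "US", "CN"]]

fractional = attribute(papers, "fractional")
whole = attribute(papers, "whole")
print(fractional)
print(whole)
```

Fractional credit sums to the number of papers (3 here), whereas whole counting sums to 6, which inflates totals for countries with many international collaborations.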

Reason coding inconsistency

Retraction notices use widely varying language; classification into misconduct categories requires interpretation by Retraction Watch staff.

Lag

Retractions often happen years after publication. Recent fabrication is under-represented.
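The lag can be measured from the original publication and retraction dates stored in each record. The following sketch uses invented date pairs; the function name and the figures are illustrative only.

```python
from datetime import date
from statistics import median


def lag_years(published: date, retracted: date) -> float:
    """Time from original publication to retraction, in years."""
    return (retracted - published).days / 365.25


# Hypothetical (publication date, retraction date) pairs.
date_pairs = [
    (date(2014, 6, 1), date(2019, 6, 1)),
    (date(2018, 1, 1), date(2020, 1, 1)),
    (date(2021, 3, 15), date(2022, 3, 15)),
]

lags = [lag_years(pub, ret) for pub, ret in date_pairs]
print(f"Median lag: {median(lags):.1f} years")
```

If the typical lag is several years, papers published in the most recent years have had little time to be retracted, so their counts understate eventual totals.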

Sources

Retraction Watch Database on GitLab
AMI v1.5 methodology document
OpenAlex publication count data
Fang, Steen & Casadevall (2012), PNAS

Frequently asked questions

How many retraction records are in the Retraction Watch database?

As of April 2026, the Retraction Watch database contains 69,911 retraction records. Of these, 5,390 are classified as misconduct-linked (fabrication, falsification, image manipulation, fraud). The remainder are honest-error retractions, duplicate publications, ethics issues, and other non-misconduct categories. The AMI's D6 dimension uses only the misconduct-linked subset.

Which countries have the most retractions?

In absolute count, China leads, followed by India and the US. After normalising by publication volume (retractions per 10,000 papers), the ranking shifts — China still leads but Russia, Iran, Egypt, and Pakistan move up the per-publication rankings. The AMI's D6 dimension uses the normalised rates, not absolute counts.

Why has the Retraction Watch database grown so rapidly?

Three factors. First, actual misconduct detection has improved through tools, peer review developments, and post-publication review platforms like PubPeer. Second, the 2023 Crossref partnership made the database substantially more accessible and complete. Third, systematic paper mill detection efforts have accelerated retraction processing, with large clusters of paper mill papers retracted in batches.

How to cite this article

APA: Booth, F. (2026). Retraction Watch: Analysing the 69,911-Record Database. Academic Misconduct Index. https://academicmisconductindex.com/blog/retraction-watch-69911-records

BibTeX: @misc{booth2026retraction, author={Booth, Francisco}, title={Retraction Watch: Analysing the 69,911-Record Database}, year={2026}, url={https://academicmisconductindex.com/blog/retraction-watch-69911-records}}

Francisco Booth

Independent researcher, founder of the Academic Misconduct Index
