Retraction Watch: Analysing the 69,911-Record Database
The Retraction Watch database has grown from a few thousand records in 2010 to 69,911 in April 2026. This analysis breaks down what is in it, how the AMI uses it, and what the growth pattern indicates.
TL;DR
Retraction Watch database analysis as used in AMI v1.5: 69,911 total records (April 2026), 5,390 of them misconduct-linked. Growth from ~5,000 records in 2010 to ~65,000 in 2024 (about 13x) reflects both detection improvement and the 2023 Crossref partnership. China leads in absolute count, followed by India and the US; per-publication normalisation shifts the rankings.
The database
Retraction Watch maintains the world's largest systematic catalogue of scientific retractions. The database contains structured records of:
- Authors and institutional affiliations
- Country attribution (proportional for multi-country papers)
- Journal and publisher
- Original publication and retraction dates
- Retraction reason codes
- Notice text
The data has been publicly available via the Crossref/GitLab partnership since 2023.
Total volume — 69,911 records (April 2026)
The full database includes all retractions Retraction Watch has catalogued, across all reasons:
| Reason category | Approximate share |
|---|---|
| Misconduct (fabrication, falsification, fraud, manipulation) | ~7.7% (5,390) |
| Plagiarism (in research context) | ~10–15% |
| Honest errors | ~25–30% |
| Duplicate publication | ~15–20% |
| Ethics issues (consent, approvals) | ~5–10% |
| Withdrawal at author request | ~5–10% |
| Other / multiple / unclear | ~15–20% |
The AMI's D6 dimension uses only the misconduct-linked subset (the first row above). The other categories are excluded.
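As a rough illustration of that filter, the sketch below keeps only records whose reason codes mention a misconduct term. The `reason` field name, the semicolon-separated format, the sample records, and the keyword list are all assumptions made for the example; they are not the AMI's actual classification rules.

```python
# Hypothetical keyword list; the AMI's real mapping from reason codes to
# "misconduct-linked" is not specified in this article.
MISCONDUCT_TERMS = ("fabrication", "falsification", "fraud", "manipulation")


def is_misconduct_linked(reason_field: str) -> bool:
    """Return True if any semicolon-separated reason mentions a misconduct term."""
    reasons = [r.strip().lower() for r in reason_field.split(";") if r.strip()]
    return any(term in reason for reason in reasons for term in MISCONDUCT_TERMS)


# Made-up sample records for illustration only.
records = [
    {"id": 1, "reason": "Falsification/Fabrication of Data; Investigation by Institution"},
    {"id": 2, "reason": "Error in Analyses"},
    {"id": 3, "reason": "Duplication of Article"},
    {"id": 4, "reason": "Manipulation of Images"},
]

misconduct_subset = [r for r in records if is_misconduct_linked(r["reason"])]
sample_share = len(misconduct_subset) / len(records)

# The headline share quoted in the table: 5,390 of 69,911 records.
headline_share = 5_390 / 69_911

print([r["id"] for r in misconduct_subset])   # [1, 4]
print(f"{headline_share:.1%}")                # 7.7%
```

Keyword matching on free-text reasons is deliberately crude here; a production filter would more plausibly map each reason code explicitly.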
Growth pattern
The database has grown substantially:
| Year | Approx. records |
|---|---|
| 2010 | ~5,000 |
| 2015 | ~10,000 |
| 2020 | ~30,000 |
| 2022 | ~45,000 |
| 2024 | ~65,000 |
| April 2026 | 69,911 |
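The growth multiples implied by the table can be checked with a few lines of arithmetic. The sketch below uses only the approximate full-year figures from the table (the April 2026 count is left out because it covers a partial period), and the helper names are illustrative.

```python
# Approximate record counts taken from the growth table above.
records_by_year = {2010: 5_000, 2015: 10_000, 2020: 30_000, 2022: 45_000, 2024: 65_000}


def growth_multiple(start_year: int, end_year: int) -> float:
    """How many times larger the database was at end_year than at start_year."""
    return records_by_year[end_year] / records_by_year[start_year]


def annual_growth_rate(start_year: int, end_year: int) -> float:
    """Compound annual growth rate between two years in the table."""
    years = end_year - start_year
    return growth_multiple(start_year, end_year) ** (1 / years) - 1


print(growth_multiple(2010, 2024))   # 13.0
print(growth_multiple(2015, 2024))   # 6.5
print(f"{annual_growth_rate(2010, 2024):.1%}")
```

Because the inputs are rounded estimates, the resulting multiples are indicative rather than exact.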
What drove the growth
Pre-2020: gradual growth as Retraction Watch's coverage expanded and more retractions accumulated over time. Real underlying retraction events were growing too as journals improved post-publication review.
2020–2022: paper mill detection efforts. Several publishers (notably PLOS, Wiley, and Hindawi) ran systematic retraction campaigns identifying paper mill-produced content. Each campaign added thousands of records.
2023 Crossref partnership: the database was made openly available through Crossref. Coverage gaps were filled; record completeness improved.
2024–2025: continued paper mill batch retractions. Wiley/Hindawi retractions added several thousand records.
Country attribution
Absolute counts
By raw count, China leads — it produces the most papers globally, so high absolute count is partly a scale effect.
| Country | Rank by absolute count |
|---|---|
| China | 1 |
| India | 2 |
| US | 3 |
| Iran | 4 |
| Italy | 5 |
| Russia | 6 |
These rough rankings reflect approximate database counts; exact figures shift as the database updates.
Per-publication rate (the AMI methodology)
Dividing by total publications from OpenAlex normalises for scale. The AMI's D6 dimension uses per-publication rates rescaled to 0–100:
| Country | D6 | Position |
|---|---|---|
| China | 100 | Top |
| Russia | 78 | High |
| India | 70 | High |
| Iran | 65 | High |
| Pakistan | 65 | High |
China still leads after normalisation but the gap closes. Russia and Iran are disproportionately high once you account for their smaller absolute publication volume.
What the 5,390 misconduct-linked records show
Fabrication and falsification
The clearest cases. Identifiable through statistical inconsistency, replication failure, or whistleblower reports. Approximately half of the 5,390 misconduct records are in this category.
Image manipulation
Particularly common in biomedical research. Specialised image forensics tools (Imagetwin, PaperWatcher) drove a wave of image manipulation detections between 2020 and 2024.
Plagiarism in research
Direct copying in research papers. Less common in modern publications than in older retracted papers; detection is now stronger pre-publication.
Manipulation of peer review
A growing category. Authors recommending fake reviewers, then those "reviewers" recommending acceptance. Several major journals have run retraction campaigns for these cases.
Paper mills
The largest single-cause category in the 2020–2024 growth. Paper mills sell ready-made papers to authors needing publications for career progression. Batch retractions are characteristic.
Specific famous cases in the database
The database includes individual records for the major cases discussed in academic integrity literature:
- Diederik Stapel (Netherlands, 2011) — dozens of records, mostly social psychology
- Hwang Woo-suk (South Korea, 2005–2006) — stem cell research retractions
- STAP cells / Haruko Obokata (Japan, 2014) — Nature retractions
- Marc Hauser (US, 2010) — cognitive psychology
- Paolo Macchiarini (Sweden, 2014–2016) — trachea transplant research
- Brian Wansink (US, 2018–2020) — food behaviour research
Each case is searchable in the Crossref/GitLab interface.
How the AMI uses the data
The AMI methodology:
- Filter to misconduct-linked retractions (the 5,390 subset)
- Country-attribute via author affiliation (proportional for multi-country)
- Divide by OpenAlex publication counts for matching country and time period
- Calculate retractions per 10,000 publications
- Rescale to 0–100 across the 39-country set
The result is each country's D6 dimension score.
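The steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the AMI's implementation: the country labels, retraction records, and publication counts are made up, "proportional" attribution is assumed to mean an equal split across the countries on a paper, and the 0–100 rescaling is assumed to be min-max, since the article does not spell out either choice.

```python
from collections import defaultdict

# Made-up inputs for illustration; real inputs would come from the
# Retraction Watch database and OpenAlex.
misconduct_retractions = [
    {"countries": ["A"]},
    {"countries": ["A", "B"]},   # multi-country paper: half a retraction each
    {"countries": ["B"]},
    {"countries": ["A", "C"]},
]
publications = {"A": 20_000, "B": 30_000, "C": 50_000}  # hypothetical counts


def attribute_proportionally(retractions):
    """Split each retraction equally across the countries on the paper."""
    counts = defaultdict(float)
    for record in retractions:
        weight = 1 / len(record["countries"])
        for country in record["countries"]:
            counts[country] += weight
    return dict(counts)


def rate_per_10k(counts, pubs):
    """Retractions per 10,000 publications for every country in pubs."""
    return {c: 10_000 * counts.get(c, 0.0) / n for c, n in pubs.items()}


def rescale_0_100(rates):
    """Min-max rescale: lowest rate maps to 0, highest to 100 (an assumption)."""
    lo, hi = min(rates.values()), max(rates.values())
    if hi == lo:
        return {c: 0.0 for c in rates}
    return {c: 100 * (r - lo) / (hi - lo) for c, r in rates.items()}


counts = attribute_proportionally(misconduct_retractions)
rates = rate_per_10k(counts, publications)
d6 = rescale_0_100(rates)

for country in sorted(d6):
    print(country, counts[country], rates[country], round(d6[country], 1))
```

In this toy data, country A has the highest rate and so scores 100, while country C has the lowest and scores 0. If the real rescaling divides by the maximum instead of using min-max, the top score would still be 100 but the lowest would not be forced to 0.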
Limitations
Detection-incidence confound
Retractions measure what gets caught. Actual undetected fabrication is missing from the database.
Country attribution complexity
Multi-country papers require proportional attribution; methodology choices affect specific scores.
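To see how much the attribution choice matters, the sketch below compares two common counting schemes on made-up data: fractional counting (each paper contributes a total of 1, split equally across countries) and whole counting (each paper counts fully for every country involved). The paper list and country labels are hypothetical, and the equal split is only one possible reading of "proportional".

```python
from collections import Counter

# Made-up author-country lists for three retracted papers.
papers = [["A", "B"], ["A"], ["A", "B", "C"]]

fractional = Counter()   # each paper contributes a total of 1, split equally
whole = Counter()        # each paper counts fully for every country involved

for countries in papers:
    unique = sorted(set(countries))
    for country in unique:
        fractional[country] += 1 / len(unique)
        whole[country] += 1

for country in sorted(whole):
    print(country, round(fractional[country], 2), whole[country])
```

Under whole counting the country totals exceed the number of papers, which inflates scores for countries with many international collaborations; fractional counting keeps the total equal to the number of retractions.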
Reason coding inconsistency
Retraction notices use widely varying language; classification into misconduct categories requires interpretation by Retraction Watch staff.
Lag
Retractions often happen years after publication. Recent fabrication is under-represented.
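One way to gauge the lag is to measure the time between the original publication date and the retraction date, both of which the database records. The sketch below does this for a handful of made-up date pairs; the dates are illustrative and do not come from the database.

```python
from datetime import date
from statistics import median

# Made-up (publication date, retraction date) pairs for illustration.
records = [
    (date(2015, 3, 1), date(2016, 3, 1)),
    (date(2012, 6, 15), date(2019, 6, 15)),
    (date(2018, 1, 10), date(2021, 1, 10)),
]

# Convert each publication-to-retraction gap from days to years.
lags_years = [(retracted - published).days / 365.25 for published, retracted in records]
median_lag = median(lags_years)

print(f"median lag: {median_lag:.1f} years")   # median lag: 3.0 years
```

A multi-year median lag would mean that papers published in the most recent years have not yet had time to accumulate their eventual retractions.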
Sources
- Retraction Watch Database on GitLab
- AMI v1.5 methodology document
- OpenAlex publication count data
- Fang, Steen & Casadevall (2012), PNAS
Frequently asked questions
How many retraction records are in the Retraction Watch database?
As of April 2026, the Retraction Watch database contains 69,911 retraction records. Of these, 5,390 are classified as misconduct-linked (fabrication, falsification, image manipulation, fraud). The remainder are honest-error retractions, duplicate publications, ethics issues, and other non-misconduct categories. The AMI's D6 dimension uses only the misconduct-linked subset.
Which countries have the most retractions?
In absolute count, China leads, followed by India and the US. After normalising by publication volume (retractions per 10,000 papers), the ranking shifts — China still leads but Russia, Iran, Egypt, and Pakistan move up the per-publication rankings. The AMI's D6 dimension uses the normalised rates, not absolute counts.
Why has the Retraction Watch database grown so rapidly?
Three factors. First, actual misconduct detection has improved through tools, peer review developments, and post-publication review platforms like PubPeer. Second, the 2023 Crossref partnership made the database substantially more accessible and complete. Third, systematic paper mill detection efforts have accelerated retraction processing, with large clusters of paper mill papers retracted in batches.
How to cite this article
APA: Booth, F. (2026). Retraction Watch: Analysing the 69,911-Record Database. Academic Misconduct Index. https://academicmisconductindex.com/blog/retraction-watch-69911-records
BibTeX: @misc{booth2026retraction, author={Booth, Francisco}, title={Retraction Watch: Analysing the 69,911-Record Database}, year={2026}, url={https://academicmisconductindex.com/blog/retraction-watch-69911-records}}
Francisco Booth
Independent researcher, founder of the Academic Misconduct Index