Guide · 20 May 2026

How Turnitin Works: Plagiarism Detection and AI Detection

Turnitin is the most widely deployed academic plagiarism detection system globally. This guide explains how it works, what it catches well, and where it falls short, including on AI-generated content.

TL;DR

Turnitin compares submitted text against a corpus of web content, published literature, and previously submitted student work, and produces a similarity score with a detailed match report. AI detection was added in 2023, but its reliability is limited. Turnitin is used at most major Anglophone universities and is widely deployed globally. It is effective against direct copying but limited on patchwriting and AI-generated content (Scarfe et al. 2024: 94% of AI-generated submissions went undetected).

How the core plagiarism detection works

The corpus

Turnitin maintains a large corpus of text used for matching:

Internet content — web crawl covering public web pages
Academic publications — partnerships with publishers (Elsevier, Springer, Wiley, etc.) provide access to published literature
Student submissions — papers submitted by students at institutions using Turnitin are added to the corpus (with options to opt out per institution)
Other licensed databases — newspapers, magazines, e-books

The corpus is large, on the order of tens of billions of documents, and it keeps growing as client institutions submit new work.

Matching process

When a student submits a paper, Turnitin:

Processes the document (extracts text, normalises formatting)
Compares against the corpus using phrase-level similarity matching
Identifies matched passages and their sources
Generates a similarity report
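The steps above can be sketched with word n-gram ("shingle") overlap. Turnitin's actual matching algorithm is proprietary, so this is only an illustrative sketch under assumptions: the function names (normalise, shingles, find_matches), the 5-word window, and the two-document corpus are all hypothetical.

```python
import re

def normalise(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())

def shingles(words: list[str], n: int = 5) -> set[tuple[str, ...]]:
    """Return the set of overlapping n-word windows (empty if the text is shorter than n)."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def find_matches(submission: str, corpus: dict[str, str], n: int = 5) -> dict[str, int]:
    """Count the n-word shingles the submission shares with each corpus source."""
    sub = shingles(normalise(submission), n)
    counts = {}
    for source_id, text in corpus.items():
        shared = sub & shingles(normalise(text), n)
        if shared:
            counts[source_id] = len(shared)
    return counts

# Hypothetical toy corpus; a real system would index far more text.
corpus = {
    "web-001": "The mitochondria is the powerhouse of the cell and produces energy.",
    "paper-042": "Photosynthesis converts light energy into chemical energy in plants.",
}
submission = "As we know, the mitochondria is the powerhouse of the cell, which matters."
matches = find_matches(submission, corpus)
print(matches)
```

Only the source that shares a run of at least five consecutive words is reported. A production system would presumably use hashed fingerprints and an index rather than comparing against every document in turn.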

The similarity report

The report shows:

Overall similarity percentage (e.g. "23% similarity")
Individual matches highlighted in the text
Source documents for each match
Filter options to exclude quoted text, bibliographies, small matches
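As a rough illustration of how the overall percentage and the filter options interact, the sketch below counts the share of words covered by at least one match. The Match record, the similarity_percent function, and the word positions are hypothetical; Turnitin's own calculation is not public and may differ.

```python
from dataclasses import dataclass

@dataclass
class Match:
    start: int     # index of the first matched word in the submission
    end: int       # index one past the last matched word
    source: str    # identifier of the matched source
    quoted: bool   # True if the passage sits inside quotation marks

def similarity_percent(total_words: int, matches: list[Match],
                       exclude_quoted: bool = False, min_words: int = 0) -> float:
    """Percentage of submission words covered by at least one retained match."""
    covered = set()
    for m in matches:
        if exclude_quoted and m.quoted:
            continue  # filter: ignore quoted text
        if (m.end - m.start) < min_words:
            continue  # filter: ignore small matches
        covered.update(range(m.start, m.end))
    return round(100 * len(covered) / total_words, 1) if total_words else 0.0

matches = [
    Match(0, 40, "web-001", quoted=True),
    Match(30, 60, "paper-042", quoted=False),    # overlaps the first match
    Match(200, 205, "student-77", quoted=False), # a small 5-word match
]
raw = similarity_percent(500, matches)
filtered = similarity_percent(500, matches, exclude_quoted=True, min_words=8)
print(raw, filtered)
```

Overlapping matches are counted once because coverage is tracked per word. In this made-up case, applying the filters roughly halves the score, which is one reason the raw percentage needs context.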

Instructor interpretation

The similarity percentage alone does not indicate plagiarism. Instructors interpret the report:

23% with most matches being properly quoted and cited: usually fine
23% with the same matches being uncredited copying: misconduct
5% with one large uncited passage: misconduct
50% but all properly attributed: fine

The interpretation step is critical. Turnitin generates evidence; humans determine whether the evidence indicates misconduct.

AI detection — added 2023

Following ChatGPT's late 2022 launch, Turnitin developed AI-content detection capability:

Approach: statistical analysis of text features that distinguish AI-generated from human-written text. AI text tends to differ from human text in its word-distribution patterns, sentence-structure variation, and idiomatic characteristics.
Output: a percentage estimate of AI-generated content in the submission
Limitations: false positives (especially for non-native English speakers, who can produce text with patterns that flag as AI) and false negatives (lightly edited AI text and longer-form AI text often pass undetected)
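To make "statistical text features" concrete, the sketch below computes two toy measures: variation in sentence length (informally called burstiness) and the ratio of distinct words to total words. These are assumptions chosen for illustration only. Turnitin has not published its detector, the function names are hypothetical, and neither measure is a reliable AI indicator on its own.

```python
import re
from statistics import mean, pstdev

def sentence_lengths(text: str) -> list[int]:
    """Split on sentence-ending punctuation and count the words in each sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def burstiness(text: str) -> float:
    """Coefficient of variation of sentence length (standard deviation / mean)."""
    lengths = sentence_lengths(text)
    if not lengths:
        return 0.0
    return pstdev(lengths) / mean(lengths)

def type_token_ratio(text: str) -> float:
    """Distinct words divided by total words; a crude vocabulary-diversity measure."""
    words = re.findall(r"[a-z']+", text.lower())
    return len(set(words)) / len(words) if words else 0.0

uniform = "The cat sat down. The dog ran off. The bird flew away."
varied = ("Stop. The cat, startled by the sudden noise from the kitchen, "
          "bolted under the sofa. Silence.")

print(burstiness(uniform), burstiness(varied))
print(type_token_ratio(uniform))
```

The uniform passage has identical sentence lengths and therefore zero variation, while the varied passage scores higher. Features this simple also show why false positives arise: a human who writes in short, regular sentences would look "uniform" too.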

Scarfe et al. (2024)

The University of Reading study (Scarfe, P., et al., 2024) submitted AI-generated work through normal coursework channels at Reading. 94% of submissions went undetected — meaning the combination of human review and automated detection caught only 6%. The result indicates that current AI detection technology is well below what would be needed for reliable misconduct prevention.

The detection-evasion dynamic

AI submissions resist detection through:

Light editing: students editing AI text reduce its statistical signatures
Iterative prompting: students using AI to generate text from extensive prompts produce more human-like output
Paraphrasing: AI-generated content paraphrased by another AI tool or by the student is more likely to pass detection
Hybrid drafts: combining AI and human writing produces text without clear statistical signatures

Detection vendors are iterating; the cat-and-mouse dynamic is structural.

Deployment globally

The Academic Misconduct Index (AMI) R-Score includes a detection tools sub-component (R_det) that reflects deployment scope. The highest-scoring countries:

UK (R_det=90)
Australia (R_det=85)
US (R_det=80)
Ireland (R_det=75)
Canada (R_det=75)
New Zealand (R_det=70)

These are Anglophone countries where Turnitin has near-universal university adoption. AI detection has been rolled out alongside the existing plagiarism detection capability.

Language coverage and alternatives

Turnitin's language coverage

Strong: English (largest corpus)
Good: Spanish, French, German, Italian, Portuguese, Polish
Limited: many less-resourced languages

Language-specific alternatives

Some countries operate domestic detection systems:

Antiplagiat (Russia) — Russian-language detection
CopyKiller (South Korea) — Korean-language detection
JSA (Poland) — Polish-language detection, mandatory for theses
Compilatio (France) — French-language detection
PlagScan — German-language detection, now part of Turnitin

These systems often complement rather than replace Turnitin, with institutions running both for documents in different languages.

Strengths and limitations

What Turnitin catches well

Direct copying from publicly accessible web sources
Direct copying from major published literature
Cross-student copying within institutional and inter-institutional corpora
Self-plagiarism (with appropriate corpus settings)

What Turnitin misses or struggles with

Patchwriting and heavy paraphrasing
Translation plagiarism (copying from foreign-language sources)
Contract cheating (the original work is not in the corpus)
AI-generated content (currently)
Recently published content not yet indexed
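The paraphrasing gap can be illustrated with a toy overlap measure. The sketch below compares word trigram sets using Jaccard similarity; this is a stand-in technique chosen for illustration, not Turnitin's method, and the function names and example sentences are hypothetical.

```python
import re

def trigrams(text: str) -> set[tuple[str, ...]]:
    """Return the set of overlapping 3-word windows in the lowercased text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}

def jaccard(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams across both texts."""
    x, y = trigrams(a), trigrams(b)
    return len(x & y) / len(x | y) if (x | y) else 0.0

source = "Climate change is driven primarily by the burning of fossil fuels."
copied = "Climate change is driven primarily by the burning of fossil fuels."
paraphrased = "The main cause of global warming is that people burn coal, oil and gas."

print(jaccard(source, copied))       # verbatim copy
print(jaccard(source, paraphrased))  # same idea, different wording
```

The verbatim copy overlaps completely, while the paraphrase shares no three-word run with the source even though it conveys the same idea. Any matcher that relies on surface wording faces this limit to some degree.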

Inherent limits

Turnitin can only match against its corpus. Work copied from sources Turnitin cannot access (proprietary databases, recently written content not yet indexed, private documents) cannot be matched. Contract cheating produces "original" text that Turnitin has never seen, which makes it invisible to Turnitin by design.

Sources

Turnitin product documentation
Scarfe, P., et al. (2024). University of Reading AI submission study
AMI v1.5 methodology document
Vendor and corpus partnership documentation

Frequently asked questions

How does Turnitin detect plagiarism?

Turnitin compares submitted text against a large corpus including web content, published academic literature, and previously submitted student work (institutional and inter-institutional repositories). It produces a similarity report showing matched text and the percentage of the submission matched. Instructors review the report to distinguish acceptable matches (quotation, citation) from misconduct.

Can Turnitin detect ChatGPT and AI?

Turnitin added AI detection capability in 2023. The detector identifies text statistically likely to be AI-generated. However, reliability is limited — false positives and false negatives both occur. Scarfe et al. (2024) found that 94% of AI-generated submissions went undetected in a controlled study at the University of Reading. AI detection is an evolving capability rather than a solved problem.

What languages does Turnitin support?

Turnitin's core plagiarism detection is strongest in English, with substantial coverage in major European languages (Spanish, French, German, Italian, Portuguese, Polish). Less-resourced languages have weaker coverage. Other detection systems — Antiplagiat (Russian), CopyKiller (Korean), JSA (Polish), Compilatio (French) — provide language-specific alternatives in their respective markets.

How to cite this article

APA: Booth, F. (2026). How Turnitin Works: Plagiarism Detection and AI Detection. Academic Misconduct Index. https://academicmisconductindex.com/blog/how-turnitin-works

BibTeX:
@misc{booth2026how,
  author = {Booth, Francisco},
  title  = {How Turnitin Works: Plagiarism Detection and AI Detection},
  year   = {2026},
  url    = {https://academicmisconductindex.com/blog/how-turnitin-works}
}

Francisco Booth

Independent researcher, founder of the Academic Misconduct Index
