🔎 Document Similarity Checker

Compare files and detect similar content, with a report

Supported: PDF, TXT, DOCX, DOC (max. 5 files)

Free Document Similarity Checker — Compare PDF, DOCX & TXT Files

The TextDiffy document similarity checker lets you upload multiple files and compare their content against each other. Whether you're a teacher checking student submissions, a team reviewing document versions, or a writer detecting duplicate content between files, this tool gives you a clear, quantified similarity report.

What This Tool Does (and What It Doesn't)

This is a file-to-file comparison tool. It compares the documents you upload against each other. It does not search the internet for matching content. For internet-based plagiarism detection, you need a dedicated service such as Turnitin or Copyscape.

Ideal use cases:

  • Comparing two versions of a report, contract, or article.
  • Checking if student submissions are too similar to each other.
  • Detecting duplicate or recycled content across internal documents.
  • Verifying that a translated document covers all content from the source.

How the Similarity Algorithm Works

The tool uses a shingling algorithm combined with Jaccard similarity:

  • Each document is split into overlapping sequences of 5 consecutive words ("shingles" or "n-grams").
  • The shingle sets from each document are compared against every other document.
  • The Jaccard score is computed: shared shingles ÷ total unique shingles across both documents.
  • The resulting percentage indicates how much text is shared between two documents.
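
The steps above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the tool's actual source: the function names (tokenize, shingles, jaccard, similarityPercent) are hypothetical, and the normalization (lowercasing, splitting on non-alphanumeric characters) and the score for two empty documents are assumptions. Only the shingle size of 5 and the Jaccard formula come from the description.

```javascript
// Shingle size taken from the description above; everything else is assumed.
const SHINGLE_SIZE = 5;

// Assumed normalization: lowercase, split on anything that is not a letter or digit.
function tokenize(text) {
  return text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean);
}

// Build the set of overlapping k-word sequences ("shingles").
function shingles(text, k = SHINGLE_SIZE) {
  const words = tokenize(text);
  const set = new Set();
  for (let i = 0; i + k <= words.length; i++) {
    set.add(words.slice(i, i + k).join(" "));
  }
  return set;
}

// Jaccard similarity: shared shingles / unique shingles across both sets.
function jaccard(a, b) {
  if (a.size === 0 && b.size === 0) return 0; // assumption: two empty sets score 0
  let shared = 0;
  for (const s of a) {
    if (b.has(s)) shared++;
  }
  return shared / (a.size + b.size - shared);
}

// Similarity of two texts as a percentage between 0 and 100.
function similarityPercent(textA, textB) {
  return 100 * jaccard(shingles(textA), shingles(textB));
}
```

For example, "the quick brown fox jumps over the lazy dog" and "the quick brown fox jumps over the lazy cat" each produce 5 shingles, 4 of which are shared, so the score is 4 ÷ 6, or about 66.7%. Under this sketch, a text shorter than five words produces no shingles and therefore scores 0% against everything.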

Understanding the Similarity Score

  • 0–20% — Low: Documents share very little text. Likely different content.
  • 20–40% — Moderate: Some common content. Could be coincidental overlap or shared sources.
  • 40–60% — High: Significant overlap. Warrants closer inspection of common passages.
  • 60–100% — Very High: Large portions of text are shared. Likely duplicate or near-identical content.
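
A small sketch of how a percentage could be mapped to these labels. The function name verdict is hypothetical. The listed ranges overlap at 20, 40, and 60, so this sketch assumes that a boundary value falls into the higher band; the tool itself may round differently.

```javascript
// Map a similarity percentage to the labels listed above.
// Assumption: boundary values (20, 40, 60) belong to the higher band.
function verdict(percent) {
  if (!(percent >= 0 && percent <= 100)) {
    throw new RangeError("percent must be between 0 and 100");
  }
  if (percent < 20) return "Low";
  if (percent < 40) return "Moderate";
  if (percent < 60) return "High";
  return "Very High";
}
```

With this mapping, a score of 66.7% is reported as "Very High" and a score of exactly 20% as "Moderate".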

Supported File Types

Upload any of the following file formats:

  • TXT / MD — Plain text and Markdown files, parsed natively in the browser.
  • PDF — Parsed using Mozilla's PDF.js library. Text content is extracted from all pages.
  • DOCX / DOC — Parsed using Mammoth.js, which extracts raw text from Word documents.
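
As an illustration of how the three parsing paths might be wired together, here is a hedged sketch. The function name extractText and the choice to pass the libraries in as arguments are assumptions, not the tool's actual code. The library calls used (PDF.js getDocument, getPage, getTextContent, and Mammoth's extractRawText) are from those libraries' public APIs. The sketch routes .doc through Mammoth only because the list above says so; whether legacy binary .doc files parse successfully is not verified here.

```javascript
// Sketch only: `pdfjsLib` and `mammoth` are injected so that no assumption is
// made about how the page loads them. `extractText` is a hypothetical name.
async function extractText(fileName, arrayBuffer, { pdfjsLib, mammoth }) {
  const ext = fileName.split(".").pop().toLowerCase();

  // Plain text and Markdown: decode the bytes directly (UTF-8 assumed).
  if (ext === "txt" || ext === "md") {
    return new TextDecoder("utf-8").decode(arrayBuffer);
  }

  // PDF: collect the text items of every page, one line per page.
  if (ext === "pdf") {
    const pdf = await pdfjsLib.getDocument({ data: arrayBuffer }).promise;
    const pages = [];
    for (let n = 1; n <= pdf.numPages; n++) {
      const page = await pdf.getPage(n);
      const content = await page.getTextContent();
      pages.push(content.items.map((item) => item.str).join(" "));
    }
    return pages.join("\n");
  }

  // Word documents: Mammoth returns the raw text in `value`.
  if (ext === "docx" || ext === "doc") {
    const result = await mammoth.extractRawText({ arrayBuffer });
    return result.value;
  }

  throw new Error(`Unsupported file type: .${ext}`);
}
```

In a browser, the arrayBuffer argument would typically come from `await file.arrayBuffer()` on a File object selected by the user, which keeps the bytes on the device.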

Privacy & Security

Your files are never uploaded to any server. All parsing and analysis happens entirely in your browser using JavaScript. Your confidential documents, contracts, or academic work remain 100% private on your device at all times.

Downloading the Similarity Report

After running the analysis, click "Download PDF Report" to generate a professional PDF document. The report includes:

  • A list of all analyzed files with their sizes.
  • A similarity matrix showing the percentage between every pair of documents.
  • A verdict (Low / Moderate / High / Very High) for each comparison.
  • Common passages found between each pair of documents.
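
The similarity matrix in the report can be thought of as one Jaccard score per pair of documents. The sketch below is an assumption about how such a matrix could be built, not the tool's actual code: the names jaccardPercent and similarityMatrix are hypothetical, each document is assumed to be represented by a Set of shingle strings, and the diagonal is set to 100 by convention.

```javascript
// Jaccard score of two shingle sets as a percentage (0 when both are empty).
function jaccardPercent(a, b) {
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  for (const s of a) {
    if (b.has(s)) shared++;
  }
  return (100 * shared) / (a.size + b.size - shared);
}

// `docs` is an array of { name, shingles } objects (hypothetical shape).
// Returns an n x n matrix where matrix[i][j] is the score for docs i and j.
function similarityMatrix(docs) {
  const n = docs.length;
  const matrix = Array.from({ length: n }, () => new Array(n).fill(0));
  for (let i = 0; i < n; i++) {
    matrix[i][i] = 100; // convention: a document compared with itself
    for (let j = i + 1; j < n; j++) {
      const score = jaccardPercent(docs[i].shingles, docs[j].shingles);
      matrix[i][j] = score; // the score is symmetric,
      matrix[j][i] = score; // so each pair is computed once
    }
  }
  return matrix;
}
```

Because each pair is computed once, n files require n × (n − 1) ÷ 2 comparisons; at the 5-file maximum that is 10 pairs.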