🔎 Document Similarity Checker
Compare files and generate a detailed report that detects similar content
Supported formats: PDF, TXT, DOCX, DOC (up to 5 files)
Free Document Similarity Checker — Compare PDF, DOCX & TXT Files
The TextDiffy document similarity checker lets you upload multiple files and compare their content against each other. Whether you're a teacher checking student submissions, a team reviewing document versions, or a writer detecting duplicate content between files, this tool gives you a clear, quantified similarity report.
What This Tool Does (and What It Doesn't)
This is a file-to-file comparison tool. It compares the documents you upload against each other. It does not search the internet for matching content. For internet-based plagiarism detection, dedicated services like Turnitin or Copyscape are required.
Ideal use cases:
- Comparing two versions of a report, contract, or article.
- Checking if student submissions are too similar to each other.
- Detecting duplicate or recycled content across internal documents.
- Verifying that a translated document covers all content from the source.
How the Similarity Algorithm Works
The tool uses a shingling algorithm combined with Jaccard similarity:
- Each document is split into overlapping sequences of 5 consecutive words ("shingles" or "n-grams").
- The shingle sets from each document are compared against every other document.
- The Jaccard score is computed: shared shingles ÷ total unique shingles across both documents.
- The resulting percentage is the share of distinct 5-word sequences that appear in both documents, so it measures shared wording rather than shared topics.
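The steps above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the tool's actual source: the function names (`tokenize`, `shingleSet`, `jaccard`), the lowercasing and punctuation-stripping rule, and the choice to score two empty documents as 0 are all assumptions made for the example.

```javascript
// Split text into lowercase word tokens.
// Assumption: the tool's real normalization rules are not documented.
function tokenize(text) {
  return text.toLowerCase().match(/[\p{L}\p{N}]+/gu) ?? [];
}

// Build the set of overlapping k-word shingles (k = 5, per the description).
function shingleSet(text, k = 5) {
  const words = tokenize(text);
  const shingles = new Set();
  for (let i = 0; i + k <= words.length; i++) {
    shingles.add(words.slice(i, i + k).join(" "));
  }
  return shingles;
}

// Jaccard similarity: shared shingles / total unique shingles, in [0, 1].
function jaccard(a, b) {
  if (a.size === 0 && b.size === 0) return 0; // assumption: empty vs. empty scores 0
  let shared = 0;
  for (const s of a) {
    if (b.has(s)) shared++;
  }
  return shared / (a.size + b.size - shared);
}

const docA = "the quick brown fox jumps over the lazy dog";
const docB = "the quick brown fox jumps over the sleepy cat";
const score = jaccard(shingleSet(docA), shingleSet(docB));
console.log((score * 100).toFixed(1) + "%"); // prints "42.9%"
```

Each sample sentence yields five shingles, three of which are shared, so the score is 3 ÷ 7. Note that a document shorter than five words produces no shingles at all under this scheme.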
Understanding the Similarity Score
- 0–20% — Low: Documents share very little text. Likely different content.
- 20–40% — Moderate: Some common content. Could be coincidental overlap or shared sources.
- 40–60% — High: Significant overlap. Warrants closer inspection of common passages.
- 60–100% — Very High: Large portions of text are shared. Likely duplicate or near-identical content.
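As a rough illustration, the bands above could be mapped to a verdict label like this. The function name `verdictFor` is hypothetical, and because the listed ranges share their endpoints, this sketch assumes that a boundary value (20, 40, or 60) falls into the higher band; the tool itself may decide differently.

```javascript
// Map a similarity percentage (0 to 100) to a verdict label.
// Assumption: boundary values 20, 40 and 60 belong to the higher band.
function verdictFor(percent) {
  if (!Number.isFinite(percent) || percent < 0 || percent > 100) {
    throw new RangeError("percent must be a number between 0 and 100");
  }
  if (percent < 20) return "Low";
  if (percent < 40) return "Moderate";
  if (percent < 60) return "High";
  return "Very High";
}

console.log(verdictFor(42.9)); // prints "High"
```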
Supported File Types
Upload any of the following file formats:
- TXT / MD: Plain text and Markdown files, read natively in the browser.
- PDF: Parsed using Mozilla's PDF.js library. Text content is extracted from all pages.
- DOCX / DOC: Parsed using Mammoth.js, which extracts raw text from Word documents. Mammoth.js targets the .docx format, so legacy binary .doc files may not extract reliably.
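A sketch of how in-browser extraction along these lines might look is shown below. It is not the tool's actual code. `parserKindFor` and `extractText` are hypothetical names, and the example assumes the PDF.js and Mammoth.js browser builds are already loaded as the globals `pdfjsLib` and `mammoth`, with the PDF.js worker configured separately.

```javascript
// Pick an extraction strategy from the file extension.
function parserKindFor(fileName) {
  const ext = fileName.split(".").pop().toLowerCase();
  if (ext === "txt" || ext === "md") return "text";
  if (ext === "pdf") return "pdf";
  if (ext === "docx" || ext === "doc") return "word";
  throw new Error(`Unsupported file type: ${fileName}`);
}

// Extract plain text from a browser File object without any upload.
// Assumes the globals `pdfjsLib` and `mammoth` exist on the page.
async function extractText(file) {
  const kind = parserKindFor(file.name);
  if (kind === "text") return file.text();

  const arrayBuffer = await file.arrayBuffer();
  if (kind === "word") {
    const result = await mammoth.extractRawText({ arrayBuffer });
    return result.value;
  }

  // PDF: collect the text items of every page.
  const pdf = await pdfjsLib.getDocument({ data: arrayBuffer }).promise;
  const pages = [];
  for (let n = 1; n <= pdf.numPages; n++) {
    const page = await pdf.getPage(n);
    const content = await page.getTextContent();
    pages.push(content.items.map((item) => item.str ?? "").join(" "));
  }
  return pages.join("\n");
}
```

In a page, `extractText` would typically be called once per entry of an `<input type="file" multiple>` element's `files` list, and the returned strings passed on to the comparison step.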
Privacy & Security
Your files are never uploaded to any server. All parsing and analysis happens entirely in your browser using JavaScript. Your confidential documents, contracts, or academic work remain 100% private on your device at all times.
Downloading the Similarity Report
After running the analysis, click "Download PDF Report" to generate a professional PDF document. The report includes:
- List of all analyzed files with sizes.
- A similarity matrix showing the percentage between every pair of documents.
- A verdict (Low / Moderate / High / Very High) for each comparison.
- Common passages found between each pair of documents.
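The similarity matrix in the report can be pictured as a symmetric table with one row and one column per file. The sketch below is only an illustration: `similarityMatrix` is a hypothetical name, `scoreFn` stands in for whatever pairwise scoring the tool uses, the scores are made-up sample values, and showing 100% on the diagonal is an assumption.

```javascript
// Build a symmetric N x N matrix of pairwise similarity percentages.
// `scoreFn(a, b)` is a hypothetical callback returning a value in [0, 1].
function similarityMatrix(docs, scoreFn) {
  const n = docs.length;
  const matrix = Array.from({ length: n }, () => new Array(n).fill(0));
  for (let i = 0; i < n; i++) {
    matrix[i][i] = 100; // assumption: a file compared with itself shows 100%
    for (let j = i + 1; j < n; j++) {
      // Round to one decimal place for display.
      const percent = Math.round(scoreFn(docs[i], docs[j]) * 1000) / 10;
      matrix[i][j] = percent;
      matrix[j][i] = percent;
    }
  }
  return matrix;
}

// Made-up scores for three files, keyed by "first|second".
const names = ["a.txt", "b.pdf", "c.docx"];
const stubScores = { "a.txt|b.pdf": 0.429, "a.txt|c.docx": 0.05, "b.pdf|c.docx": 0.8 };
const matrix = similarityMatrix(names, (x, y) => stubScores[`${x}|${y}`]);
console.log(matrix);
// [ [ 100, 42.9, 5 ], [ 42.9, 100, 80 ], [ 5, 80, 100 ] ]
```

Only the upper triangle is computed and then mirrored, so five files need 10 comparisons rather than 25.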