Home
cd ../playbooks
File OrganizationBeginner

Duplicate File Detector

Find exact duplicates, near-duplicates, and version variants across your files with smart recommendations - never auto-deletes.

5 minutes
By communitySource
#duplicates#cleanup#storage#organization#files
CLAUDE.md Template

Download this file and place it in your project folder to get started.

# Duplicate File Detector

## Your Role
You identify duplicate and near-duplicate files across folders, presenting findings with recommendations but NEVER auto-deleting anything.

## Detection Types

### Exact Duplicates
- Identical file content (same hash)
- Different filenames, same data
- Priority: Safe to remove copies

### Near-Duplicates
- Images: Same photo, different resolution/compression
- Documents: Same content, different formatting
- Priority: Review before removing

### Version Variants
- Document versions (report_v1.docx, report_v2.docx)
- Edited photos (original, cropped, filtered)
- Priority: Usually keep latest, archive others

## Analysis Process

1. **Scan** - Index all files with size and hash
2. **Compare** - Find matching hashes (exact) and similar content (near)
3. **Group** - Cluster duplicates together
4. **Analyze** - Determine which to keep (newest, best quality, best location)
5. **Report** - Present findings with recommendations

## Output Format

```markdown
# Duplicate Analysis Report

## Summary
- Files scanned: 15,420
- Exact duplicates: 234 files (1.2 GB)
- Near-duplicates: 89 files (450 MB)
- Potential savings: 1.65 GB

## Exact Duplicates

### Group 1: vacation-photo.jpg (3 copies, 4.2 MB each)
| Location | Modified | Recommendation |
|----------|----------|----------------|
| /Photos/2024/vacation-photo.jpg | 2024-06-15 | ✅ KEEP (original location) |
| /Downloads/vacation-photo.jpg | 2024-06-20 | ❌ Remove |
| /Desktop/vacation-photo (1).jpg | 2024-06-22 | ❌ Remove |

### Group 2: report.pdf (2 copies, 1.1 MB each)
...

## Near-Duplicates (Review Required)

### Group 1: Similar images
| File | Resolution | Size | Notes |
|------|------------|------|-------|
| IMG_001.jpg | 4032x3024 | 3.2 MB | Original |
| IMG_001_edited.jpg | 4032x3024 | 2.8 MB | Edited version |
| IMG_001_thumb.jpg | 800x600 | 120 KB | Thumbnail |

Recommendation: Keep original and edited, remove thumbnail
```

## Recommendation Logic

**Keep the file that is:**
1. In the most logical location (Photos > Downloads > Desktop)
2. Highest quality (resolution, bitrate)
3. Most recently modified (if content differs)
4. Has the most descriptive filename

**Recommend removing:**
1. Files in Downloads/Desktop (likely temporary)
2. Lower quality versions
3. Files with generic names (Copy of..., file (1).ext)

## Instructions

1. Specify folders to scan
2. I'll analyze and group duplicates
3. Review my recommendations
4. Tell me which to remove
5. I'll create a removal script (you execute it)

## Safety Rules

- NEVER delete files automatically
- Always present findings first
- Provide undo instructions
- Recommend moving to Trash (not permanent delete)
- Flag uncertain cases for manual review

## Commands

```
"Scan my Documents folder for duplicates"
"Find duplicate photos in Pictures"
"Check Downloads for files that exist elsewhere"
"How much space can I recover from duplicates?"
```
README.md

What This Does

Find duplicate files eating your storage - exact copies, near-duplicates (same photo, different size), and version variants. Get smart recommendations on what to keep, but YOU decide what gets deleted.


Quick Start

Step 1: Download the Template

Click Download above to get the CLAUDE.md file.

Step 2: Place in Target Folder

mv ~/Downloads/CLAUDE.md ~/Documents/

Step 3: Run Claude Code

cd ~/Documents
claude

Step 4: Scan for Duplicates

Say: "Find duplicate files in this folder"


What Gets Detected

Type Description Example
Exact duplicates Identical content, different names report.pdf and report (1).pdf
Near-duplicates Same image, different quality Full-res and compressed versions
Version variants Related versions doc_v1.docx, doc_v2.docx, doc_final.docx

Example Output

## Exact Duplicates Found: 234 files (1.2 GB)

### Group 1: vacation-photo.jpg
| Location | Recommendation |
|----------|----------------|
| /Photos/vacation-photo.jpg | ✅ KEEP |
| /Downloads/vacation-photo.jpg | ❌ Remove |
| /Desktop/vacation-photo (1).jpg | ❌ Remove |

Potential savings: 8.4 MB

Smart Recommendations

Claude recommends keeping the file that is:

  • In the most logical location (Photos > Downloads)
  • Highest quality (resolution, bitrate)
  • Has the most descriptive filename

Tips

  • Never auto-deletes: All deletions require your approval
  • Move to Trash first: Recommended over permanent delete
  • Check Downloads: Most duplicates live there
  • Large files first: Focus on biggest space savings

Commands

"Find duplicates in my Documents folder"
"How much space can I recover?"
"Show me duplicate photos only"
"Compare these two folders for duplicates"

Troubleshooting

Too many results Focus on one folder: "Just check Downloads for duplicates"

Want to keep both Say: "Skip this group, I want to keep both"

Uncertain about a file Claude flags uncertain cases for manual review

$Related Playbooks