Linux: Shell Script for Removing Duplicate Files

So when it comes to removing duplicate files, this little script is the berries: it finds your duplicates and builds a shell script, called rem-duplicates.sh, to remove them all.

Inside it you will find all the removal lines commented out with a “#” – you uncomment the ones you want removed, save, and run it (there's a sample of the generated file a little further down). Simples!

#!/bin/bash
OUTF=rem-duplicates.sh;
echo "#! /bin/sh" > $OUTF;
# List every file size, keep the sizes that occur more than once, and md5sum only files of those sizes;
# then group identical checksums, strip the hash, escape awkward characters and append commented-out rm lines
find "$@" -type f -printf "%s\n" | sort -n | uniq -d | xargs -I@@ find "$@" -type f -size @@c -exec md5sum {} \; |
sort --key=1,32 | uniq -w 32 -d --all-repeated=separate | sed -r 's/^[0-9a-f]*( )*//;s/([^a-zA-Z0-9./_-])/\\\1/g;s/(.+)/#rm \1/' >> $OUTF;
chmod a+x $OUTF;
ls -l $OUTF
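
A typical end-to-end run goes something like this. I'm assuming you save the script above as find-dups.sh (that name is my own, pick whatever you like), and the directories are just examples; the script scans whatever paths you hand it, or the current directory if you give it none:

chmod +x find-dups.sh
./find-dups.sh ~/Pictures ~/Downloads   # writes rem-duplicates.sh into the current directory
nano rem-duplicates.sh                  # uncomment the rm lines for the copies you want gone
./rem-duplicates.sh                     # deletes only the files you uncommented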
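
And for the curious, the generated rem-duplicates.sh ends up looking roughly like this (the paths here are invented for illustration). Each group of identical files is separated by a blank line, and every copy in a group gets its own line, so leave at least one line per group commented out if you want to keep a copy:

#! /bin/sh
#rm /home/me/Pictures/holiday/IMG_0421.JPG
#rm /home/me/Downloads/IMG_0421.JPG

#rm /home/me/Documents/report.pdf
#rm /home/me/backup/report.pdf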

All credit here: Unix shell script for removing duplicate files.
