Tip #553 Remove duplicate files
The script below finds duplicate files (files with the same md5sum) in the directories passed as arguments and writes a new shell script, rem-duplicates.sh, containing a commented-out rm statement for every file that has a duplicate. Each group of identical files is separated by a blank line. Edit this output and uncomment the rm lines for the copies you want to delete, then run it.

OUTF=rem-duplicates.sh
echo "#! /bin/sh" > $OUTF
find "$@" -type f -print0 |
  xargs -0 -n1 md5sum |
  sort --key=1,32 |
  uniq -w 32 -d --all-repeated=separate |
  sed -r 's/^[0-9a-f]*( )*//;s/([^a-zA-Z0-9./_-])/\\\1/g;s/(.+)/#rm \1/' >> $OUTF
chmod a+x $OUTF
ls -l $OUTF
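Before generating the removal script, it can help to preview which files would be grouped together. The following is a minimal sketch of the same md5sum / sort / uniq pipeline without the rm-script step. It assumes GNU coreutils (for uniq -w and --all-repeated), and list_dupes is a hypothetical helper name that does not appear in the original tip.

```shell
#!/bin/sh
# list_dupes DIR
# Print the paths of files under DIR that share an md5sum with at least
# one other file. One path per line, duplicate groups separated by a
# blank line. Nothing is deleted or written to disk.
# NOTE: list_dupes is an illustrative name; GNU uniq is assumed.
list_dupes() {
    find "$1" -type f -exec md5sum {} + |
        sort |
        uniq -w 32 --all-repeated=separate |
        sed 's/^[0-9a-f]*  *//'   # strip the hash and the spaces after it
}
```

Calling list_dupes ~/Pictures, for example, prints only the duplicated files, so you can judge how large the generated script is likely to be.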