Perl - finding duplicate files
Jul. 2nd, 2012 07:43 pm
Because I imported a bunch of files from various directories, I wasn't quite sure whether I'd duplicated my photo collection.
The solution? Well, there are probably quite a few. But here's one in Perl.
First download: ActiveState Perl.
http://www.activestate.com/activeperl/downloads
Then, open your favourite text editor (I'm starting to quite like TextPad - http://www.textpad.com/download/ ) but Notepad will do just fine.
Place into it the following code:
( Read more... )
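The script itself sits behind the cut and isn't reproduced here, so what follows is not the original code. It's a minimal sketch of how such a script might work, assuming it fingerprints each file's contents with an MD5 digest (using the core File::Find and Digest::MD5 modules) and groups files whose digests match. The starting directory, the report file name 'duplicates.txt' and the two sub names are all made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use Digest::MD5;

# Hypothetical starting directory - edit this to suit.
my @paths_to_process = ( 'C:\Users\me\Pictures' );

# Assumed name for the results file.
my $report_file = 'duplicates.txt';

# Return the MD5 digest of a file's contents, or undef if it can't be read.
sub file_digest {
    my ($path) = @_;
    open( my $fh, '<', $path ) or return;
    binmode($fh);
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close($fh);
    return $digest;
}

# Walk the given directories and return a hash reference mapping
# digest => list of paths, keeping only digests seen more than once.
sub find_duplicates {
    my @dirs = grep { -d $_ } @_;    # quietly skip anything that isn't a directory
    my %files_by_digest;
    return \%files_by_digest unless @dirs;

    find(
        {
            no_chdir => 1,
            wanted   => sub {
                return unless -f $File::Find::name;
                my $digest = file_digest($File::Find::name);
                push @{ $files_by_digest{$digest} }, $File::Find::name
                    if defined $digest;
            },
        },
        @dirs
    );

    # Throw away every digest that only one file had.
    delete $files_by_digest{$_}
        for grep { @{ $files_by_digest{$_} } < 2 } keys %files_by_digest;

    return \%files_by_digest;
}

my $duplicates = find_duplicates(@paths_to_process);

# Print each group of duplicates to the screen and to the report file.
open( my $out, '>', $report_file ) or die "Can't write $report_file: $!";
for my $digest ( sort keys %$duplicates ) {
    my $group = join( "\n  ", 'Duplicates:', sort @{ $duplicates->{$digest} } ) . "\n";
    print $group;
    print {$out} $group;
}
close($out);
```

Comparing digests instead of file names means renamed copies still get caught, at the cost of reading every file from start to finish.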
You'll - probably - need to edit '@paths_to_process'.
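In Perl, that's an array, and what you assign to it is a list. Lists are values separated by commas, with a round bracket on each end.
We use single quotes because \ has a special meaning in double quotes, and we just want the literal text. The one catch is that a backslash right before the closing quote has to be doubled, or Perl reads it as escaping the quote.
You could therefore do
@paths_to_process = ( 'C:\\' );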
and this would do all of your C drive. I wouldn't suggest that as a good idea, as it'll take a long time (because it has to open and read every file on your hard disk).
So I'd suggest sticking with directories where you know you have stuff that might be duplicated. (E.g. pictures directories - but this doesn't really care what the file type is.)
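As a sketch of what that might look like with more than one directory: both paths below are made up, so swap in your own. The 'my' is only there because the snippet runs under 'use strict'.

```perl
use strict;
use warnings;

# Hypothetical directories - replace with wherever your photos live.
my @paths_to_process = (
    'C:\Users\me\Pictures',
    'D:\Photo imports',
);

# Show what Perl actually sees for each entry.
print "$_\n" for @paths_to_process;
```

Inside single quotes the backslashes in the middle of a path stay as they are, and spaces in a directory name are fine.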
Anyway - then save that as 'duplicate_finder.pl' (or anything you like, basically, as long as it ends in '.pl', which tells Windows to hand the file to Perl when you double-click it). I'd suggest running it from a command prompt, but that's personal taste. (It prints text - if you double-click it, the window will probably vanish afterwards, but don't worry, as there'll be a text file there with the results.)