Reputation: 133
The question was answered here to the OP's liking, but I couldn't make it work 100% for me. What I want to do is take the files in /home/hermit/Documents/Pictures, hash them, and put the renamed files in /home/hermit/Documents/HashPictures, while keeping the original files in /home/hermit/Documents/. Unfortunately, the solution doesn't seem to work for GIFs and JPGs.
Or GNU sed can do it even shorter:
# md5sum * | sed -e 's/\([^ ]*\) \(.*\(\..*\)\)$/mv -v \2 \1\3/e'
It would also be nice to have a script that is easy to read, or an explanation to go along with the script.
EDIT: These are the remaining files (in /home/hermit/Documents/Pictures) and the terminal output.
Remaining files: [image: File names inside]
Terminal output:
hermit@hermit:~/Documents/PicturesHashed$ ./hash.sh
mv: target '9c48b6846aa3211ba867d9775aa9a730.jpg' is not a directory
mv: target '6cef7445eb7382aa719e364dc2d0126c.jpg' is not a directory
mv: target 'b3624eae0010f7d042af838859d5ea0e.png' is not a directory
mv: target '12f8f700cc73abe05da61103184f2ed0.jpg' is not a directory
mv: target '340e018ba57016f469a1039fb19c2619.jpg' is not a directory
mv: target '89da545ea3084500cd86a6265676173c.jpg' is not a directory
mv: target '7ff0671fc0447ca009d216670a0e2ac9.gif' is not a directory
mv: target '300d7e1e9807701f1a5043de85992484.jpg' is not a directory
mv: target 'c340521eec897957c0a7d6f415232ae4.png' is not a directory
mv: target '263ef6fd0b8623227a705bbcecb61755.gif' is not a directory
mv: target '2f4e522461ff467d5b4a09b7d33c2114.jpg' is not a directory
mv: target '2372edeb385381540d2230266ad5a4d2.png' is not a directory
mv: target 'bf5fc13be51d281347e0b00694c7689b.jpg' is not a directory
mv: target '3ab04030a8d06ff5aa5dca406c3927b0.jpg' is not a directory
mv: target '84d61abe2ff50e81d96e9b5ca916048e.jpg' is not a directory
mv: target 'c1c74496d880e4a20403c65e583dff54.jpg' is not a directory
mv: target '99c2a10e1f4ce27a08eafb70cbac09c1.jpg' is not a directory
mv: target '7ff0671fc0447ca009d216670a0e2ac9.gif' is not a directory
mv: target 'e27c3fe527a6417e13f2b55865b77d4f.jpg' is not a directory
mv: target 'd32b6aa0ff3929b477fe5e33872220d1.png' is not a directory
mv: target '70df8a56449a7b19b286e0b77394a7c8.jpg' is not a directory
mv: target '7e9b7446ea3fe662fa7ba3ba45952cbf.jpg' is not a directory
mv: target '975de97e64c345cbe41532101636c70e.gif' is not a directory
mv: target 'c3a691daa3400f00c87de37703ddd222.jpg' is not a directory
sh: 1: Syntax error: "(" unexpected
sh: 1: Syntax error: "(" unexpected
mv: target 'ce14ef4371c5fe6a61a539a9f22e6227.jpg' is not a directory
Upvotes: 3
Views: 5243
Reputation: 12798
Here's a Python solution. Put this in a Python file in the same directory you want to convert (or modify the '.').
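import hashlib
import os

directory = '.'  # the directory whose files get renamed

def hash_file(fpath):
    # Read the file in binary mode and return its MD5 hex digest
    with open(fpath, 'rb') as f:
        return hashlib.md5(f.read()).hexdigest()

for fname in os.listdir(directory):
    src = os.path.join(directory, fname)
    # Skip subdirectories and this script itself
    if not os.path.isfile(src) or os.path.abspath(src) == os.path.abspath(__file__):
        continue
    ext = os.path.splitext(fname)[1]
    dst = os.path.join(directory, hash_file(src) + ext)
    print(src + " --> " + dst)
    os.rename(src, dst)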
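If some of the pictures are large, reading each one fully into memory may be wasteful. A minimal sketch of hashing in fixed-size chunks instead; the helper name `md5_of_file` and the 64 KiB chunk size are illustrative assumptions, not part of the script above.

```python
import hashlib

def md5_of_file(path, chunk_size=65536):
    """Return the MD5 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel stops once read() returns b"" at end of file
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result should match what a whole-file read produces, so it could stand in for `hash_file` without changing the generated names.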
Upvotes: 0
Reputation: 11
Rename all files (not directories) in the current directory to their MD5 sum plus a .jpg extension:
md5sum * | awk '{print "mv", $2, $1 ".jpg" }' | bash
or
md5sum -- * | awk '{print "mv --", $2, $1 ".jpg" }' | bash
if file names start with a dash.
Upvotes: 1
Reputation: 14883
Part A - What you've seen
Or GNU sed can do it even shorter:
# md5sum * | sed -e 's/\([^ ]*\) \(.*\(\..*\)\)$/mv -v \2 \1\3/e'
I personally hate using sed for these cases. If I were given this to review in professional code, I would reject it because it is so hard for future readers to understand.
Sed is a stream editor. You feed text into it, it edits that text, and then it pushes out the result. It uses regular expressions to match patterns in its input and then does something with them. Regular expressions are very hard to read, even when you've worked with them for a while, so I don't expect many people to be able to read the above code. People tend to use sed because it can do a lot with very little code.
Sed has a lot of party tricks, and in this case it's being used to execute other commands (mv).
md5sum * is producing output like this:
263620ac1a08b934b5312f416fe7a1af IMAG0001.jpg
972eddbf8e368a9c3d38e66bcf924cbc IMAG0002.jpg
94b30dfedb8afb7143268d1c329d7e64 IMAG0004.jpg
c592b83172e7f3c2d20207ee4e0cdd0d IMAG0005.jpg
1bc861c1251d87aea5e98ff263e09e79 IMAG0223.jpg
560afa8d60ff833a9dee52eff2fc420b IMAG0224.jpg
Sed is then editing that to look like this:
mv -v IMAG0001.jpg 263620ac1a08b934b5312f416fe7a1af.jpg
mv -v IMAG0002.jpg 972eddbf8e368a9c3d38e66bcf924cbc.jpg
mv -v IMAG0004.jpg 94b30dfedb8afb7143268d1c329d7e64.jpg
mv -v IMAG0005.jpg c592b83172e7f3c2d20207ee4e0cdd0d.jpg
mv -v IMAG0223.jpg 1bc861c1251d87aea5e98ff263e09e79.jpg
mv -v IMAG0224.jpg 560afa8d60ff833a9dee52eff2fc420b.jpg
Sed is then executing this code.
Now that you understand that, you can most likely pick out the mv -v and swap it for cp or another command. But you are still going to have problems with spaces and special characters.
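The `mv: target ... is not a directory` messages in the question are consistent with file names containing spaces being pasted unquoted into a generated command, and a real shell would additionally reject unquoted parentheses with a syntax error. A small Python sketch of the word-splitting effect, using the standard `shlex` module; the file name `my cat (1).jpg` and the helper `build_mv_command` are made up for illustration.

```python
import shlex

def build_mv_command(src, dst):
    """Build an mv command line in which both names are safely quoted."""
    return "mv -v " + shlex.quote(src) + " " + shlex.quote(dst)

name = "my cat (1).jpg"  # hypothetical file name with a space and parentheses
target = "9c48b6846aa3211ba867d9775aa9a730.jpg"

# Pasted in unquoted, the name falls apart into several arguments,
# so mv would treat the last argument as a target directory.
print(shlex.split("mv -v " + name + " " + target))

# Quoted, the name stays a single argument.
print(shlex.split(build_mv_command(name, target)))
```

Note that `shlex.split` only models word splitting; it does not reproduce the shell's handling of parentheses.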
Part B - A more robust solution
I would avoid sed altogether. If you don't understand it, then don't use it. Most people don't understand it.
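for file in *.jpg
do
    # md5sum prints the hash, then whitespace, then the file name
    sum=$(md5sum "$file")
    # remove the file name from md5sum's output by stripping
    # everything from the first space onward
    # this is using the shell's pattern matching but can be swapped out
    sum="${sum%% *}"
    cp "$file" "HashPictures/$sum"
done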
Notice that I have put quotes around both $file and $sum. Also, each command handles exactly one file; we never build one command that covers every file. This way spaces in file names are never mixed up with the spaces used to split command arguments.
Part C - Final Thoughts
For this example code I've used cp to put a copy of the image in a new directory. That might not be what you want. For example, use ln -s "$PWD/$file" "HashPictures/$sum" to create a symbolic link instead. The absolute path matters here, because a relative link target is resolved from the directory that holds the link. This avoids the need to duplicate the files and saves a lot of space.
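A rough Python sketch of the same symlink idea, for comparison. It links to the absolute path of the source so the link still resolves from inside the destination directory. `link_by_hash` is a hypothetical helper, and it assumes the destination directory already exists and does not yet contain a link with that hash.

```python
import hashlib
import os

def link_by_hash(src, dest_dir):
    """Create dest_dir/<md5 of src> as a symbolic link pointing at src."""
    with open(src, "rb") as f:
        digest = hashlib.md5(f.read()).hexdigest()
    link_path = os.path.join(dest_dir, digest)
    # Use the absolute path so the link resolves from inside dest_dir
    os.symlink(os.path.abspath(src), link_path)
    return link_path
```

Two files with identical content produce the same hash, so the second call would raise `FileExistsError`; whether to skip or overwrite in that case is left open here.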
Upvotes: 2
Reputation: 241828
Perl to the rescue:
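#!/usr/bin/perl
use warnings;
use strict;
use Digest::MD5 qw{ md5_hex };

my ($source, $target) = @ARGV;
# Escape whitespace so glob treats the directory as a single pattern
$source =~ s/(\s)/\\$1/g;

for my $file (glob "$source/*") {
    next unless -f $file;    # skip subdirectories
    # Read and write in raw mode: images are binary data
    open my $fh, '<:raw', $file or die "$file: $!";
    my $content = do { local $/; <$fh> };
    my $digest = md5_hex($content);
    # Take the extension after the last dot of the file name
    my ($extension) = $file =~ /\.([^.\/]*)$/;
    open my $out, '>:raw', "$target/$digest.$extension" or die "$target/$digest.$extension: $!";
    print {$out} $content;
    close $out;
}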
Run as
perl script-name "source-dir" "target-dir"
Upvotes: 0
Reputation: 2356
I find this easier to read and follow:
#!/bin/bash
source_dir=/home/hermit/Documents/Pictures
destination_dir=/home/hermit/Documents/HashPictures

for file in "${source_dir}"/*; do
    hash=$(md5sum "${file}" | cut -d' ' -f1)
    ext=${file##*.}
    cp -v "$file" "${destination_dir}/${hash}.${ext}"
done
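For readers who prefer Python, the same loop might look roughly like the sketch below. The function name `copy_with_hash_names` is an assumption for illustration. Unlike the shell version, it skips subdirectories and leaves the name as the bare hash when a file has no extension.

```python
import hashlib
import os
import shutil

def copy_with_hash_names(source_dir, destination_dir):
    """Copy each regular file in source_dir to destination_dir as <md5><ext>."""
    os.makedirs(destination_dir, exist_ok=True)
    copied = {}
    for entry in sorted(os.listdir(source_dir)):
        src = os.path.join(source_dir, entry)
        if not os.path.isfile(src):
            continue  # skip subdirectories and other non-files
        with open(src, "rb") as f:
            digest = hashlib.md5(f.read()).hexdigest()
        ext = os.path.splitext(entry)[1]  # "" when there is no extension
        dst = os.path.join(destination_dir, digest + ext)
        shutil.copy2(src, dst)  # the original stays where it is
        copied[entry] = dst
    return copied
```

It could be called with the two directories from the script above, for example `copy_with_hash_names("/home/hermit/Documents/Pictures", "/home/hermit/Documents/HashPictures")`. File names are passed as single strings and never go through a shell, so spaces and parentheses need no quoting.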
Upvotes: 2