Cupple Kay

Reputation: 155

Bash get md5sum of File not path as String

I'm trying to save the md5sum of multiple java files in a text file, but as I see it, it creates an md5sum of the path and not the file itself.

find $FilesDirectory -iregex '.*\.java' | while read line; do

if [ -f "$line" ] 
    then
        echo -n $line | md5sum.exe | cut -d' ' -f1 | tr -d '\n' >> $FileName
        echo -n "-" >> $FileName
        echo -n $line | cut -d' ' -f2 >> $FileName
    fi
done

I'm also trying to eliminate any newline except for the last one. When I changed the path for the md5sum to a file that didn't exist it still made an md5sum. (I am using MINGW Shell)

Upvotes: 0

Views: 968

Answers (2)

tripleee

Reputation: 189377

Indeed, md5sum without a file name reads its data (not its arguments) from standard input, and calculates a checksum for that.
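To make the difference concrete, here is a minimal sketch that hashes the same path three ways. The temporary directory and the file name Example.java are made up for illustration, and it assumes a GNU-style md5sum that prints the checksum as the first space-separated field.

```shell
# Create a throwaway file to hash (name and contents are hypothetical).
tmpdir=$(mktemp -d)
file="$tmpdir/Example.java"
printf 'class Example {}\n' > "$file"

# Path arrives on stdin, so md5sum hashes the characters of the path.
# This is why a path to a nonexistent file still yields a checksum.
path_sum=$(printf '%s' "$file" | md5sum | cut -d' ' -f1)

# Path is passed as an argument, so md5sum opens the file and hashes its contents.
file_sum=$(md5sum "$file" | cut -d' ' -f1)

# File contents arrive on stdin: same data, same checksum as the line above.
stdin_sum=$(md5sum < "$file" | cut -d' ' -f1)

echo "path string: $path_sum"
echo "file data:   $file_sum"
echo "via stdin:   $stdin_sum"

rm -rf "$tmpdir"
```

The first checksum differs from the other two, which match each other.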

Tangentially, echo filename | xargs md5sum is a workaround if you really need to read arguments from stdin.

But here, you have no reason to want to do that.

find "$FilesDirectory" -type f -iregex '.*\.java' \
    -exec md5sum {} + |
sed 's%  *.*/%-%' >"$FileName"

The -type f replaces if [ -f ... and the -exec ... + runs md5sum on all the found files, passing as many file names as possible to each invocation. Then we simply post-process the output to put a dash instead of the run of spaces after the checksum. The regular expression matches one or more spaces, then any characters through the last slash, and replaces it all with a dash. Thus it also removes the path name.
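You can try the sed expression on its own with a canned line of md5sum output. This is only a sketch: the checksum and paths below are placeholders, and the second sample assumes your md5sum marks binary mode with an asterisk before the name, as some builds do.

```shell
# Typical text-mode output: checksum, two spaces, path (values are illustrative).
text_line='d41d8cd98f00b204e9800998ecf8427e  ./src/com/example/Main.java'

# Binary-mode output: checksum, one space, an asterisk, then the path.
binary_line='d41d8cd98f00b204e9800998ecf8427e *./src/com/example/Main.java'

# "  *" matches one or more spaces; ".*/" greedily matches through the last slash.
text_result=$(printf '%s\n' "$text_line" | sed 's%  *.*/%-%')
binary_result=$(printf '%s\n' "$binary_line" | sed 's%  *.*/%-%')

echo "$text_result"     # d41d8cd98f00b204e9800998ecf8427e-Main.java
echo "$binary_result"   # d41d8cd98f00b204e9800998ecf8427e-Main.java
```

Because .* is greedy, the asterisk in the binary-mode form is swallowed along with the directory part.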

(If you have an old version of find you may have to use -exec md5sum {} \; instead.)

If all the files are in the current directory, no directories match the wildcard (in which case -type f is superfluous above, too), and there are not so many files that the expanded wildcard becomes too long for a single command line ("argument list too long"), you can simply do

md5sum *.[jJ][aA][vV][aA] | ...

If you need to use find and you have subdirectories but don't want it to traverse them, add -maxdepth 1.
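A quick way to check the effect of -maxdepth 1 is a scratch directory with one file at the top level and one in a subdirectory. The directory layout and file names here are invented for the example, and both -maxdepth and -iregex are assumed to be available (they are in GNU find).

```shell
# Build a small tree: one Java file at the top, one nested a level down.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/sub"
touch "$tmpdir/Top.java" "$tmpdir/sub/Nested.java"

# -maxdepth 1 stops find from descending into "$tmpdir/sub".
found=$(find "$tmpdir" -maxdepth 1 -type f -iregex '.*\.java')
echo "$found"

rm -rf "$tmpdir"
```

Only the top-level Top.java is listed; without -maxdepth 1, sub/Nested.java would be reported as well.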

Upvotes: 1

David C. Rankin

Reputation: 84551

If you only need to write the checksum and file name to a file in the format md5sum-filename, you can use parameter expansion (substring removal) to extract the checksum from the md5sum output of each file (wherever it may live) and to remove the path information, leaving only the file name to output:

find "$FilesDirectory" -iregex '.*\.java' | while IFS= read -r line; do

if [ -f "$line" ] 
    then
        md5str="$(md5sum "$line")"
        sum=${md5str%% *}     # checksum: everything before the first space
        sumfn=${line##*/}     # file name: strip everything through the last slash

        echo "$sum-$sumfn" >> "$FileName"
    fi
done
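The substring-removal expansions can be tried on fixed strings without running md5sum at all. In this sketch the checksum and the path are placeholders, not real output.

```shell
# Pretend output of md5sum and the path it was given (both illustrative).
md5str='d41d8cd98f00b204e9800998ecf8427e  /home/user/src/Main.java'
line='/home/user/src/Main.java'

# %% removes the longest suffix matching " *", i.e. from the first space onward.
sum=${md5str%% *}

# ## removes the longest prefix matching "*/", i.e. through the last slash.
name=${line##*/}

echo "$sum-$name"   # d41d8cd98f00b204e9800998ecf8427e-Main.java
```

Taking the name from the path rather than from the md5sum output keeps file names that contain spaces intact.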

Upvotes: 1
