Helium

Reputation: 59

Generate the md5sum for each file listed in a CSV file and compare it with the md5sum already given in that file

I am a beginner with Linux, still learning, and I am stuck on a task.

I have a CSV file with a few columns.

Column 1 - File name,

Column 2 - Path to the file,

Column 3 - md5sum value of that file (for each row).

My goal is to generate the md5sum of each file (column 1) at the path given in column 2, and then compare it with the md5sum already present in the CSV file (column 3). This should be done for every row in the CSV file, ignoring the first three rows (headers).

Example

cat Sample.csv

header1 
header2
file,pathTofile,md5sum
script.sh,/c/folder,987fg98df7g9df7g94353454
another.sh,/c/training/folder,54657981sdssgs654643535

Output (assuming row 1 has the correct md5sum value and row 2 does not):

md5sum is a match for script.sh
md5sum is not a match for another.sh

Thanks in advance

Upvotes: 0

Views: 521

Answers (1)

Socowi

Reputation: 27255

From man md5sum

-c, --check
read MD5 sums from the FILEs and check them

Here the FILE has the same format as md5sum's output:

bb8c5900589a82f48e15c2688670de39  file1
f23d2d7f519425c547d9e4287940ef72  /path/to/file2
...

So you can rearrange your CSV file into the same format and then run md5sum -c:

awk -F, 'NR>3 {print $3"  "$2"/"$1}' Sample.csv | md5sum -c

NR>3 skips over your header. If your example isn't accurate, make sure to replace 3 with the actual number of header lines.
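If you want the exact wording from the question instead of md5sum's own report, a loop along these lines might work. This is only a sketch: check_csv is a made-up helper name, and it assumes the fields contain no commas or quotes and that the checksums in the CSV are lowercase hex, as md5sum prints them. Note that a real MD5 checksum is 32 hexadecimal digits, so the placeholder values in the question's sample would never match.

```shell
#!/bin/sh
# check_csv is a hypothetical helper, not part of md5sum.
# Usage: check_csv FILE.csv [NUMBER_OF_HEADER_LINES]   (header count defaults to 3)
check_csv() {
    local csv="$1"
    local skip="${2:-3}"
    # tail -n +N starts printing at line N, which skips the header lines.
    # tr -d '\r' removes carriage returns in case the CSV has Windows line endings.
    tail -n +"$((skip + 1))" "$csv" | tr -d '\r' |
    while IFS=, read -r name dir expected || [ -n "$name" ]; do
        # md5sum prints "<hash>  <file>"; keep only the hash.
        # A missing or unreadable file leaves $actual empty, which counts as a mismatch.
        actual=$(md5sum "$dir/$name" 2>/dev/null | cut -d' ' -f1)
        if [ -n "$actual" ] && [ "$actual" = "$expected" ]; then
            echo "md5sum is a match for $name"
        else
            echo "md5sum is not a match for $name"
        fi
    done
}
```

Calling check_csv Sample.csv prints one "md5sum is a match for ..." or "md5sum is not a match for ..." line per data row. The `|| [ -n "$name" ]` part is there so that a last row without a trailing newline is still processed.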

The output of md5sum -c looks like

file1: OK
/path/to/file2: OK
some/corrupted/file: FAILED
file4: OK
...
md5sum: WARNING: 1 computed checksum did NOT match
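If you only care about the failures, GNU md5sum also has a --quiet option that suppresses the OK lines, and md5sum -c exits with a non-zero status when any check fails. A small sketch combining both, where verify_quiet is a hypothetical wrapper name and the header count is passed in as an awk variable:

```shell
#!/bin/sh
# verify_quiet is a hypothetical wrapper, not part of md5sum.
# Usage: verify_quiet FILE.csv [NUMBER_OF_HEADER_LINES]   (header count defaults to 3)
verify_quiet() {
    # Rearrange "file,path,md5sum" into md5sum's "<hash>  <path>/<file>" format.
    # --quiet hides the "OK" lines, so only failures are printed.
    # The exit status of the pipeline is the exit status of md5sum -c.
    if awk -F, -v skip="${2:-3}" 'NR > skip+0 {print $3"  "$2"/"$1}' "$1" |
        md5sum -c --quiet; then
        echo "all checksums match"
    else
        echo "at least one checksum failed"
    fi
}
```

The else branch is also reached when md5sum cannot read a listed file or finds no properly formatted checksum lines, so it means "not everything verified" rather than strictly "a checksum differed".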

Upvotes: 2
