Regex grep file contents and invoke command

Question

I have a generated file containing MD5 hashes along with filenames. I want to remove the listed files from the directory they are in, but I'm not sure how to go about it.

filelist (file) contains:

MD5 (dupe) = 1fb218dfef4c39b4c8fe740f882f351a
MD5 (somefile) = a5c6df9fad5dc4299f6e34e641396d38

My command (which I would like to combine with rm) looks like this:

grep -o "\((.*)\)" filelist

returns this:

(dupe)
(somefile)

This is almost right, but the parentheses need to be removed (I'm not sure how). I tried a lookbehind/lookahead with grep -Po "(?<=\().*(?=\))" filelist, but the command didn't work.

The next thing I would like to do is take the output filenames and delete them from the directory they are in. I'm not sure how to script it, but it would essentially do:


rm dupe $target
rm somefile $target

grebneke · Accepted Answer

If I understand correctly, you want to take lines like these

MD5 (dupe) = 1fb218dfef4c39b4c8fe740f882f351a
MD5 (somefile) = a5c6df9fad5dc4299f6e34e641396d38

extract the second column without the parentheses to get the filenames

dupe
somefile

and then delete the files?

Assuming the filenames don't have spaces, try this:

# this is where your duplicate files are.
dupe_directory='/some/path'

# Check that you found the right files:
awk '{print $2}' file-with-md5-lines.txt | tr -d '()' | xargs -I{} ls -l "$dupe_directory/{}"

# Looks ok, delete:
awk '{print $2}' file-with-md5-lines.txt | tr -d '()' | xargs -I{} rm -v "$dupe_directory/{}"

xargs -I{} tells xargs to replace each {} in the command with the input line (here, the dupe filename), so the filename can be placed anywhere in a more complex command.
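If the filenames might contain spaces or parentheses, the awk/tr approach above will mangle them. Here is a minimal sketch of one alternative, assuming every line has exactly the form MD5 (name) = followed by 32 lowercase hex digits, as in the sample. The function names extract_names and apply_to_listed are hypothetical helpers made up for this example, not standard tools.

```shell
# Print the filename from each "MD5 (name) = <32 hex digits>" line.
# Anchoring on the trailing hash keeps spaces and parentheses in the name.
extract_names() {
    sed -n 's/^MD5 (\(.*\)) = [0-9a-f]\{32\}$/\1/p' "$1"
}

# Hypothetical helper: run a command once per listed file.
# Usage: apply_to_listed LIST DIR CMD [ARGS...]
apply_to_listed() {
    list=$1
    dir=$2
    shift 2
    extract_names "$list" | while IFS= read -r name; do
        # </dev/null stops the command from consuming the list on stdin
        "$@" "$dir/$name" </dev/null
    done
}
```

As in the answer, check first with apply_to_listed filelist /some/path ls -l, then delete with apply_to_listed filelist /some/path rm -v. Lines that don't match the expected format are skipped rather than guessed at.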
