Billy Fung

Reputation: 33

Diff files in two folders ignoring the first line

I have two folders of files that I want to diff, except I want to ignore the first line in all the files. I tried

  diff -Nr <(tail -n +1 folder1/) <(tail -n +1 folder2/) 

but that clearly isn't the right way.

Upvotes: 3

Views: 2176

Answers (2)

ruakh

Reputation: 183300

If the first lines that you want to ignore have a distinctive format that can be matched by a POSIX regular expression, then you can use diff's --ignore-matching-lines=... option to tell it to ignore those lines.
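
As a rough illustration, suppose every file starts with a header such as "# Generated: ..." (a made-up format; substitute whatever your first lines actually look like). Two caveats: the pattern is matched against every line, not only the first one, and diff ignores a hunk only when all of its changed lines match. A small demo in a scratch directory might look like this:

```shell
# Scratch data; the "# Generated:" header and file names are hypothetical.
demo=$(mktemp -d)
mkdir "$demo/folder1" "$demo/folder2"
printf '# Generated: Monday\nalpha\nbeta\n'  > "$demo/folder1/a.txt"
printf '# Generated: Tuesday\nalpha\nbeta\n' > "$demo/folder2/a.txt"

# Only the header line differs, and it matches the pattern,
# so diff prints nothing for this pair of folders.
diff -Nr --ignore-matching-lines='^# Generated:' "$demo/folder1" "$demo/folder2"
```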

Failing that, the approach you want to take probably depends on your exact requirements. You say you "want to diff" the files, but it's not obvious exactly how faithfully your resulting output needs to match what you would get from diff -Nr if it supported that feature. (For example, do you need the line numbers in the diff to correctly identify the line numbers in the original files?)

The most precisely faithful approach would probably be as follows:

  • Copy each directory to a fresh location, using cp --recursive ....
  • Edit the first line of each file to prepend a magic string like IGNORE_THIS_LINE::, using something like find -type f -exec sed -i '1 s/^/IGNORE_THIS_LINE::/' '{}' ';'.
  • Use diff -Nr --ignore-matching-lines=^IGNORE_THIS_LINE:: ... to compare the results.
    • Pipe the output to sed s/IGNORE_THIS_LINE:://, so as to filter out any occurrences of IGNORE_THIS_LINE:: that still show up (due to being within a few lines of non-ignored differences).
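
The steps above could be strung together roughly as follows. This is only a sketch: it assumes bash and the GNU versions of cp, find, sed and diff, and the function name diff_skip_first_line is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch of the copy / mark / diff / clean-up steps.
# diff_skip_first_line is a hypothetical helper name, not an existing tool.
diff_skip_first_line() {
    local a=$1 b=$2 marker='IGNORE_THIS_LINE::' tmp
    tmp=$(mktemp -d) || return 2
    # 1. Work on copies so the original folders stay untouched.
    cp --recursive -- "$a" "$tmp/a"
    cp --recursive -- "$b" "$tmp/b"
    # 2. Prepend the marker to line 1 of every regular file in the copies.
    find "$tmp/a" "$tmp/b" -type f -exec sed -i "1 s/^/$marker/" '{}' ';'
    # 3. Compare; hunks whose changed lines all start with the marker are ignored.
    # 4. Strip markers that still leak into the output next to real differences.
    (cd "$tmp" && diff -Nr --ignore-matching-lines="^$marker" a b) | sed "s/$marker//"
    local status=${PIPESTATUS[0]}   # exit status of diff, not of sed
    rm -rf -- "$tmp"
    return "$status"
}
```

Called as diff_skip_first_line folder1 folder2, it reports paths as a/... and b/... (the temporary copies) rather than the original folder names.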

Upvotes: 3

Jürgen Hötzel

Reputation: 19727

Using process substitution is the correct way to create intermediate input file descriptors, but tail doesn't work on folders. Instead, iterate over the files in the folder:

for f in folder1/*.txt; do
    # Quote the expansions so file names containing spaces still work.
    tail -n +2 "$f" | diff - <(tail -n +2 "folder2/$(basename "$f")")
done

Note that I used +2 instead of +1: tail numbers lines starting at 1, not 0, so -n +1 would output the whole file.
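
A quick way to see the difference, using three throwaway lines of sample input:

```shell
# -n +1 starts output at line 1, so nothing is skipped.
printf 'first\nsecond\nthird\n' | tail -n +1

# -n +2 starts output at line 2, dropping only the first line.
printf 'first\nsecond\nthird\n' | tail -n +2
```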

Upvotes: 2
