Reputation: 189726
I'm trying to get unified diffs for many pairs of directories so I can check that the differences are consistent from pair to pair, and I want to know if there's a way to get diff to format its output with relative rather than absolute paths.
Right now, if I use diff -r -u PATH1 PATH2, I get this kind of output:
diff -r -u PATH1/some/subfile.txt PATH2/some/subfile.txt
--- PATH1/some/subfile.txt Tue Feb 07 09:16:31 2017
+++ PATH2/some/subfile.txt Tue Feb 07 09:16:32 2017
@@ -70,7 +70,7 @@
*
* some stuff
*
- * I am Tweedledee and you are not
+ * I am Tweedledum and you are not
*/
void twiddle(void)
{
@@ -88,7 +88,7 @@
* check whether we should destroy everything
* and then destroy everything in either case
*/
-inline static void Tweedledee(void)
+inline static void Tweedledum(void)
{
if (should_destroy_everything())
{
I would rather get just the relative paths. Is there any way to get diff to do this? For example:
diff -r -u PATH1/some/subfile.txt PATH2/some/subfile.txt
--- some/subfile.txt
+++ some/subfile.txt
@@ -70,7 +70,7 @@
*
* some stuff
*
- * I am Tweedledee and you are not
+ * I am Tweedledum and you are not
*/
void twiddle(void)
{
@@ -88,7 +88,7 @@
* check whether we should destroy everything
* and then destroy everything in either case
*/
-inline static void Tweedledee(void)
+inline static void Tweedledum(void)
{
if (should_destroy_everything())
{
This would make it easier to compare diff reports that are expected to be the same. (In my case PATH1 and PATH2 differ for each pair, whereas the relative paths to the files and the exact content differences are the same.)
Otherwise I have to filter this information out, either manually or with a script.
Upvotes: 0
Views: 626
Reputation: 189726
I bit the bullet and did this parsing in Python. It removes the "diff blah blah blah" command lines, relativizes the paths to a pair of specified root directories, and removes the timestamps:
import os
import re

# Hunk header, e.g. '@@ -70,7 +70,7 @@'; diff omits a count when it is 1
udiff_re = re.compile(r'^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@$')
# File header: '---' or '+++', the path, then a timestamp such as 'Tue Feb 07 09:16:31 2017'
diffitem_re = re.compile(r'^(\+\+\+|---) (.*)\s+.{7} \d\d \d\d:\d\d:\d\d \d{4}$')

def process_diff_output(output, dir1, dir2):
    state = 0               # 0 = expecting a header line, 1 = inside a hunk
    lines_pending = [0, 0]  # hunk lines still expected from [old file, new file]
    result = []
    for line in output.splitlines():
        if line.startswith('\\'):
            # '\ No newline at end of file' belongs to neither side's count
            result.append(line)
        elif state == 0:
            if line.startswith('@@'):
                m = udiff_re.search(line)
                if not m:
                    raise ValueError('Huh?\n' + line)
                result.append(line)
                lines_pending = [int(m.group(2) or 1), int(m.group(4) or 1)]
                state = 1
            elif line.startswith('--- ') or line.startswith('+++ '):
                m = diffitem_re.search(line)
                if not m:
                    raise ValueError('unparseable file header\n' + line)
                whichone = m.group(1)
                filename = m.group(2)
                dirx = dir1 if whichone == '---' else dir2
                result.append('%s %s' % (whichone, os.path.relpath(filename, dirx)))
            elif line.startswith('diff '):
                pass  # leave the diff cmd out
            else:
                raise ValueError('unknown header line\n' + line)
        elif state == 1:
            result.append(line)
            if line.startswith('+'):
                lines_pending[1] -= 1
            elif line.startswith('-'):
                lines_pending[0] -= 1
            else:
                # context line: present in both files
                lines_pending[0] -= 1
                lines_pending[1] -= 1
            if lines_pending == [0, 0]:
                state = 0
    return '\n'.join(result)
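If shelling out to diff isn't a hard requirement, one alternative is to build the report in Python with difflib, which lets you choose the '---'/'+++' labels yourself and adds no timestamps unless asked. This is only a rough sketch: relative_unified_diff is a hypothetical helper name, it only looks at files that exist in both trees, it reads everything as text, and its hunks are not guaranteed to match GNU diff's byte for byte. Files that lack a trailing newline would also need extra handling.

```python
import difflib
import os

def relative_unified_diff(dir1, dir2):
    """Unified diff of files present in both trees, labelled by relative path."""
    chunks = []
    for root, dirs, files in os.walk(dir1):
        dirs.sort()  # walk in a stable order so reports are comparable
        for name in sorted(files):
            path1 = os.path.join(root, name)
            rel = os.path.relpath(path1, dir1)
            path2 = os.path.join(dir2, rel)
            if not os.path.isfile(path2):
                continue  # files missing from dir2 are skipped in this sketch
            with open(path1, errors='replace') as f1:
                old_lines = f1.readlines()
            with open(path2, errors='replace') as f2:
                new_lines = f2.readlines()
            # fromfile/tofile become the '---'/'+++' labels; identical files yield nothing
            chunks.extend(difflib.unified_diff(old_lines, new_lines,
                                               fromfile=rel, tofile=rel))
    return ''.join(chunks)
```

Two reports produced this way for different directory pairs can then be compared directly as strings.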
Upvotes: 0
Reputation: 469
I would pipe the output of your diff command through a sed script, something like this:
$ diff -r -u PATH1/some/subfile.txt PATH2/some/subfile.txt | sed '1s/PATH1\///' | sed '2s/PATH2\///'
The script says: on line 1, replace "PATH1" followed by a single forward slash with nothing; then, on line 2, replace "PATH2" followed by a single forward slash with nothing. I'd have to create some content to test it, so I haven't tested it.
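With diff -r the '---'/'+++' header pair repeats for every file, so substitutions tied to lines 1 and 2 would only catch the first pair. The same prefix-stripping idea could be applied to every header line with a few lines of Python. This is a minimal, lightly tested sketch: strip_roots is a made-up name, it assumes the file name and timestamp are separated by a tab (as GNU diff does), and it would also rewrite a removed content line that happens to begin with '-- PATH1/'.

```python
def strip_roots(diff_text, root1, root2):
    """Drop root1/root2 prefixes and timestamps from '---'/'+++' header lines."""
    out = []
    for line in diff_text.splitlines():
        if line.startswith('diff '):
            continue  # the echoed command names both roots, so drop it
        for marker, root in (('--- ', root1), ('+++ ', root2)):
            prefix = marker + root.rstrip('/') + '/'
            if line.startswith(prefix):
                # assumption: a tab separates the file name from the timestamp
                name = line[len(prefix):].split('\t')[0]
                line = marker + name
                break
        out.append(line)
    return '\n'.join(out)
```

It would be called on the captured output of diff -r -u PATH1 PATH2, for example strip_roots(output, 'PATH1', 'PATH2').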
Upvotes: 1