Yaroslav Bulatov

Reputation: 57973

Getting line numbers that were changed

Given two text files A and B, what is an easy way to get the line numbers of the lines in B that are not present in A? I see there's difflib, but I don't see an interface for retrieving line numbers.

Upvotes: 6

Views: 4041

Answers (2)

bgporter

Reputation: 36564

difflib can give you what you need. Assume:

a.txt

this 
is 
a 
bunch 
of 
lines

b.txt

this 
is 
a 
different
bunch 
of 
other
lines

code like this:

import difflib

fileA = open("a.txt", "rt").readlines()
fileB = open("b.txt", "rt").readlines()

d = difflib.Differ()
diffs = d.compare(fileA, fileB)
lineNum = 0

for line in diffs:
    # split off the two-character difflib code
    code = line[:2]
    # if the line is in both files or just in b, increment the line number
    if code in ("  ", "+ "):
        lineNum += 1
    # if this line is only in b, print the line number and the text on the line
    if code == "+ ":
        print("%d: %s" % (lineNum, line[2:].strip()))

gives output like:

bgporter@varese ~/temp:python diffy.py 
4: different
7: other

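You'll also want to look at the difflib code "? " and decide how you want to handle it. Those lines are hints that mark intraline differences; they are not present in either file, so the loop above skips them without counting them.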

(Also, in real code you'd want to use context managers to make sure the files get closed.)
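If you would rather not parse the two-character prefixes, a possible alternative is difflib.SequenceMatcher, whose get_opcodes() method reports index ranges directly. The following is only a sketch: the function name added_line_numbers and the in-memory sample lists are made up for illustration, and it treats both "insert" and "replace" blocks as lines that are new in B.

```python
import difflib

def added_line_numbers(lines_a, lines_b):
    """Return 1-based line numbers in lines_b that the diff marks as inserted or replaced."""
    # autojunk=False turns off the popularity heuristic so that frequently
    # repeated lines are still considered for matching.
    matcher = difflib.SequenceMatcher(None, lines_a, lines_b, autojunk=False)
    numbers = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # j1:j2 is a half-open range of 0-based indices into lines_b
        if tag in ("replace", "insert"):
            numbers.extend(range(j1 + 1, j2 + 1))
    return numbers

# Hypothetical stand-ins for the contents of a.txt and b.txt
a = ["this\n", "is\n", "a\n", "bunch\n", "of\n", "lines\n"]
b = ["this\n", "is\n", "a\n", "different\n", "bunch\n", "of\n", "other\n", "lines\n"]
print(added_line_numbers(a, b))  # [4, 7]
```

With real files you would pass in the lists returned by readlines() instead of the literals.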

Upvotes: 12

Prashant Kumar

Reputation: 22669

A poor man's solution:

with open('A.txt') as f:
    linesA = f.readlines()

with open('B.txt') as f:
    linesB = f.readlines()

# note: enumerate() counts from 0, so these are 0-based indices into B
print([k for k, v in enumerate(linesB) if v not in linesA])
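For larger files, the same membership idea could be sketched with a set, so that each lookup does not rescan all of A, and with 1-based numbering to match what an editor shows. The function name lines_not_in and the sample lists are assumptions for illustration. Keep in mind this is a membership test, not a diff: a line of B that appears anywhere in A is never reported, even if it was moved or duplicated.

```python
def lines_not_in(lines_a, lines_b):
    """Return 1-based line numbers of lines in lines_b that appear nowhere in lines_a."""
    seen = set(lines_a)  # average O(1) membership checks
    return [n for n, line in enumerate(lines_b, start=1) if line not in seen]

# Hypothetical stand-ins for the contents of A.txt and B.txt
sample_a = ["this\n", "is\n", "a\n", "bunch\n", "of\n", "lines\n"]
sample_b = ["this\n", "is\n", "a\n", "different\n", "bunch\n", "of\n", "other\n", "lines\n"]
print(lines_not_in(sample_a, sample_b))  # [4, 7]
```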

Upvotes: 0
