pst

Reputation: 92

Getting the line number of a string

Suppose I have a very long string taken from a file:

with open(filename, 'r') as lf:
    text = lf.read()  # read() returns one string; readlines() would return a list
   

or

lineList = [line.strip() for line in open(filename)]
text = '\n'.join(lineList)

How can I find the line number of a specific regular expression match in this string (in this case, the line number of 'match')?

regex = re.compile(somepattern)
for match in re.findall(regex, text):
    continue

Thank you in advance for your time.

Edit: I forgot to add that the pattern we are searching for spans multiple lines, and I am interested in the line where the match starts.

Upvotes: 0

Views: 2599

Answers (2)

Daweo

Reputation: 36370

We need re.Match objects rather than plain strings, so use re.finditer instead of re.findall; a match object carries the starting position of the match. Consider the following example: say I want to find every pair of digits where one sits immediately before a newline (\n) and the other immediately after it:

import re
lineList = ["123","456","789","ABC","XYZ"]
text = '\n'.join(lineList)
for match in re.finditer(r"\d\n\d", text, re.MULTILINE):
    start = match.span()[0]  # .span() gives tuple (start, end)
    line_no = text[:start].count("\n")
    print(line_no)

Output:

0
1

Explanation: once I have the starting position, I count the newlines before it, which equals the line number of the line where the match starts. Note: I assumed line numbers start from 0; add 1 if you want them to start from 1.
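If the text is long and there are many matches, re-counting newlines from the beginning for every match gets repetitive. A possible variation, only as a sketch: collect the newline offsets once and look up each match start with bisect. The helper name match_line_numbers is made up for this illustration, and it reports 1-based line numbers, unlike the example above.

```python
import re
from bisect import bisect_left


def match_line_numbers(pattern, text):
    """Return (line_number, matched_text) pairs with 1-based line numbers.

    Hypothetical helper for illustration; not part of the re module.
    """
    # Offsets of every newline character, collected once up front.
    newline_offsets = [m.start() for m in re.finditer(r"\n", text)]
    results = []
    for match in re.finditer(pattern, text):
        # bisect_left counts the newlines located strictly before the
        # match start; adding 1 turns that count into a 1-based line number.
        line_no = bisect_left(newline_offsets, match.start()) + 1
        results.append((line_no, match.group(0)))
    return results


text = "\n".join(["123", "456", "789", "ABC", "XYZ"])
print(match_line_numbers(r"\d\n\d", text))
```

With the same sample data as above this reports lines 1 and 2, i.e. the same lines as before, just counted from 1.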

Upvotes: 2

Ionut Hulub

Reputation: 1

Perhaps something like this:

import re

with open(filename, 'r') as lf:
    text_lines = lf.readlines()

regex = re.compile(somepattern)
# enumerate() starts at 0, so line_number is 0-based
for line_number, line in enumerate(text_lines):
    for match in regex.findall(line):
        print('Match found on line %d: %s' % (line_number, match))
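One caveat, given the edit in the question: a loop that searches each line on its own cannot find a pattern that spans several lines. A small sketch of the difference, using made-up sample text and a made-up pattern in place of the file and somepattern:

```python
import re

# Hypothetical sample data standing in for the file contents.
text = "alpha\nbegin\nend\nomega\n"
pattern = re.compile(r"begin\nend")

# Per-line scan: every line is searched separately, so a match that
# crosses a line boundary can never be found.
per_line_hits = [
    (line_number, found)
    for line_number, line in enumerate(text.splitlines(keepends=True))
    for found in pattern.findall(line)
]

# Whole-text scan: search the full string, then count the newlines
# before each match start to get its 0-based starting line.
whole_text_hits = [
    (text.count("\n", 0, match.start()), match.group(0))
    for match in pattern.finditer(text)
]

print(per_line_hits)
print(whole_text_hits)
```

The per-line list stays empty, while the whole-text scan finds the match starting on line 1 (0-based).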

Upvotes: 0
