Reputation: 51
I am parsing a text file for the following line:
stp 11441 0 0 0 0
There are always two such lines in the text file. I am looking for the second value on the line (11441 here) and want to save it as a variable.
I have figured out how to do this with only one variable. Here is the code I am using:
import re
with open('cpu.txt', 'r') as file:
    for line in file:
        # raw string so that \d is passed to the regex engine unchanged
        match = re.search(r'stp \d{2,100}', line)
        if match:
            stp_queue1 = match.group().split()[1]
However, I can't work out how to assign a second variable (stp_queue2 in this case) for the second match.
In other words, if the file contains the following two lines:
stp 11441 0 0 0 0
stp 20000 0 0 0 0
then stp_queue1 should be 11441 and stp_queue2 should be 20000.
Could you please help me with that?
Upvotes: 2
Views: 68
Reputation: 1870
If all you need is a list containing the first number following stp, this may be enough:
with open('cpu.txt', 'r') as f:
    stp_queue = [line.split()[1] for line in f]
print(stp_queue)
If you need to check whether the line begins with stp, just add that validation to the comprehension:
with open('cpu.txt', 'r') as f:
    stp_queue = [line.split()[1] for line in f if line.startswith('stp')]
print(stp_queue)
Upvotes: 0
Reputation: 12669
There are several patterns you can use for this problem.
I am showing three of them; choose whichever you prefer.
First pattern:
import re
pattern = r'stp\s(\d+)'
output = []
with open('file.txt', 'r') as f:
    for line in f:
        match = re.search(pattern, line)
        if match:  # skip lines that do not contain an stp entry
            output.append(match.group(1))
print(output)
output:
['11441', '20000']
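Pattern 2 (matches any run of five digits, so it only works if the value always has exactly five digits):
r'[0-9]{5}'
Pattern 3, positive lookbehind (?<=stp\s):
pattern=r'(?<=stp\s)\d+'
Patterns 2 and 3 have no capture group, so read the result with match.group() instead of match.group(1).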
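To make the difference between the three patterns concrete, here is a small sketch that runs each of them against two sample lines. The sample lines and the helper first_match are made up for illustration; they are not from the question.

```python
import re

# Sample lines are assumptions for illustration; the second one is not an stp line.
stp_line = "stp 11441 0 0 0 0"
other_line = "cpu 123456 0 0 0 0"

capture = re.compile(r"stp\s(\d+)")        # pattern 1: capture group
five_digits = re.compile(r"[0-9]{5}")      # pattern 2: any run of five digits
lookbehind = re.compile(r"(?<=stp\s)\d+")  # pattern 3: positive lookbehind


def first_match(regex, line, group=0):
    """Return the requested group of the first match, or None if nothing matches."""
    m = regex.search(line)
    return m.group(group) if m else None


results_stp = (
    first_match(capture, stp_line, group=1),
    first_match(five_digits, stp_line),
    first_match(lookbehind, stp_line),
)
results_other = (
    first_match(capture, other_line, group=1),
    first_match(five_digits, other_line),
    first_match(lookbehind, other_line),
)
print(results_stp)
print(results_other)
```

All three agree on the stp line, but only pattern 2 also matches inside the unrelated line, so patterns 1 and 3 are probably the safer choice when the file contains other numeric lines.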
Upvotes: 2
Reputation: 990
Use groups in your regular expression.
Please read the Python regex documentation.
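As a rough sketch of what using groups could look like here: re.findall with a single capture group returns only the captured text, so the two values can be unpacked straight into two variables. The sample text below is an assumption standing in for the contents of cpu.txt.

```python
import re

# Sample text standing in for the contents of cpu.txt (an assumption).
text = """cpu 1 2 3
stp 11441 0 0 0 0
mem 4 5 6
stp 20000 0 0 0 0
"""

# With exactly one capture group, findall returns a list of the captured strings.
# re.MULTILINE makes ^ match at the start of every line, not just the first.
values = re.findall(r"^stp\s+(\d+)", text, flags=re.MULTILINE)

# Unpacking raises ValueError unless exactly two stp lines were found.
stp_queue1, stp_queue2 = values
print(stp_queue1, stp_queue2)
```

For a real file, text could be obtained with open('cpu.txt').read(), assuming the file is small enough to read in one go.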
Upvotes: -1
Reputation: 4985
If you put them in a list, the order is preserved and lookup is as easy as stp_queue[0]:
import re
stp_queue = []
with open('cpu.txt', 'r') as file:
    for line in file:
        # raw string so that \d is passed to the regex engine unchanged
        match = re.search(r'stp \d{2,100}', line)
        if match:
            stp_queue.append(match.group().split()[1])
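If two named variables are still wanted, the list can be unpacked once it has been filled. This is only a sketch: io.StringIO stands in for the real file so it runs on its own, the capture group and the int conversion are my additions, and the length check is one possible way to handle a file that does not contain exactly two stp lines.

```python
import io
import re

# io.StringIO stands in for open('cpu.txt') so the sketch runs without a file.
sample = io.StringIO("stp 11441 0 0 0 0\nother 1 2 3\nstp 20000 0 0 0 0\n")

stp_queue = []
for line in sample:
    match = re.search(r"stp (\d{2,100})", line)
    if match:
        stp_queue.append(int(match.group(1)))  # store as int rather than str

# Unpack only when the expected number of matches was found.
if len(stp_queue) == 2:
    stp_queue1, stp_queue2 = stp_queue
else:
    raise ValueError(f"expected 2 stp lines, found {len(stp_queue)}")

print(stp_queue1, stp_queue2)
```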
Upvotes: 1
Reputation: 851
You could add your values to a dictionary rather than giving each its own variable. The code below adds each match to a dictionary whose keys are stp_queue# with the number starting at 1.
import re
dictionary = {}
with open('cpu.txt', 'r') as file:
    counter = 1
    for line in file:
        match = re.search(r'stp \d{2,100}', line)
        if match:
            dictionary["stp_queue" + str(counter)] = match.group().split()[1]
            counter += 1  # Python has no ++ operator
print(dictionary)
Then, to extract the data, dictionary["stp_queue1"] will return the value stored for the first match found.
More on dictionaries here: https://docs.python.org/2/tutorial/datastructures.html#dictionaries
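The same dictionary can also be built without a manual counter. This is a minimal sketch assuming Python 3.6+ (for f-strings) and a made-up list of lines in place of the file; enumerate(..., start=1) supplies the 1-based numbering.

```python
import re

# Assumed sample lines standing in for the contents of cpu.txt.
lines = ["stp 11441 0 0 0 0", "idle 5 0 0 0 0", "stp 20000 0 0 0 0"]

# Keep only the captured number from lines that actually match.
matches = (re.search(r"stp (\d{2,100})", line) for line in lines)
values = [m.group(1) for m in matches if m]

# enumerate(..., start=1) provides the suffix for each key.
dictionary = {f"stp_queue{i}": v for i, v in enumerate(values, start=1)}
print(dictionary)
```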
Upvotes: 1