Reputation: 51
I have a large txt file in which I want to locate a particular set of strings and extract the numbers that follow them. For example:
26.08.15 14:52:04 Pressure 1.02 Temperature 32.5 NOb 10993 VB 28772
.... <other stuff>
26.08.15 14:53:06 Pressure 1.03 Temperature 31.6 NOb 10993 VB 28008
.... <other stuff>
etc.
I want to be able to find a string such as "Temperature" and extract the numerical value that follows it. I've seen examples that tell me whether the string exists, but nothing that tells me where it is or how to index the info that follows it. Is this something that can be done in Python?
Upvotes: 2
Views: 3180
Reputation: 2520
I hate regular expressions, so here is a pure Python solution.
lines = "26.08.15 14:52:04 Pressure 1.02 Temperature 32.5 NOb 10993 VB 28772 .... 26.08.15 14:53:06 Pressure 1.03 Temperature 31.6 NOb 10993 VB 28008 ...."
# split on whitespace into a list of words
lines = lines.split()
for n, word in enumerate(lines):
    if word in ['Temperature', 'Pressure']:
        # the value is the word immediately after the keyword
        print(word, lines[n+1])
Upvotes: 2
Reputation: 520
This could be achieved by manually reading the file word by word, or by using Python's regular expressions. In my opinion, using regular expressions leads to more concise code without loss of readability, so I'll focus on that solution.
From the Python documentation for the re module (https://docs.python.org/3/library/re.html):
(?<=...)
Matches if the current position in the string is preceded by a match for ... that ends at the current position. This example looks for a word following a hyphen:
m = re.search(r'(?<=-)\w+', 'spam-egg')
m.group(0)  # 'egg'
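In your example, you want to search after each occurrence of "Temperature " for one or more digits (\d+), optionally followed by a literal decimal point and more digits ((?:\.\d+)?). Combined with the lookbehind, the pattern is (?<=Temperature )\d+(?:\.\d+)? and the re.findall() function returns every match in the text.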
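As a rough sketch of how that could look with the sample data from the question (the variable names `text`, `pattern` and `temperatures` are just for illustration, and the pattern assumes exactly one space between the keyword and an unsigned decimal number):

```python
import re

text = """26.08.15 14:52:04 Pressure 1.02 Temperature 32.5 NOb 10993 VB 28772
.... <other stuff>
26.08.15 14:53:06 Pressure 1.03 Temperature 31.6 NOb 10993 VB 28008
"""

# Lookbehind for "Temperature ", then digits with an optional fractional part.
# The group is non-capturing, so findall() returns the whole matched number.
pattern = r"(?<=Temperature )\d+(?:\.\d+)?"

temperatures = [float(value) for value in re.findall(pattern, text)]
print(temperatures)  # [32.5, 31.6]
```

For a real file, `text` could come from reading the file, although for a very large file it may be preferable to apply the pattern one line at a time.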
Upvotes: 0
Reputation: 3410
You can use regular expression group matching:
import re
with open("example.txt") as f:
    for line in f:
        m = re.match(".* Temperature (.*?) .*", line)
        if m:
            try:
                number = float(m.group(1))
                print(number)
            except ValueError:
                pass # could print an error here because a number wasn't found in the expected place
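Since the question also asks where the string is, here is a small variation using match objects, which report positions as well as text. This is a sketch rather than a drop-in replacement: the names `readings` and `positions` are invented for the example, and the pattern assumes the value is a plain decimal number separated from the keyword by whitespace.

```python
import re

line = "26.08.15 14:52:04 Pressure 1.02 Temperature 32.5 NOb 10993 VB 28772"

# Group 1 is the keyword, group 2 is the number that follows it.
pattern = re.compile(r"\b(Pressure|Temperature)\s+(-?\d+(?:\.\d+)?)")

readings = {}
positions = {}
for m in pattern.finditer(line):
    readings[m.group(1)] = float(m.group(2))
    # start(2) is the index in `line` where the number begins
    positions[m.group(1)] = m.start(2)

print(readings)
print(positions)
```

Slicing `line` at the stored index, for example `line[positions["Temperature"]:]`, gives the rest of the line starting at the value.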
Upvotes: 2