Miakatt

Reputation: 51

Locate position of string in file python

I have a large txt file in which I want to locate a particular set of strings and extract the numbers that follow them. For example:

26.08.15 14:52:04 Pressure 1.02 Temperature 32.5 NOb 10993 VB 28772  
.... <other stuff>
26.08.15 14:53:06 Pressure 1.03 Temperature 31.6 NOb 10993 VB 28008 
.... <other stuff>

etc.

I want to find the string "Temperature" and extract the numerical value that follows it. I've seen examples that tell me whether the string exists, but nothing that tells me where it is or how to index the information that follows it. Is this something that can be done in Python?

Upvotes: 2

Views: 3180

Answers (3)

MadRabbit

Reputation: 2520

I hate regular expressions, so here is a pure Python solution.

lines = "26.08.15 14:52:04 Pressure 1.02 Temperature 32.5 NOb 10993 VB 28772 .... 26.08.15 14:53:06 Pressure 1.03 Temperature 31.6 NOb 10993 VB 28008 ...."
lines = lines.split()
for n, word in enumerate(lines):  
    if word in ['Temperature', 'Pressure']:
        print(word, lines[n+1]) 
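
The same idea can be applied line by line, which suits a large file better than splitting one long string. The following is only a sketch: the function name extract_values and its keys parameter are made up for illustration and are not part of the answer above. It also guards against two cases the snippet above does not handle, namely a key that is the last word on a line and a key that is not followed by a number.

```python
def extract_values(lines, keys=("Temperature", "Pressure")):
    """Yield (key, value) pairs for each key word followed by a number.

    Hypothetical helper for illustration; `lines` is any iterable of strings,
    for example an open file object.
    """
    for line in lines:
        words = line.split()
        # Pair each word with the one after it; zip stops before the last
        # word, so a key at the very end of a line cannot raise IndexError.
        for word, following in zip(words, words[1:]):
            if word in keys:
                try:
                    yield word, float(following)
                except ValueError:
                    continue  # the key was not followed by a number


sample = [
    "26.08.15 14:52:04 Pressure 1.02 Temperature 32.5 NOb 10993 VB 28772",
    ".... <other stuff>",
    "26.08.15 14:53:06 Pressure 1.03 Temperature 31.6 NOb 10993 VB 28008",
]
results = list(extract_values(sample))
print(results)
# [('Pressure', 1.02), ('Temperature', 32.5), ('Pressure', 1.03), ('Temperature', 31.6)]
```

To run it on a real file, pass the open file object instead of the list, for example inside a with open(...) as f: block. Iterating over a file reads one line at a time, so the whole file never has to fit in memory.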

Upvotes: 2

giraffe.guru

Reputation: 520

This can be done by reading the file word by word, or by using Python's regular expressions. In my opinion, regular expressions lead to more concise code without loss of readability, so I'll focus on that solution.

From the Python documentation for the re module (https://docs.python.org/3/library/re.html):

(?<=...) Matches if the current position in the string is preceded by a match for ... that ends at the current position.

This example looks for a word following a hyphen:

import re
m = re.search(r'(?<=-)\w+', 'spam-egg')  # raw string keeps \w intact
m.group(0)  # 'egg'

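In your example, you want to search after each occurrence of "Temperature " for one or more digits \d+, optionally a literal decimal point \.? and any further digits \d*. (Note that \d+? would not make the digits optional; it is a non-greedy "one or more".) The re.findall() function returns every match in the text as a list, which is useful here.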
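
As a rough sketch of how the lookbehind and re.findall() fit together, the snippet below runs on the two sample lines from the question. The variable names are illustrative, and the pattern is a slightly stricter variant that only accepts a decimal point when digits follow it.

```python
import re

text = (
    "26.08.15 14:52:04 Pressure 1.02 Temperature 32.5 NOb 10993 VB 28772\n"
    "26.08.15 14:53:06 Pressure 1.03 Temperature 31.6 NOb 10993 VB 28008\n"
)

# The lookbehind anchors the match right after "Temperature "; the pattern
# then takes digits, optionally followed by a decimal point and more digits.
TEMPERATURE = re.compile(r"(?<=Temperature )\d+(?:\.\d+)?")

# findall returns the matched substrings; convert them to floats.
temperatures = [float(value) for value in TEMPERATURE.findall(text)]
print(temperatures)  # [32.5, 31.6]

# finditer yields match objects, which also report where each value starts.
positions = [m.start() for m in TEMPERATURE.finditer(text)]
```

The positions list answers the "where is it" part of the question: each entry is the index in text at which a temperature value begins.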

Upvotes: 0

tjjjohnson

Reputation: 3410

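You can use regular expression group matching. The parentheses capture whatever sits between "Temperature " and the next space, and m.group(1) returns it: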

import re
with open("example.txt") as f:
    for line in f:
        m = re.match(".* Temperature (.*?) .*", line)
        if m:
            try:
                number = float(m.group(1))
                print(number)
            except ValueError:
                pass # could print an error here because a number wasn't found in the expected place

Upvotes: 2
