Reputation: 1
I'm trying to extract unit information from a text file. This function always returns 'm' regardless of the real unit in the file. What am I doing wrong?
def get_seba_unit(file):
    with open(file) as f:
        unit = ''
        lines = f.readlines()
        if lines[10].find('m'):
            unit = 'm'
        elif lines[10].find('cm'):
            unit = 'cm'
        elif lines[10].find('°C'):
            unit = '°C'
    print('found Unit: ' + unit + ' for sensor: ' + file)
    return(unit)
Upvotes: 0
Views: 306
Reputation: 9863
If what you're looking for is a way to extract units from your data, I'd use a simple regex like the one below:
import io
import re
from collections import defaultdict

# In-memory stand-in for the real file
data = io.StringIO("""
1cm
2m
3°C
1cm 10cm
2m 20m
3°C 30°C
""")

def get_seba_unit(file):
    # Raw string so the backslashes reach the regex engine unchanged
    floating_point_regex = r"([-+]?\d*\.\d+|\d+)"
    content = file.read()
    res = defaultdict(set)
    for suffix in ['cm', 'm', '°C']:
        # A number immediately followed by the unit suffix
        p = re.compile(floating_point_regex + suffix)
        matches = p.findall(content)
        for m in matches:
            res[suffix].add(m)
    return dict(res)

print(get_seba_unit(data))
And you'd get an output like this one:
{'cm': {'1', '10'}, '°C': {'3', '30'}, 'm': {'2', '20'}}
Of course, the above code assumes that each unit is immediately preceded by a number (integer or floating point), but the main idea is to attack this problem with regular expressions.
Upvotes: 0
Reputation: 15079
This does not do what you think it does:
if lines[10].find('m'):
find returns the index of the thing you are looking for, or -1 if it's not found. So unless m is the first character on the line (index 0), your condition will always be True (in Python, any non-zero number is truthy, including -1).
You might want to try this instead:
if 'm' in lines[10]:
Also, check for cm before m, otherwise you'll never find cm, because every line that contains cm also contains m.
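Putting both points together, a minimal sketch of a corrected version could look like the following. The helper name detect_unit and the line_index parameter are my own additions for illustration, not part of the original code, and the check is still a plain substring test, so a line whose text happens to contain the letter m elsewhere would still be reported as m.

```python
def detect_unit(line):
    """Return the first unit found in line, or '' if none matches."""
    # Check 'cm' before 'm' so that 'cm' is not shadowed by 'm'.
    for unit in ('cm', '°C', 'm'):
        if unit in line:
            return unit
    return ''


def get_seba_unit(path, line_index=10):
    # line_index=10 mirrors lines[10] from the question (assumed layout).
    with open(path) as f:
        lines = f.readlines()
    unit = detect_unit(lines[line_index])
    print('found Unit: ' + unit + ' for sensor: ' + path)
    return unit


# Why the original test misfires: find returns an index, not a boolean.
print('5 cm'.find('m'))  # 3  -> truthy
print('5 °C'.find('m'))  # -1 -> also truthy
print('m'.find('m'))     # 0  -> the only falsy case
```

With this ordering, detect_unit('12 cm') gives 'cm' and detect_unit('3 m') gives 'm'.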
Upvotes: 1