Reputation: 10191
I have a binary file mixed with ASCII in which there are some floating point numbers I want to find. The file contains some lines like this:
1,1,'11.2','11.3';1,1,'100.4';
In my favorite regex tester I found that the correct regex should be ([0-9]+\.{1}[0-9]+)
.
Here's the code:
import re
data = open('C:\\Users\\Me\\file.bin', 'rb')
pat = re.compile(b'([0-9]+\.{1}[0-9]+)')
print(pat.match(data.read()))
I do not get a single match, why is that? I'm on Python 3.5.1.
Upvotes: 1
Views: 1006
Reputation: 414565
How to find floating point numbers in binary file with Python?
float_re = br"[+-]? *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?"
for m in generate_tokens(r'C:\Users\Me\file.bin', float_re):
print(float(m.group()))
where float_re
is from this answer and generate_tokens()
is defined here.
pat.match()
tries to match at the very start of the input string and your string does not start with a float and therefore you "do not get a single match".
re.findall("\d+\.\d+", data)
produces TypeError
because the pattern is Unicode (str
) but data
is a bytes
object in your case. Pass the pattern as bytes
:
re.findall(b"\d+\.\d+", data)
Upvotes: 2
Reputation: 21466
You can try like this,
import re
with open('C:\\Users\\Me\\file.bin', 'rb') as f:
data = f.read()
re.findall("\d+\.\d+", data)
Output:
['11.2', '11.3', '100.4']
re.findall
returns string
list. If you want to convert to float you can do like this
>>> list(map(float, re.findall("\d+\.\d+", data)))
[11.2, 11.3, 100.4]
Upvotes: 2