Reputation: 181
I am trying to extract strings from data received by a function in Python.
Consider the following:
I have a Python script that runs continuously. It binds to a USB port and listens for incoming ZigBee data frames.
I have a function that disassembles each data frame:
# Decoded data frame to use later on
def decodeReceivedFrame(data):
    source_addr_long = toHex(data['source_addr_long'])
    source_addr = toHex(data['source_addr'])
    id = data['id']
    rf_data = data['rf_data']
    #samples = data['samples']
    return [source_addr_long, source_addr, id, rf_data]
When I print the result of this function later on, it gives me the correct incoming values. For example:
decodedData = decodeReceivedFrame(data)
print decodedData
Output:
[None, None, 'rx', '<=>\x80\x02#387239766#XBEE#126#STR:wm2 #STR:c47cb3f57365#']
What I want to do is extract the two STR values from this string, that is, wm2 and c47cb3f57365, into two separate variables.
Which function in Python would be the most efficient way to do this?
Upvotes: 1
Views: 839
Reputation: 180481
Presuming the data is always in the format discussed in the comments, this would be one of the most efficient ways:
s = '<=>\x80\x02#387239766#XBEE#126#STR:wm2 #STR:c47cb3f57365#'
# rsplit with 2 as the second arg splits at most twice on "#STR:", starting from the end of s
spl = s.rsplit("#STR:",2)
# get last two elements from spl
a,b = spl[1],spl[2][:-1] # [:-1] removes the final #
print a,b
wm2 c47cb3f57365
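One detail worth noting: if the payload really contains the space shown before the second #STR:, then spl[1] is 'wm2 ' with a trailing space. Below is a minimal sketch of the same rsplit idea wrapped in a helper that strips that whitespace and checks the field count. The function name extract_str_fields and its parameters are hypothetical, chosen only for illustration, and the code is written to run on both Python 2 and Python 3.

```python
def extract_str_fields(rf_data, marker="#STR:", count=2):
    """Return the last `count` values that follow `marker` in rf_data.

    Hypothetical helper: assumes each value ends with '#' or whitespace,
    as in the sample frame from the question.
    """
    # Split at most `count` times, starting from the right-hand side.
    parts = rf_data.rsplit(marker, count)
    if len(parts) != count + 1:
        raise ValueError("expected %d %r fields" % (count, marker))
    # Drop the trailing '#' delimiter and any surrounding whitespace.
    return [p.strip().rstrip("#").strip() for p in parts[1:]]


s = '<=>\x80\x02#387239766#XBEE#126#STR:wm2 #STR:c47cb3f57365#'
first, second = extract_str_fields(s)
print(repr(first))
print(repr(second))
```

The extra strip calls cost a little time compared with the bare slice, so this trades some speed for tolerance of stray whitespace.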
Some timings using ipython and timeit:
In [6]: timeit re.findall(r'STR:(\w+)', s)
1000000 loops, best of 3: 1.67 µs per loop
In [7]: %%timeit
   ...: spl = s.rsplit("#STR:",2)
   ...: a,b = spl[1],spl[2][:-1]
   ...:
1000000 loops, best of 3: 409 ns per loop
If you were to use a regex, you should compile the pattern first:
import re
patt = re.compile(r'STR:(\w+)')
patt.findall(s)
Compiling improves the efficiency:
In [6]: timeit patt.findall(s)
1000000 loops, best of 3: 945 ns per loop
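For anyone without IPython, the comparison can be reproduced with the standard timeit module. This is only a sketch: the wrapper function names are made up for illustration, calling through a function adds some overhead, and the absolute numbers will differ from the ones above depending on machine and Python version.

```python
import re
import timeit

s = '<=>\x80\x02#387239766#XBEE#126#STR:wm2 #STR:c47cb3f57365#'
patt = re.compile(r'STR:(\w+)')


def with_rsplit():
    spl = s.rsplit("#STR:", 2)
    return spl[1], spl[2][:-1]


def with_findall():
    # Pattern is looked up in re's internal cache on every call.
    return re.findall(r'STR:(\w+)', s)


def with_compiled():
    return patt.findall(s)


results = {}
for func in (with_rsplit, with_findall, with_compiled):
    # timeit.timeit accepts a callable and returns total seconds
    # for `number` calls.
    results[func.__name__] = timeit.timeit(func, number=100000)

for name in sorted(results):
    print("%s: %.4f s per 100000 calls" % (name, results[name]))
```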
Upvotes: 3
Reputation: 1623
>>> import re
>>> re.findall(r'STR:(\w+)', '<=>\x80\x02#387239766#XBEE#126#STR:wm2 #STR:c47cb3f57365#')
['wm2', 'c47cb3f57365']
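To get the two values into separate variables, as the question asks, the list returned by findall can be unpacked. A minimal sketch, assuming exactly two STR fields are expected in every frame; the names STR_FIELD and extract_with_regex are hypothetical, and anchoring the pattern on the leading # is an extra assumption about the frame format.

```python
import re

# Compiled once at import time; matches the word characters after "#STR:".
STR_FIELD = re.compile(r'#STR:(\w+)')


def extract_with_regex(rf_data):
    """Return the two STR values as a tuple, or raise if the count is off."""
    fields = STR_FIELD.findall(rf_data)
    if len(fields) != 2:
        raise ValueError("expected 2 STR fields, found %d" % len(fields))
    return fields[0], fields[1]


rf_data = '<=>\x80\x02#387239766#XBEE#126#STR:wm2 #STR:c47cb3f57365#'
first, second = extract_with_regex(rf_data)
print(first)
print(second)
```

Because \w+ stops at the first non-word character, the space after wm2 and the final # are excluded without any extra stripping.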
Upvotes: 1