Reputation: 195
I have a file that contains thousands of lines, and in the file, there are some lines like:
Line 115463: 08:59:25.106 08:59:24.992877 ASDF_IIS_CFGDB GenMod Ta-SNS__GENERATED_EVENTS (Event2f, DIR = 13) rrc_UlUtranMsg.c (../../../HEDGE/UL3/ASDF/UtranMsg/Uplink/Code/Src) 987
Line 236362: 08:59:28.647 08:59:28.597827 ASDF_IIS_CFGDB GenMod Ta-SNS__GENERATED_EVENTS (Eventab, DIR = 1) rrc_UlUtranMsg.c (../../../HEDGE/UL3/ASDF/UtranMsg/Uplink/Code/Src) 934
Line 324964: 08:59:40.456 08:59:40.403644 ASDF_IIS_CFGDB GenMod Ta-SNS__GENERATED_EVENTS (Eventac, DIR = 1) rrc_UlUtranMsg.c (../../../HEDGE/UL3/ASDF/UtranMsg/Uplink/Code/Src) 934
Line 341172: 08:59:40.659 08:59:40.616565 ASDF_IIS_CFGDB GenMod Ta-SNS__GENERATED_EVENTS (Eventfb, DIR = 13) rrc_UlUtranMsg.c (../../../HEDGE/UL3/ASDF/UtranMsg/Uplink/Code/Src) 987
Line 373186: 08:59:41.174 08:59:41.104755 ASDF_IIS_CFGDB GenMod Ta-SNS__GENERATED_EVENTS (Event2f, DIR = 1) rrc_UlUtranMsg.c (../../../HEDGE/UL3/ASDF/UtranMsg/Uplink/Code/Src) 934
Line 480217: 08:59:44.481 08:59:44.389453 ASDF_IIS_CFGDB GenMod Ta-SNS__GENERATED_EVENTS (Eventx1, DIR = 1) rrc_UlUtranMsg.c (../../../HEDGE/UL3/ASDF/UtranMsg/Uplink/Code/Src) 934
Line 505424: 08:59:44.777 08:59:44.701709 ASDF_IIS_CFGDB GenMod Ta-SNS__GENERATED_EVENTS (Event1a, DIR = 1) rrc_UlUtranMsg.c (../../../HEDGE/UL3/ASDF/UtranMsg/Uplink/Code/Src) 934
I only need to extract the substring '1a' from 'SNS__GENERATED_EVENTS (Event1a, DIR = 1)', and likewise for the other lines. So, basically, the two characters after '(Event'.
And I need to store these in a list or somewhere else where I can use them.
How can I do this?
So far, I have tried the following code but it gives me some values mixed in:
events = []
for line in input_txt_file:
    if "Ta-SNS__GENERATED_EVENTS " not in line: continue
    parts = line.split('Event')
    event_temp = [0]
    for i,part in enumerate(parts):
        if part.endswith("Ta-SNS__GENERATED_EVENTS ("): event_temp[0] = parts[i+1].split(None,1)[0].split(',',2)[0]
        events.append(event_temp)
print events
The output I am getting is:
[[0], [0], ['2f'], ['2f'], ['ab'], ['ab'], [0], [0], ['ac'], ['ac'], ['fb'], .......]
Upvotes: 1
Views: 2941
Reputation: 140168
No need for regex here: just split on 'Ta-SNS__GENERATED_EVENTS (Event' and take the first 2 letters of the 2nd field, if there is one:
events=[]
for line in input_txt_file:
    toks = line.split("Ta-SNS__GENERATED_EVENTS (Event")
    if len(toks)>1:
        events.append(toks[1][:2])
EDIT: found a one-liner equivalent:
events=[tok[:2] for line in input_txt_file for i,tok in enumerate(line.split("Ta-SNS__GENERATED_EVENTS (Event")) if i==1]
It uses enumerate and tests whether the index of the split item is 1, which means the line produced at least 2 items. In that case, it takes the first 2 chars of the token.
EDIT2: Amber has an even better idea, using partition to avoid the enumerate hack:
events=[t[:2] for t in (l.partition("Ta-SNS__GENERATED_EVENTS (Event")[2] for l in input_txt_file) if t]
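To make the behaviour of partition easier to see, here is a minimal sketch that wraps it in a helper. The names MARKER and extract_event are made up for illustration, and the sample lines are shortened stand-ins for the real log, not actual data from the file.

```python
MARKER = "Ta-SNS__GENERATED_EVENTS (Event"

def extract_event(line):
    """Return the two characters after MARKER, or None if MARKER is absent."""
    # partition returns (before, separator, after); the separator is ''
    # when MARKER does not occur in the line.
    _, sep, after = line.partition(MARKER)
    return after[:2] if sep else None

# Shortened, illustrative lines (assumption: the real lines look similar).
sample_lines = [
    "08:59:44.777 ASDF_IIS_CFGDB GenMod Ta-SNS__GENERATED_EVENTS (Event1a, DIR = 1) rrc_UlUtranMsg.c 934",
    "08:59:45.000 some unrelated log line",
]

events = [e for e in map(extract_event, sample_lines) if e is not None]
print(events)  # ['1a']
```

Checking the separator rather than the remainder also handles the edge case where the marker is the very last thing on a line.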
Upvotes: 1
Reputation: 92854
Short solution using the re.findall() function:
import re

# change to your actual file path
with open('./text_files/events.txt', 'r') as fh:
    l = re.findall(r'(?<=Ta-SNS__GENERATED_EVENTS \(Event)\w+', fh.read(), re.M)
print(l)
The output:
['2f', 'ab', 'ac', 'fb', '2f', 'x1', '1a']
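One thing to be aware of: \w+ takes every word character after 'Event', not just two. That is fine for the sample lines, but if an event name could ever be longer, \w{2} pins the match to exactly two characters. A small sketch of the difference, where the second input line is a hypothetical longer name invented for the comparison:

```python
import re

text = (
    "x Ta-SNS__GENERATED_EVENTS (Event1a, DIR = 1) y\n"
    "x Ta-SNS__GENERATED_EVENTS (Event2fzz, DIR = 13) y\n"  # hypothetical longer name
)

# \w+ grabs the whole run of word characters after the marker.
greedy = re.findall(r'(?<=Ta-SNS__GENERATED_EVENTS \(Event)\w+', text)
# \w{2} grabs exactly two characters.
exact = re.findall(r'(?<=Ta-SNS__GENERATED_EVENTS \(Event)\w{2}', text)

print(greedy)  # ['1a', '2fzz']
print(exact)   # ['1a', '2f']
```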
Upvotes: 1
Reputation: 4539
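I would do this as a substring search using the re module, personally.
import re
for line in input_txt_file:
    # In the log, ', DIR' is followed by ' = <number>)', so the pattern stops at ', DIR'
    match = re.search(r'SNS__GENERATED_EVENTS \(Event(.+?), DIR', line)
    if match:  # skip lines without the marker instead of raising AttributeError
        print(match.group(1))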
Upvotes: 1
Reputation: 51807
If you know that it's always going to be in that position you can simply do:
hexes = [line[99:101] for line in file]
If there are lines that don't contain that text you can do:
hexes = [line[99:101] for line in file if 'Ta-SNS__GENERATED_EVENTS' in line]
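The 99:101 slice depends on the exact column layout of the file. If you are not sure the columns line up on every row, a variation (not part of the answer above, just a sketch) is to locate the offset per line with str.find and slice from there. The helper name event_at_marker and the shortened sample lines are assumptions for illustration.

```python
MARKER = "(Event"

def event_at_marker(line):
    """Slice the two characters that follow MARKER, wherever it sits in the line."""
    start = line.find(MARKER)  # -1 when MARKER is not present
    if start == -1:
        return None
    start += len(MARKER)
    return line[start:start + 2]

# Illustrative lines with the marker at different columns.
lines = [
    "08:59:25.106 GenMod Ta-SNS__GENERATED_EVENTS (Event2f, DIR = 13) 987",
    "08:59:28.647    GenMod    Ta-SNS__GENERATED_EVENTS (Eventab, DIR = 1) 934",
    "08:59:30.000 nothing to see here",
]

hexes = [h for h in (event_at_marker(l) for l in lines) if h is not None]
print(hexes)  # ['2f', 'ab']
```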
Upvotes: 3
Reputation: 526573
If the line position is always fixed, Wayne's answer is the most efficient. If the position can vary a bit, this is a decent situation in which to use regex:
import re
events = []
for line in input_txt_file:
    match = re.search(r'SNS__GENERATED_EVENTS.*?Event(..)', line)
    if match:
        events.append(match.group(1))
This searches each line for SNS__GENERATED_EVENTS, followed by possibly some characters, followed by Event and then two more characters, and grabs those two characters.
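With a file of several hundred thousand lines, it may be worth compiling the pattern once and wrapping the loop in a function. The following is only a sketch: collect_events is a made-up name, and io.StringIO stands in for the real file so the snippet runs on its own.

```python
import io
import re

# Same pattern as above, compiled once up front.
PATTERN = re.compile(r'SNS__GENERATED_EVENTS.*?Event(..)')

def collect_events(lines):
    """Return the two-character event codes from an iterable of lines."""
    events = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            events.append(match.group(1))
    return events

# Stand-in for the opened log file (illustrative content only).
fake_file = io.StringIO(
    "Ta-SNS__GENERATED_EVENTS (Event2f, DIR = 13) a.c 987\n"
    "unrelated line mentioning Event99 only\n"
    "Ta-SNS__GENERATED_EVENTS (Eventab, DIR = 1) a.c 934\n"
)

print(collect_events(fake_file))  # ['2f', 'ab']
```

The middle line is skipped because it contains Event but not SNS__GENERATED_EVENTS, which is why the pattern anchors on the longer marker first.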
Upvotes: 6