Reputation: 658
I have this file (output.txt):
Username:traider
domain:domain.net
TECH-1366
Username:traider1
domain:domain.net
TECH-1367
I can get the values after Username and domain:
traider,domain.net
traider1,domain.net
but I don't know how to get TECH-XXX.
Desired output:
traider,domain.net,TECH-1366
traider1,domain.net,TECH-1367
Code:
import re

with open("output.txt", "r") as myfile:
    data = myfile.read()
people = re.findall(r'\bUsername:(\S+)\s+domain:(\S+)\s', data)
for personinfo in people:
    print(','.join(personinfo))
I can only get [TECH], which is incomplete and has brackets:
tech = re.findall(r'TECH-*', data)
Upvotes: 0
Views: 548
Reputation: 658
I finally found why nothing above worked: the file contained ^M (carriage return) characters.
They are visible only when the file is opened in vim; cat does not show them. Once I removed them with
import sys

sys.stdout = open('out.txt', 'wt')
with open("output.txt", "r") as myfile:
    data = myfile.read()
print(data.replace('\r', ''))
and used @Wiktor Stribiżew's code:
people = re.findall(r'\bUsername:(\S+)\s+domain:(\S+)\s+First Name:(\S+)\s+Last Name:(\S+)\s+(TECH-\d+)', data)
I got the desired results. Thanks, everyone!
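For reference, the same fix can be sketched in one pass, without redirecting sys.stdout. This is only a minimal sketch under assumptions: the helper name parse_records is hypothetical, and the sample string mimics the three-line records from the question with Windows line endings (the First Name and Last Name lines of the real file are left out).

```python
import re

# Hypothetical sample: the question's records with Windows (\r\n) line endings.
raw = (
    "Username:traider\r\ndomain:domain.net\r\nTECH-1366\r\n"
    "Username:traider1\r\ndomain:domain.net\r\nTECH-1367\r\n"
)

RECORD = re.compile(r'\bUsername:(\S+)\s+domain:(\S+)\s+(TECH-\d+)')


def parse_records(text):
    """Return (username, domain, ticket) tuples (hypothetical helper)."""
    # Drop carriage returns (shown as ^M in vim) before matching.
    cleaned = text.replace('\r', '')
    return RECORD.findall(cleaned)


for record in parse_records(raw):
    print(','.join(record))
```

In Python 3, opening the file in text mode with the default newline handling already translates \r\n to \n on read, so the explicit replace mostly matters for Python 2 or for files opened with newline=''.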
Upvotes: 0
Reputation: 6369
You don't need a regular expression for this; you can use the built-in str.split()
and then, for example, a list comprehension to "bundle" your data:
txt="""Username:traider
domain:domain.net
TECH-1366
Username:traider1
domain:domain.net
TECH-1367"""
l = txt.split()
# udt = [l[i:i + 3] for i in range(0, len(l), 3)]
# the loop below is equivalent to the list comprehension above
udt = []
for i in range(0, len(l), 3):
    udt.append(l[i:i + 3])
print(udt)
prints
[['Username:traider', 'domain:domain.net', 'TECH-1366'], ['Username:traider1', 'domain:domain.net', 'TECH-1367']]
To print that as desired:
for e in udt:
    print(",".join(map(lambda f: f.split(":")[-1], e)))
prints
traider,domain.net,TECH-1366
traider1,domain.net,TECH-1367
and combined:
d = [e.split(":")[-1] for e in txt.split()]
for i in range(0, len(d), 3):
    print(",".join(d[i:i + 3]))
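The combined version relies on every record having exactly three whitespace-separated tokens, so a missing line would silently shift every later row. A minimal sketch that makes this assumption explicit could look like the following; the helper name bundle and its size parameter are hypothetical, not part of the original answer.

```python
txt = """Username:traider
domain:domain.net
TECH-1366
Username:traider1
domain:domain.net
TECH-1367"""


def bundle(text, size=3):
    """Split on whitespace and group the values `size` at a time (hypothetical helper)."""
    # Keep only the part after the last ':' of each token; tokens without ':' stay as-is.
    values = [token.split(":")[-1] for token in text.split()]
    if len(values) % size != 0:
        # Assumption: a partial record means the input is malformed.
        raise ValueError("expected a multiple of %d fields, got %d" % (size, len(values)))
    return [values[i:i + size] for i in range(0, len(values), size)]


for row in bundle(txt):
    print(",".join(row))
```

Because str.split() with no arguments splits on any whitespace, including carriage returns, this approach is not affected by \r\n line endings.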
Upvotes: 0
Reputation: 494
This can be done by splitting the text into items, further splitting to obtain the useful text within each item, followed by some simple conditional formatting:
txt="""Username:traider
domain:domain.net
TECH-1366
Username:traider1
domain:domain.net
TECH-1367"""
out = ''
for item in txt.split():
    desired_value = item.split(':')[-1]
    out += desired_value
    # Test the original item: desired_value never contains ':' after the split
    if ':' in item:
        out += ','
    else:
        out += '\n'
Or using a generator expression:
''.join('%s,' % item.split(':')[-1] if ':' in item else '%s\n' % item for item in txt.split())
Output:
traider,domain.net,TECH-1366
traider1,domain.net,TECH-1367
Upvotes: 0
Reputation: 2830
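Try capturing the ticket ID as a third group. Use \s+ between the fields, because . does not match a line break unless re.DOTALL is passed:
people = re.findall(r'\bUsername:(\S+)\s+domain:(\S+)\s+(TECH-\d+)', data)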
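As a quick check on the sample from the question, the sketch below compares a .* separator with a \s+ separator in front of the TECH group. It assumes the three-line layout shown in the question; the variable names with_dot, with_dotall and with_space are illustrative only.

```python
import re

data = """Username:traider
domain:domain.net
TECH-1366
Username:traider1
domain:domain.net
TECH-1367
"""

# '.' stops at a line break, so '.*' cannot reach the TECH line: no matches.
with_dot = re.findall(r'\bUsername:(\S+)\s+domain:(\S+).*(TECH-\d+)', data)

# With re.DOTALL the greedy '.*' runs on to the last ticket in the text,
# so the two records collapse into a single match.
with_dotall = re.findall(
    r'\bUsername:(\S+)\s+domain:(\S+).*(TECH-\d+)', data, re.DOTALL
)

# '\s+' matches only the whitespace (including the line break) between fields.
with_space = re.findall(r'\bUsername:(\S+)\s+domain:(\S+)\s+(TECH-\d+)', data)

print(with_dot)
print(with_dotall)
for personinfo in with_space:
    print(','.join(personinfo))
```

The same reasoning explains the TECH-* attempt in the question: -* means zero or more hyphens, so the digits are never matched, whereas TECH-\d+ includes them. The brackets come from printing the list that re.findall returns rather than its items.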
Upvotes: 1