Reputation: 23
I have a text like this:
text = 'Ronald Mayr: A\nBell Kassulke: B\nJacqueline Rupp: A \nAlexander Zeller: C\nValentina Denk: C \nSimon Loidl: A \nElias Jovanovic: B \nStefanie Weninger: B \nFabian Peer: C \nHakim Botros: B\nEmilie Lorentsen: B\n'
I need to get all the names that have the value "B", for example Bell Kassulke and Elias Jovanovic.
I'm trying something like this:
stu = re.findall('\w+.*.: B',text)
but this gives me a list like this:
['Bell Kassulke: B',
'Simon Loidl: B',
'Elias Jovanovic: B']
But I only need the names, not the whole matched strings. What exactly can I do?
Upvotes: 2
Views: 112
Reputation: 356
Try this. Wrap the part you want to keep in a capturing group:
( - starts the capturing group
\w+ - matches one or more word characters (letters, digits, and underscore), as many as possible, giving back as needed (greedy)
.* - matches any character (except line terminators) zero or more times, as many as possible, giving back as needed (greedy)
. - matches any single character (except line terminators)
) - ends the capturing group
: B - matches the characters ": B" literally (case sensitive)
When the pattern contains a single capturing group, re.findall returns only the text of that group, so you get the names without the ": B" part.
pattern = r'(\w+.*.): B'
re.findall(pattern, text)
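As a minimal, self-contained sketch of the pattern above, applied to the `text` string from the question (the variable name `names` is only illustrative):

```python
import re

# The sample string from the question.
text = ('Ronald Mayr: A\nBell Kassulke: B\nJacqueline Rupp: A \n'
        'Alexander Zeller: C\nValentina Denk: C \nSimon Loidl: A \n'
        'Elias Jovanovic: B \nStefanie Weninger: B \nFabian Peer: C \n'
        'Hakim Botros: B\nEmilie Lorentsen: B\n')

# One capturing group: findall returns the group's text, not the whole match.
pattern = r'(\w+.*.): B'
names = re.findall(pattern, text)

print(names)
# ['Bell Kassulke', 'Elias Jovanovic', 'Stefanie Weninger', 'Hakim Botros', 'Emilie Lorentsen']
```

The raw-string prefix (r'...') keeps Python from treating \w as a string escape sequence.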
Upvotes: 1
Reputation: 627101
You can use
^(.*?):\s*B\s*$
See the regex demo
Details
^ - start of a string
(.*?) - Group 1 (the actual value returned by .findall): any zero or more chars other than line break chars, as few as possible
: - a colon
\s*B\s* - a B enclosed with zero or more whitespaces
$ - end of string
In Pandas, you may use
df['Col name here'].str.findall(r'^(.*?):\s*B\s*$').str.join(',')
Or, if you need a single match per value:
df['Results'] = df['Col name here'].str.extract(r'^(.*?):\s*B\s*$', expand=False)
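If the data is the single multi-line string from the question rather than a Pandas column, the same pattern can be used with plain re.findall. This is a sketch under one assumption: since every record sits on its own line, re.M (re.MULTILINE) is passed so that ^ and $ match at line boundaries instead of only at the start and end of the whole string. The variable names here are illustrative.

```python
import re

# The sample string from the question.
text = ('Ronald Mayr: A\nBell Kassulke: B\nJacqueline Rupp: A \n'
        'Alexander Zeller: C\nValentina Denk: C \nSimon Loidl: A \n'
        'Elias Jovanovic: B \nStefanie Weninger: B \nFabian Peer: C \n'
        'Hakim Botros: B\nEmilie Lorentsen: B\n')

pattern = r'^(.*?):\s*B\s*$'

# With re.M, ^ and $ anchor at every line, so each "Name: B" line matches.
names_multiline = re.findall(pattern, text, flags=re.M)

# Without the flag, ^ only matches at the very start of the string,
# and the first line has grade A, so nothing is found.
names_no_flag = re.findall(pattern, text)

print(names_multiline)
print(names_no_flag)
```

The trailing \s* is what tolerates the stray space after the grade in lines such as 'Elias Jovanovic: B '.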
Upvotes: 2
Reputation: 6574
You can add this line of code after your regex:
stu = [s.replace(': B', '') for s in stu]
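Put together with the original findall call from the question, that post-processing step might look like the sketch below. The name `names` is illustrative; note that str.replace removes every occurrence of ': B' in each string, which is fine for this data but would also alter a name that happened to contain that sequence.

```python
import re

# The sample string from the question.
text = ('Ronald Mayr: A\nBell Kassulke: B\nJacqueline Rupp: A \n'
        'Alexander Zeller: C\nValentina Denk: C \nSimon Loidl: A \n'
        'Elias Jovanovic: B \nStefanie Weninger: B \nFabian Peer: C \n'
        'Hakim Botros: B\nEmilie Lorentsen: B\n')

# The question's pattern: each match still carries the trailing ': B'.
stu = re.findall(r'\w+.*.: B', text)

# Strip the grade marker from every match, leaving only the name.
names = [s.replace(': B', '') for s in stu]

print(names)
```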
Upvotes: 0