Reputation: 67
I want to extract the names and the numbers from a given string and save them into two lists.
str = 'Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs .'
I want to achieve:
name = ['Dhoni','Kohli','Rohit','Dhawan']
values = ['100','150','50','250']
I tried to use a negative lookahead but did not succeed. My approach is to match a word, then a number, then a word again. Maybe this approach is wrong. How can this be achieved?
What I tried:
pattern = r'^[A-Za-z]+\s(?!)[a-z]'
print(re.findall(pattern,str))
Upvotes: 2
Views: 751
Reputation: 1
This code extracts each **Name** and **Number** from the given string into two lists, then stores them in a dictionary as key-value pairs.
import re
x = 'Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs.'
names=re.findall(r'[A-Z][a-z]*',x)
values=re.findall(r'[0-9]+',x)
dicts={}
for i in range(len(names)):
    dicts[names[i]]=values[i]
print(dicts)
#Input: Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs.
#Output: {'Dhoni': '100', 'Kohli': '150', 'Rohit': '50', 'Dhawan': '250'}
#Input: A has 5000 rupees and B has 15000 rupees.C has 85000 rupees and D has 50000 rupees .
#Output: {'A': '5000', 'B': '15000', 'C': '85000', 'D': '50000'}
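One caveat with running two separate `findall` calls: the lists only line up when every capitalized word is a name with a score. A minimal sketch of pairing each name with its number in a single pattern follows. It assumes the linking word is always "scored" (so it would not cover the "has ... rupees" input above), and `extract_scores` is a hypothetical helper name.

```python
import re

def extract_scores(text):
    """Return {name: score}, keeping only names directly followed by 'scored <number>'."""
    # Both groups come from the same match, so a name and its number cannot drift apart.
    pairs = re.findall(r'([A-Z][a-z]*)\s+scored\s+(\d+)', text)
    return dict(pairs)

text = 'Yesterday Dhoni scored 100 runs and Kohli scored 150 runs.'

# The standalone name pattern also picks up the sentence-initial word.
print(re.findall(r'[A-Z][a-z]*', text))  # ['Yesterday', 'Dhoni', 'Kohli']

# The paired pattern skips it because no score follows it.
print(extract_scores(text))  # {'Dhoni': '100', 'Kohli': '150'}
```

With the two-list version, the extra word would make `names` longer than `values`, and the indexing loop would raise an `IndexError`.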
Upvotes: 0
Reputation: 11929
The pattern seems to be "name scored value".
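>>> import re
>>> s = 'Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs .'
>>> res = re.findall(r'(\w+)\s*scored\s*(\d+)', s)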
>>> names, values = zip(*res)
>>> names
('Dhoni', 'Kohli', 'Rohit', 'Dhawan')
>>> values
('100', '150', '50', '250')
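Note that `zip(*res)` fails to unpack when there are no matches, because `zip()` with no arguments yields nothing. A small sketch that guards against this and returns lists instead of tuples is shown here; `split_scores` is a hypothetical name used only for illustration.

```python
import re

def split_scores(text):
    """Return (names, values) as two lists; both are empty when nothing matches."""
    res = re.findall(r'(\w+)\s*scored\s*(\d+)', text)
    if not res:
        # zip(*[]) produces no items, so unpacking into two names would raise ValueError.
        return [], []
    names, values = zip(*res)
    return list(names), list(values)

print(split_scores('Dhoni scored 100 runs and Kohli scored 150 runs.'))
print(split_scores('No scores in this sentence.'))
```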
Upvotes: 0
Reputation: 163207
You might use 2 capturing groups instead:
\b([A-Z][a-z]+)\s+scored\s+(\d+)\b
import re
pattern = r"\b([A-Z][a-z]+)\s+scored\s+(\d+)\b"
str = "Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs ."
matches = re.finditer(pattern, str)
name = []
values = []
for match in matches:
    # group(1) is the capitalized name, group(2) is the number after "scored"
    name.append(match.group(1))
    values.append(match.group(2))
print(name)
print(values)
Output:
['Dhoni', 'Kohli', 'Rohit', 'Dhawan']
['100', '150', '50', '250']
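If the group numbers get hard to follow, the same idea can be written with named groups. This is only a sketch of a variant, not a different technique; the group names `name` and `runs` and the variable `text` are my own choices.

```python
import re

# Same pattern as above, with named groups instead of numbered ones.
pattern = re.compile(r"\b(?P<name>[A-Z][a-z]+)\s+scored\s+(?P<runs>\d+)\b")
text = "Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs ."

names = []
runs = []
for match in pattern.finditer(text):
    names.append(match.group("name"))
    runs.append(match.group("runs"))

print(names)  # ['Dhoni', 'Kohli', 'Rohit', 'Dhawan']
print(runs)   # ['100', '150', '50', '250']
```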
Upvotes: 3