Havishaa Sharma

Reputation: 67

exclude a pattern using regex in python

I want to extract Name and number from a given string and save it into two lists.

    str = 'Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs .'

I want to achieve:

    name = ['Dhoni','Kohli','Rohit','Dhawan']
    values = ['100','150','50','250']

I tried to use a negative lookahead but did not succeed. My approach is to match a word, then a number, then a word again. Maybe this approach is wrong. How can this be achieved?

What I tried:

    pattern = r'^[A-Za-z]+\s(?!)[a-z]'
    print(re.findall(pattern,str))

Upvotes: 2

Views: 751

Answers (3)

DRC my way

Reputation: 1

This code extracts the **name** and **number** values from the given string into two lists, then stores them in a dictionary as key-value pairs.

    import re

    x = 'Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs.'

    # Capitalised words are treated as names, runs of digits as values.
    names = re.findall(r'[A-Z][a-z]*', x)
    values = re.findall(r'[0-9]+', x)

    # Pair the two lists by position.
    dicts = {}
    for i in range(len(names)):
        dicts[names[i]] = values[i]
    print(dicts)  # print once, after the loop has finished

    # Input: Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs.
    # Output: {'Dhoni': '100', 'Kohli': '150', 'Rohit': '50', 'Dhawan': '250'}

    # Input: A has 5000 rupees and B has 15000 rupees.C has 85000 rupees and D has 50000 rupees .
    # Output: {'A': '5000', 'B': '15000', 'C': '85000', 'D': '50000'}
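One caveat, sketched here under stated assumptions: the two `findall` calls run independently, so the lists only line up when every capitalised word is a name with exactly one number after it. The input string below is hypothetical (the extra word "India" is not in the question), and the combined pattern relies on the literal word `scored` from the question's sentences.

```python
import re

# Hypothetical input: "India" is capitalised but is not a player name.
text = 'Dhoni scored 100 runs for India and Kohli scored 150 runs.'

# Independent searches drift apart: three "names" but only two values.
names = re.findall(r'[A-Z][a-z]*', text)   # ['Dhoni', 'India', 'Kohli']
values = re.findall(r'[0-9]+', text)       # ['100', '150']

# Capturing both parts in one pattern keeps each name with its own number.
pairs = dict(re.findall(r'([A-Z][a-z]*)\s+scored\s+([0-9]+)', text))
print(pairs)  # {'Dhoni': '100', 'Kohli': '150'}
```

With the positional loop, this input would pair 'India' with '150' and then raise an `IndexError` on the third name, which is why tying the name and number together in one pattern is usually the safer choice.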

Upvotes: 0

abc

Reputation: 11929

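The pattern seems to be: a name, the word "scored", then a value. Capture the name and the value on either side of the literal word (here `s` is the input string):

    >>> import re
    >>> res = re.findall(r'(\w+)\s*scored\s*(\d+)', s)
    >>> names, values = zip(*res)
    >>> names
    ('Dhoni', 'Kohli', 'Rohit', 'Dhawan')
    >>> values
    ('100', '150', '50', '250')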
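A self-contained version of this idea, as a minimal sketch: `s` is assumed to hold the question's input string, and the conversion to lists and the guard for an empty result are additions, since unpacking `zip(*res)` raises a `ValueError` when nothing matches.

```python
import re

# Assumed input: the string from the question, bound to `s`.
s = ('Dhoni scored 100 runs and Kohli scored 150 runs.'
     'Rohit scored 50 runs and Dhawan scored 250 runs .')

# With two groups in the pattern, findall returns a list of (name, value) tuples.
res = re.findall(r'(\w+)\s*scored\s*(\d+)', s)

# zip(*res) transposes the pairs into two sequences; fall back to empty
# lists when there were no matches at all.
names, values = (list(t) for t in zip(*res)) if res else ([], [])

print(names)
print(values)
```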

Upvotes: 0

The fourth bird

Reputation: 163207

You might use 2 capturing groups instead:

    \b([A-Z][a-z]+)\s+scored\s+(\d+)\b


    import re

    pattern = r"\b([A-Z][a-z]+)\s+scored\s+(\d+)\b"
    # Named `text` rather than `str` to avoid shadowing the built-in type.
    text = "Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs ."

    name = []
    values = []
    # Group 1 is the capitalised name, group 2 the number after "scored".
    for match in re.finditer(pattern, text):
        name.append(match.group(1))
        values.append(match.group(2))

    print(name)
    print(values)

Output:

    ['Dhoni', 'Kohli', 'Rohit', 'Dhawan']
    ['100', '150', '50', '250']

Upvotes: 3
