Reputation: 377
I would like to extract the numbers from the string element below, which belongs to a list of length n, into a list, keeping each number in its original form:
list = ['25 birds, 1 cat, 4 dogs, 101 ants']
output = [25, 1, 4, 101]
I'm quite new to regex, so I've been trying the following:
[regex.findall("\d", list[i]) for i in range(len(list))]
However, the output is:
output = [2, 5, 1, 4, 1, 0, 1]
Upvotes: 0
Views: 457
Reputation: 298
Code:
import re
list_ = ['25 birds, 1 cat, 4 dogs, 101 ants']
output = list(map(int, re.findall(r'\d+', list_[0])))
print(output)
output:
[25, 1, 4, 101]
Explanation:
re.findall scans the string from left to right and returns a list of the matching substrings, in the order they were found.
map applies int to each item in that list of strings and returns a map object.
list: since a map object is an iterator, pass it to the list constructor to build a list from it.
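The three steps can also be looked at one at a time. The sketch below does that, and then flattens the numbers from every string in the list rather than only list_[0]. The names matches and all_numbers, and the second list element, are made up for illustration and are not from the question.

```python
import re

# The second element is a made-up example to show the multi-element case.
list_ = ['25 birds, 1 cat, 4 dogs, 101 ants', '7 fish']

# Step 1: findall returns the matched substrings, left to right.
matches = re.findall(r'\d+', list_[0])
print(matches)  # ['25', '1', '4', '101']

# Steps 2 and 3: map applies int lazily, list consumes the iterator.
output = list(map(int, matches))
print(output)  # [25, 1, 4, 101]

# Flatten the numbers from every string in the list.
all_numbers = [int(m) for s in list_ for m in re.findall(r'\d+', s)]
print(all_numbers)  # [25, 1, 4, 101, 7]
```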
Upvotes: 0
Reputation: 6935
Try this:
import re
list_ = ['25 birds, 1 cat, 4 dogs, 101 ants']
list(map(int, re.findall(r'\d+', list_[0])))
Output:
[25, 1, 4, 101]
Also, avoid using list as a variable name, since it shadows the built-in list.
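To see why the name matters, here is a small sketch. The functions shadowed and renamed are hypothetical, written only to keep the shadowing contained inside a function so it does not affect the rest of a session.

```python
import re

def shadowed():
    # Inside this function, list now refers to the data, not the built-in type.
    list = ['25 birds, 1 cat, 4 dogs, 101 ants']
    return list(map(int, re.findall(r'\d+', list[0])))

def renamed():
    # With a different name, the built-in list is still reachable.
    list_ = ['25 birds, 1 cat, 4 dogs, 101 ants']
    return list(map(int, re.findall(r'\d+', list_[0])))

try:
    shadowed()
    error = None
except TypeError as exc:
    error = str(exc)

print(error)      # 'list' object is not callable
print(renamed())  # [25, 1, 4, 101]
```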
Upvotes: 1
Reputation: 301
You can use the following function to achieve this. I used re.compile, given that a precompiled pattern is a bit faster than calling the re functions straight out of the module if you have really long lists.
I also used yield and finditer, since I do not know how long your lists will be; their lazy evaluation provides some memory efficiency.
import re
def find_numbers(iterable):
    NUMBER = re.compile(r'\d+')

    def numbers():
        for string in iterable:
            # search each string in turn, not the iterable itself
            yield from NUMBER.finditer(string)

    for number in numbers():
        yield int(number.group(0))
print(list(find_numbers(['25 birds, 1 cat, 4 dogs, 101 ants'])))
# [25, 1, 4, 101]
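As a rough illustration of the lazy evaluation mentioned above, the sketch below pulls only the first two numbers from a generator and leaves the rest unread until asked for. iter_numbers is a hypothetical helper written for this example, and itertools.islice is just one way to take a few items.

```python
import re
from itertools import islice

NUMBER = re.compile(r'\d+')  # compiled once, reused for every string

def iter_numbers(strings):
    """Lazily yield each integer found in each string (hypothetical helper)."""
    for s in strings:
        for match in NUMBER.finditer(s):
            yield int(match.group(0))

gen = iter_numbers(['25 birds, 1 cat, 4 dogs, 101 ants'])

# Only the first two matches are converted here.
first_two = list(islice(gen, 2))
print(first_two)  # [25, 1]

# The generator resumes where it stopped.
rest = list(gen)
print(rest)  # [4, 101]
```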
Upvotes: 0
Reputation: 23
We don't really need to use regex to get numbers from a string.
lst = ['25 birds, 1 cat, 4 dogs, 101 ants']
nums = [int(word) for item in lst for word in item.split() if word.isdigit()]
print(nums)
# [25, 1, 4, 101]
Equivalent without list comprehension:
lst = ['25 birds, 1 cat, 4 dogs, 101 ants']
nums = []
for item in lst:
    for word in item.split():
        if word.isdigit():
            nums.append(int(word))
print(nums)
# [25, 1, 4, 101]
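One thing to watch for: this works on the sample because every number is followed by a space. If punctuation is attached to a number, str.isdigit returns False for that word. The sketch below uses a made-up input string to show the difference, with str.strip and string.punctuation as one possible workaround (the names naive and stripped are illustrative).

```python
import string

lst = ['I saw 25, maybe 30 birds.']  # made-up input with "25," attached to a comma

# Plain split + isdigit misses "25," because of the trailing comma.
naive = [int(w) for item in lst for w in item.split() if w.isdigit()]
print(naive)  # [30]

# Stripping surrounding punctuation first recovers it.
stripped = [
    int(w.strip(string.punctuation))
    for item in lst
    for w in item.split()
    if w.strip(string.punctuation).isdigit()
]
print(stripped)  # [25, 30]
```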
Upvotes: 2