d789w

Reputation: 377

How can I extract numbers from a string into a list as individual elements in Python?

I would like to extract the numbers from the string elements of a list of length n into a new list, keeping each number whole:

list = ['25 birds, 1 cat, 4 dogs, 101 ants']

output = [25, 1, 4, 101]

I'm quite new to regex so I've been trying with the following:

[regex.findall("\d", list[i]) for i in range(len(list))]

However, the output is:

output = [2, 5, 1, 4, 1, 0, 1]

Upvotes: 0

Views: 457

Answers (5)

AnswerSeeker

Reputation: 298

Code:

import re

list_ = ['25 birds, 1 cat, 4 dogs, 101 ants']
output = list(map(int, re.findall(r'\d+', list_[0])))  # raw string avoids an invalid-escape warning
print(output)

Output:

[25, 1, 4, 101]

Explanation:

re.findall scans the string from left to right and returns the matches as a list of strings, in the order found.

map applies int to each string in that list and returns a map object.

list: since a map object is an iterator, passing it to list builds a list from its items.
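The question mentions a list of length n, while the code above only reads the first element. The same findall idea can be applied to every string in the list. This is a minimal sketch; the function name extract_numbers and the two-string sample are illustrative assumptions, not part of the original answer:

```python
import re

def extract_numbers(strings):
    # Collect every run of digits from every string, in order, as ints.
    return [int(match) for text in strings for match in re.findall(r'\d+', text)]

print(extract_numbers(['25 birds, 1 cat', '4 dogs, 101 ants']))
# [25, 1, 4, 101]
```

The nested comprehension flattens the per-string results into one list, so no indexing such as list_[0] is needed.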

Upvotes: 0

Arkistarvh Kltzuonstev

Reputation: 6935

Try this:

import re
list_ = ['25 birds, 1 cat, 4 dogs, 101 ants']
list(map(int, re.findall(r'\d+', list_[0])))

Output:

[25, 1, 4, 101]

Also, avoid using list as a variable name, since it shadows the built-in list type.
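To illustrate why that name is risky, here is a small sketch of what happens once list refers to a data value instead of the built-in type. The function name shadowing_demo is a hypothetical helper used only to keep the shadowing local:

```python
def shadowing_demo():
    # Inside this function, `list` now refers to a plain list object,
    # hiding the built-in type of the same name.
    list = ['25 birds, 1 cat, 4 dogs, 101 ants']
    try:
        list(map(int, ['25', '1']))  # tries to call the list object
    except TypeError as exc:
        return str(exc)

print(shadowing_demo())
```

The call fails with a TypeError because a list object is not callable, which is why the answers here use list_ or lst instead.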

Upvotes: 1

awakenedhaki

Reputation: 301

You can use the following function to achieve this. I used re.compile because a precompiled pattern can be slightly faster than calling the module-level re functions repeatedly, which may matter if you have really long lists.

I also used yield and finditer since I do not know how long your lists will be. Both are lazily evaluated, so this keeps memory use low.

import re

def find_numbers(iterable):
    NUMBER = re.compile(r'\d+')  # one or more consecutive digits

    def numbers():
        for string in iterable:
            # Search each string, not the whole iterable
            yield from NUMBER.finditer(string)

    for number in numbers():
        yield int(number.group(0))

print(list(find_numbers(['25 birds, 1 cat, 4 dogs, 101 ants'])))
# [25, 1, 4, 101]

Upvotes: 0

AngusWR

Reputation: 23

We don't really need to use regex to get numbers from a string.

lst = ['25 birds, 1 cat, 4 dogs, 101 ants']
nums = [int(word) for item in lst for word in item.split() if word.isdigit()]
print(nums)
# [25, 1, 4, 101]

Equivalent without list comprehension:

lst = ['25 birds, 1 cat, 4 dogs, 101 ants']
nums = []
for item in lst:
    for word in item.split():
        if word.isdigit():
            nums.append(int(word))
print(nums)
# [25, 1, 4, 101]
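One caveat worth knowing: the split/isdigit approach only finds numbers that are separated from everything else by whitespace. The sketch below uses a made-up input string (an assumption, not the asker's data) where some numbers touch punctuation, and compares it with a regex for reference:

```python
import re

text = '25 birds,4 dogs, 7, 101 ants'

# split() yields tokens such as 'birds,4' and '7,', which fail isdigit().
split_based = [int(word) for word in text.split() if word.isdigit()]

# A digit-run pattern does not depend on the surrounding whitespace.
regex_based = [int(match) for match in re.findall(r'\d+', text)]

print(split_based)  # [25, 101]
print(regex_based)  # [25, 4, 7, 101]
```

For input shaped exactly like the question's example, both approaches give the same result.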

Upvotes: 2

Shep

Reputation: 166

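You're missing a +.

Your findall should use "\d+", not just "\d". The + matches one or more consecutive digits, so each number stays whole instead of being split into single digits.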
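A quick side-by-side on the string from the question shows the difference between the two patterns (the variable names are just for illustration):

```python
import re

text = '25 birds, 1 cat, 4 dogs, 101 ants'

single_digits = re.findall(r'\d', text)   # each digit is its own match
digit_runs = re.findall(r'\d+', text)     # consecutive digits are grouped

print(single_digits)  # ['2', '5', '1', '4', '1', '0', '1']
print(digit_runs)     # ['25', '1', '4', '101']
```

Note that findall returns strings either way, so int still has to be applied to get numbers.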

Upvotes: 1
