Reputation: 449
I am currently trying to use regex to isolate the numeric values within the strings in a list and append only the numbers to a new list. Yes, I am aware of this post (Regular Expressions: Search in list) and am using one of the answers from it, but for some reason it is still including the text part of the values in the new list.
[IN]:
['0.2 in', '1.3 in']
import re
snowamt = ['0.2 in', '1.3 in']
r = re.compile(r"\d*\.\d*")
newlist = list(filter(r.match, snowamt)) # Read Note
print(newlist)
[OUT]:
['0.2 in', '1.3 in']
I have tried so many combinations of regex and I just can't get it. Can someone please correct what I know is a stupid mistake? Here are just a few of the regexes I've tried:
"(\d*\.\d*)"
"\d*\.\d*\s"
"\d*\.\d*\s$"
"^\d*\.\d*\s$"
"^\d*\.\d*\s"
My end goal is to sum all the values in the list generated above. I was initially able to get around using re.compile by using re.split:
inches_n = []
i = 0
for n in snowamt:
    split = re.split(" ", n, maxsplit=0, flags=0)
    inches_n.append(split[0])
    i += 1
print(inches_n)
The problem is that the value '-- in' may show up in the original list, as I am getting the numbers by scraping a website (Weather Underground, which is okay to scrape). It would take fewer steps if I could just select the numbers initially with regex, because with re.split I have to add an extra step to iterate through the new list again and select only the numbers.
Anyway, can someone please correct my regex so I can move on with my life from this problem? Thank you!
Upvotes: 0
Views: 92
Reputation: 195583
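To get only the numbers from the list, capture them with a group and collect the matched text. filter(r.match, snowamt) only tests each string and keeps the whole original element whenever the pattern matches, which is why the text is still there. For example: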
import re
snowamt = ["0.2 in", "1.3 in"]
r = re.compile(r"(\d+\.?\d*)")
newlist = [m.group(1) for i in snowamt if (m := r.match(i))]
print(newlist)
Prints:
['0.2', '1.3']
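Since the end goal is a sum, and '-- in' may appear in the scraped list, the same idea can be extended as a rough sketch. The sample list below is made up for illustration, and the names number, values and total are my own. It assumes Python 3.8+ for the := operator and that every valid entry starts with the number.

```python
import re

# Hypothetical sample data; '-- in' stands in for a missing reading.
snowamt = ["0.2 in", "-- in", "1.3 in"]

# One or more digits, optionally followed by a decimal part.
number = re.compile(r"\d+\.?\d*")

# match() only looks at the start of each string, so '-- in'
# gives None and is skipped by the condition.
values = [float(m.group()) for s in snowamt if (m := number.match(s))]
total = sum(values)

print(values)
print(round(total, 1))
```

Converting to float inside the comprehension avoids a second pass over the list. Whole-number entries such as '3 in' would also match, because the decimal part of the pattern is optional.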
Upvotes: 1