Reputation: 3923
I have the following line of data in a Python string:
line = '3520005,"Toronto (Ont.)",C ,F,2503281,2481494,F,F,0.9,1040597,979330,630.1763,3972.4,1\r\n'
and I want to get the first number sandwiched between two commas, which in this case would be 2503281 (from ,2503281,).
However, what I came up with doesn't work properly: m = re.search("\,([0-9])*\,",line) retains only the last digit of the number.
Upvotes: 1
Views: 161
Reputation: 103814
Be warned that using a regex to parse comma-separated values is fragile and error-prone, and it is easy to overlook cases such as quoted fields that contain commas. If you can coerce this into something the csv module can handle, you will be better off.
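As a rough sketch of the csv route, assuming the data arrives as a single string like the one in the question: the names fields and first_number are only illustrative, and the "skip the first and last field" rule is my reading of "sandwiched between two commas".

```python
import csv

line = '3520005,"Toronto (Ont.)",C ,F,2503281,2481494,F,F,0.9,1040597,979330,630.1763,3972.4,1\r\n'

# csv.reader accepts any iterable of strings, so a one-element list works.
# It also handles the quoted "Toronto (Ont.)" field and the trailing \r\n.
fields = next(csv.reader([line]))

# The first and last fields are not surrounded by commas, so skip them
# and take the first remaining field that consists only of digits.
first_number = next(f for f in fields[1:-1] if f.isdigit())
print(first_number)
```

Because the parsing is left to csv.reader, a quoted field that happens to contain a comma would not throw off the field positions, which is the main thing a hand-written regex tends to get wrong.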
That said, this works:
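import re
# Sample data; the regex scans the whole string for integers and decimals.
st = '''3520005,"Toronto (Ont.)",
C ,F,2503281,2481494,F,F,0.9,1040597,979330,630.1763,3972.4,1\r\n'''
# print(...) with a single argument works in both Python 2 and Python 3.
print(re.findall(r"(\d+\.?\d*)", st))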
prints:
['3520005', '2503281', '2481494', '0.9', '1040597', '979330', '630.1763', '3972.4', '1']
The pattern (\d+\.?\d*) matches one or more digits, optionally followed by a decimal point and any further digits.
Upvotes: 0
Reputation: 132018
Here is a non-regex solution:
>>> [item for item in line.split(',')[1:] if item.isdigit()][0]
'2503281'
Upvotes: 3
Reputation: 191749
The asterisk needs to go inside of the parentheses:
",([0-9]*),"
Otherwise you only capture one of the digits. You also don't need the backslashes before the commas, but that doesn't matter.
You may also want to use + instead of * to ensure that there is at least one digit, or even set a min/max limit on digits using {}.
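To illustrate the difference, here is a small sketch run against the line from the question. The variable names broken, fixed and bounded are only illustrative, and the {1,10} limit is an arbitrary choice for the example.

```python
import re

line = '3520005,"Toronto (Ont.)",C ,F,2503281,2481494,F,F,0.9,1040597,979330,630.1763,3972.4,1\r\n'

# Quantifier outside the group: the group is re-captured on every
# repetition, so only the last digit it matched is kept.
broken = re.search(r",([0-9])*,", line)
print(broken.group(0))  # the whole match, commas included
print(broken.group(1))  # only the final digit

# Quantifier inside the group: the group captures the whole run of digits.
# Using + instead of * requires at least one digit.
fixed = re.search(r",([0-9]+),", line)
print(fixed.group(1))

# Optional: bound the number of digits with {min,max}.
bounded = re.search(r",([0-9]{1,10}),", line)
print(bounded.group(1))
```

The whole match is the same in every case; only what ends up in group 1 changes.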
Upvotes: 3