Kosu K.

Reputation: 75

Python Regex Match floating numbers a certain number of times

I have data that looks like this:

data = '5.2 -34 435, 34 2.908 3, 50 2 54 3, 40 50'

I am trying to write a regex such that each item in the Python list returned by re.findall contains at most 3 numbers, with the commas acting as boundaries as shown above.

If more than 3 numbers appear between two commas, the remaining numbers before the next comma should go into the next list item (which again holds at most 3 numbers).

Ideally the output for the above data would look like this:

['5.2 -34 435','34 2.908 3','50 2 54','3','40 50']

I tried the following after looking at tutorials, but it doesn't work well:

re.findall(r"[-+A-z0-9.\s]{3}", data)

Upvotes: 3

Views: 78

Answers (2)

anubhava

Reputation: 785128

This can be achieved in a single findall call using this regex:

[+-]?\d+(?:\.\d+)?(?:\s+[+-]?\d+(?:\.\d+)?){0,2}


  • Here we are using [+-]?\d+(?:\.\d+)? as a pattern for matching a signed number that may or may not be a floating point number.
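  • (?:\s+[+-]?\d+(?:\.\d+)?){0,2} matches 0 to 2 more instances of that number, each preceded by whitespace. Because a comma is not whitespace, a match never extends across a comma.

Code:
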
import re
data = '5.2 -34 435, 34 2.908 3, 50 2 54 3, 40 50'
rx = re.compile(r'[+-]?\d+(?:\.\d+)?(?:\s+[+-]?\d+(?:\.\d+)?){0,2}')
print(rx.findall(data))

Output:

['5.2 -34 435', '34 2.908 3', '50 2 54', '3', '40 50']
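
To make the repetition easier to see, the same pattern can be assembled from a single number sub-pattern. This is only a sketch: the names NUM, group_numbers and max_per_group are introduced here for illustration and are not part of the question.

```python
import re

# NUM and group_numbers are hypothetical names used only for this sketch.
NUM = r'[+-]?\d+(?:\.\d+)?'  # optionally signed integer or decimal number

def group_numbers(text, max_per_group=3):
    # One number, followed by up to (max_per_group - 1) more numbers,
    # each separated by whitespace only. A comma is not whitespace,
    # so a single match never crosses a comma.
    pattern = re.compile(rf'{NUM}(?:\s+{NUM}){{0,{max_per_group - 1}}}')
    return pattern.findall(text)

data = '5.2 -34 435, 34 2.908 3, 50 2 54 3, 40 50'
print(group_numbers(data))
# ['5.2 -34 435', '34 2.908 3', '50 2 54', '3', '40 50']
```

All groups in the pattern are non-capturing, so findall returns the full matched text rather than the contents of a group.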

Upvotes: 3

Will Da Silva

Reputation: 7040

No need to use regex for this. The following code should do what you want:

def flatten(x):
    return [item for sublist in x for item in sublist]

def chunk(x, n):
    return [x[i:i + n] for i in range(0, len(x), n)]

data = '5.2 -34 435, 34 2.908 3, 50 2 54 3, 40 50'
chunked = [chunk(x.strip().split(' '), 3) for x in data.split(',')]
output = [' '.join(x) for x in flatten(chunked)]
print(output)
# ['5.2 -34 435', '34 2.908 3', '50 2 54', '3', '40 50']

We split the data on commas, then split each piece into numbers on the space character. Each resulting sublist of numbers is broken into chunks of at most 3. We then flatten away one level of nesting, leaving a list of lists, each containing up to 3 numbers (as strings). Finally, to build output, we join the numbers in each chunk with spaces between them.
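
One caveat with split(' ') is that repeated spaces or an empty piece between two commas produce empty strings. A possible variation, assuming such input can occur, is to call split() with no argument, which splits on any run of whitespace and drops empty tokens. The name group_numbers_split is made up for this sketch.

```python
def chunk(items, n):
    # Break a list into consecutive slices of at most n items.
    return [items[i:i + n] for i in range(0, len(items), n)]

def group_numbers_split(text, n=3):
    # Hypothetical helper: same split/chunk/join approach as above.
    groups = []
    for piece in text.split(','):
        tokens = piece.split()  # no argument: any whitespace run, no empty strings
        groups.extend(' '.join(c) for c in chunk(tokens, n))
    return groups

data = '5.2 -34 435, 34 2.908 3, 50 2 54 3, 40 50'
print(group_numbers_split(data))
# ['5.2 -34 435', '34 2.908 3', '50 2 54', '3', '40 50']
```

Unlike the regex approach, this does not check that each token is actually a number; it only groups whatever is separated by whitespace.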

Upvotes: 2
