JodyK

Reputation: 105

Match grouped multi-line float values between brackets

In the example data below, I want to extract all the float values between brackets that belong to "group1" only, and not the values from other groups ("group2", "group3", etc.). The extraction has to be done with a regex in Python. Is this possible with a regex at all?


Regex pattern attempts:

I tried the following patterns, but they capture either everything or nothing:

  1. Matches every float value in all groups: ([+-]*\d+\.\d+),
  2. Matches no value in any groups: group1 = \[ ([+-]*\d+\.\d+), \]
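
The two attempts above can be reproduced with a short script. This is only a sketch: `data` is a shortened copy of the example data, and the variable names are chosen here for illustration.

```python
import re

# Shortened copy of the example data (two groups only), for illustration.
data = """group1 = [
 1.0,
 -2.0,
]

group2 = [
 2.0,
 1.5,
]
"""

# Attempt 1: the pattern never mentions the group name, so it hits every group.
all_values = re.findall(r'([+-]*\d+\.\d+),', data)
print(all_values)  # ['1.0', '-2.0', '2.0', '1.5']

# Attempt 2: the space after \[ must match a literal space, but the data has a
# newline there, and the group allows only a single float, so nothing matches.
no_values = re.findall(r'group1 = \[ ([+-]*\d+\.\d+), \]', data)
print(no_values)  # []
```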


What should I do to make this work? Any suggestions would be very welcome!


Example data:

group1 = [
 1.0,
 -2.0,
 3.5,
 -0.3,
 1.7,
 4.2,
]


group2 = [
 2.0,
 1.5,
 1.8,
 -1.8,
 0.7,
 -0.3,
]


group1 = [
  0.0,
  -0.5,
  1.3,
  0.8,
  -0.4,
  0.1,
]

Upvotes: 3

Views: 1284

Answers (2)

Neelix

Reputation: 143

Try this:

\bgroup1 = \[([\s\d.,+-]+)\]

This probably isn't the most optimized solution, but I made it in just a few minutes using this website: http://www.regexr.com/

This is by far the best resource I have found so far for creating regular expressions. It has great examples, a reference and a cheat sheet. Paste in your example text and you can tweak the regex and watch the matches update in real time. Hover over the expression and it will give you details on each part.
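
A pattern of this kind can be used from Python in two steps: first capture the body of each group1 block, then pull the floats out of each body. The following is a minimal sketch under that assumption; `text`, `block_re` and `float_re` are illustrative names, and the data is a shortened version of the example in the question.

```python
import re

text = """group1 = [
 1.0,
 -2.0,
]

group2 = [
 2.0,
]

group1 = [
  0.5,
]
"""

# Step 1: capture everything between the brackets of each group1 block.
block_re = re.compile(r'\bgroup1 = \[([^\]]*)\]')
# Step 2: pick the signed floats out of each captured body.
float_re = re.compile(r'[+-]?\d+\.\d+')

values = [float(m) for body in block_re.findall(text) for m in float_re.findall(body)]
print(values)  # [1.0, -2.0, 0.5]
```

Splitting the work into two patterns keeps each one simple, at the cost of a second pass over the captured text.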

Upvotes: 0

Taku

Reputation: 33744

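Here's a regex I created, r'group1 = \[\n([-\d., \n]+)\]':

import re

s = '''group1 = [
 1.0,
 -2.0,
 3.5,
 -0.3,
 1.7,
 4.2,
]


group2 = [
 2.0,
 1.5,
 1.8,
 -1.8,
 0.7,
 -0.3,
]


group1 = [
  0.0,
  -0.5,
  1.3,
  0.8,
  -0.4,
  0.1,
]'''

# Capture the body of every "group1 = [...]" block. The character class
# allows only digits, '.', ',', '-', spaces and newlines.
blocks = re.findall(r'group1 = \[\n([-\d., \n]+)\]', s)
# Split each captured body on commas and convert the non-empty pieces to float.
groups = [float(f) for block in blocks for f in block.split(',') if f.strip()]
print(groups)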

Output:

[1.0, -2.0, 3.5, -0.3, 1.7, 4.2, 0.0, -0.5, 1.3, 0.8, -0.4, 0.1]
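
The output above flattens both group1 blocks into a single list. If the blocks need to stay separate, or the group name varies, a small helper along these lines may be useful. It is only a sketch: `extract_group` and `sample` are names made up for this example, and it assumes the same layout as the data in the question (one value per line, trailing commas).

```python
import re

def extract_group(text, name):
    """Return one list of floats per '<name> = [...]' block found in text."""
    # re.escape guards against regex metacharacters in the group name.
    pattern = re.compile(r'\b' + re.escape(name) + r' = \[\n([-\d., \n]+)\]')
    return [
        [float(piece) for piece in body.split(',') if piece.strip()]
        for body in pattern.findall(text)
    ]

sample = "group1 = [\n 1.0,\n -2.0,\n]\n\ngroup2 = [\n 2.0,\n]\n\ngroup1 = [\n  0.0,\n  -0.5,\n]"
print(extract_group(sample, 'group1'))  # [[1.0, -2.0], [0.0, -0.5]]
print(extract_group(sample, 'group2'))  # [[2.0]]
```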

Upvotes: 1
