Thomas

Reputation: 65

Python regex split on plus or minus and keep character

I have a set of data like this:

data_list = ['0+.25+4.06+5.12', '0+0-.033+933.00+9+48.002']

The only delimiters are the plus and minus signs. I want to keep the plus or minus signs but still split on them. The leading 0 at the start of each element is not needed either.

Here's what I have so far:

import re
    
data_list = ['0+.25+4.06+5.12', '0+0-.033+933.00+9+48.002']
data_string = ""
for item in data_list:
    data_string += item[1:]
data_string = re.split(r', |\+|-', data_string)
new_data_list = []
for item in data_string:
    if item:
        new_data_list.append(item)

print(new_data_list)

This gives me close to the right output:

['.25', '4.06', '5.12', '0', '.033', '933.00', '9', '48.002']

but now I cannot determine which one is positive or negative.

I would like the output to look like this:

['.25', '4.06', '5.12', '0', '-.033', '933.00', '9', '48.002']

where I can see that .033 is a negative number.

Upvotes: 1

Views: 1456

Answers (3)

anubhava

Reputation: 785631

It can be done in a single findall without any loop:

import re
data_list = ['0+.25+4.06+5.12', '0+0-.033+933.00+9+48.002']

print(re.findall(r'-?(?!0+[ +])\d*\.?\d+', ' '.join(data_list)))

Output:

['.25', '4.06', '5.12', '0', '-.033', '933.00', '9', '48.002']

RegEx Details:

  • -?: Match an optional -
  • (?!0+[ +]): Negative lookahead that fails the match when one or more 0s are immediately followed by a space or a +, which skips the leading 0 of each element
  • \d*\.?\d+: Match an integer or floating point number
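
To make the three parts easier to follow, here is a minimal sketch of the same pattern written with re.VERBOSE. The names NUMBER and extract_numbers are hypothetical helpers introduced only for this illustration, and the second call shows a caveat that follows from the lookahead: a genuine 0 value that is followed by a + is skipped too.

```python
import re

# Same pattern as in the answer, laid out with re.VERBOSE so each part
# can carry its own comment. NUMBER is a hypothetical name.
NUMBER = re.compile(r"""
    -?              # optional minus sign
    (?!0+[ +])      # reject a run of zeros followed by a space or a plus
    \d*\.?\d+       # integer or decimal, e.g. 9, 4.06, .25
""", re.VERBOSE)


def extract_numbers(items):
    """Return the number strings found in all items (hypothetical helper)."""
    # Joining with a space lets the lookahead see where each item ends.
    return NUMBER.findall(" ".join(items))


data_list = ['0+.25+4.06+5.12', '0+0-.033+933.00+9+48.002']
print(extract_numbers(data_list))
# ['.25', '4.06', '5.12', '0', '-.033', '933.00', '9', '48.002']

# Caveat: a real zero value followed by '+' is dropped along with the prefix.
print(extract_numbers(['0+0+5']))
# ['5']
```

If zeros in the middle of an item matter for your data, stripping the leading 0+ separately before matching may be the safer route.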

Upvotes: 2

lemon

Reputation: 15492

You could try re.findall with this pattern:

# Join with a space so the end of one item cannot merge with the leading 0 of the next.
re.findall(r'[+-]\d*\.?\d+', ' '.join(data_list))

Regex explanation:

  • [+-]: a leading + or - sign
  • \d*: optional integer digits
  • \.?: optional decimal point
  • \d+: one or more digits
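
A minimal sketch of what this pattern returns on the sample data, assuming the items are joined with a space so that neighbouring items cannot run together. Because the pattern requires a sign, every match keeps its + or -; the second step, which is an addition for illustration and not part of the answer, strips the + to reach the output asked for in the question. The names signed and cleaned are hypothetical.

```python
import re

data_list = ['0+.25+4.06+5.12', '0+0-.033+933.00+9+48.002']

# Every match starts with its sign; the unsigned leading 0 of each item
# is never matched because nothing precedes it.
signed = re.findall(r'[+-]\d*\.?\d+', ' '.join(data_list))
print(signed)
# ['+.25', '+4.06', '+5.12', '+0', '-.033', '+933.00', '+9', '+48.002']

# Drop the explicit '+' so that only negative values carry a sign.
cleaned = [s.lstrip('+') for s in signed]
print(cleaned)
# ['.25', '4.06', '5.12', '0', '-.033', '933.00', '9', '48.002']
```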

Upvotes: 0

Wiktor Stribiżew

Reputation: 627100

You can use

import re
 
data_list = ['0+.25+4.06+5.12', '0+0-.033+933.00+9+48.002']
new_data_list = []
for item in data_list:
    new_data_list.extend(re.split(r'\+|(?=-)', item[2:]))
 
print(new_data_list)
# => ['.25', '4.06', '5.12', '0', '-.033', '933.00', '9', '48.002']

Note:

  • item[2:] - drops the first two characters (the leading 0+). To remove that prefix only when it is actually present, replace item[2:] with re.sub(r'^0\+', '', item)
  • \+|(?=-) matches a + or a location that is immediately followed by a - char.
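
The re.sub variant from the first note can be sketched as follows. The function name split_signed is hypothetical and used only for this illustration. Note that splitting on a zero-width match such as (?=-) relies on Python 3.7 or newer, where re.split accepts patterns that can match an empty string.

```python
import re


def split_signed(item):
    """Split one item on '+', keeping '-' attached to the number after it.

    Hypothetical helper, not part of the original answer.
    """
    # Remove the leading '0+' only when it is really there.
    body = re.sub(r'^0\+', '', item)
    # '\+' consumes a plus sign; '(?=-)' is a zero-width split point
    # just before a minus sign, so the minus stays with its number.
    return re.split(r'\+|(?=-)', body)


data_list = ['0+.25+4.06+5.12', '0+0-.033+933.00+9+48.002']
new_data_list = []
for item in data_list:
    new_data_list.extend(split_signed(item))

print(new_data_list)
# ['.25', '4.06', '5.12', '0', '-.033', '933.00', '9', '48.002']
```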

Upvotes: 2
