Harry Hartley

Reputation: 85

Python RegEx help: Splitting string into numbers, characters and whitespace

I am trying to split my string into a list in which each whitespace character and each other character becomes its own element, while runs of digits stay together.
For example, the string:

"1 2 +="  

would end up as:

["1", " ", "2", " ", "+", "="]

The code I currently have is

temp = re.findall(r'\d+|\S', input)

This separates the string as intended, but it also drops the whitespace. How do I keep it?

Upvotes: 0

Views: 163

Answers (2)

OGHaza

Reputation: 4795

You can use \D to find anything that is not a digit:

\d+|\D

Python:

temp = re.findall(r'\d+|\D', input)
# Output: ['1', ' ', '2', ' ', '+', '=']

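It would also work if you just used . instead, since the \d+ alternative is tried first and still claims the digits. But it's probably cleaner not to, and note that . does not match a newline unless re.DOTALL is set.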

\d+|.
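To make the difference between the two patterns concrete, here is a small sketch. The tokenize helper and the sample string are made up for illustration and are not from the question; the only difference it shows is how a newline is handled.

```python
import re

def tokenize(text, pattern=r'\d+|\D'):
    """Return runs of digits as single items and every other character on its own."""
    return re.findall(pattern, text)

# Hypothetical input containing a newline, to show where the patterns differ.
sample = "12 3\n+="

# \D matches any non-digit, including the newline, so nothing is lost.
with_nondigit = tokenize(sample)
print(with_nondigit)   # ['12', ' ', '3', '\n', '+', '=']

# Without re.DOTALL, . does not match '\n', so findall silently skips it.
with_dot = tokenize(sample, r'\d+|.')
print(with_dot)        # ['12', ' ', '3', '+', '=']
```

If the input never contains newlines, both patterns give the same list.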

Upvotes: 1

Andrew Clark

Reputation: 208475

Just add \s or \s+ to your current regular expression (use \s+ if you want consecutive whitespace characters to be grouped together). For example:

>>> s = "1 2 +="
>>> re.findall(r'\d+|\S|\s+', s)
['1', ' ', '2', ' ', '+', '=']

If you don't want consecutive whitespace to be grouped together, then instead of r'\d+|\S|\s' it would probably make more sense to use r'\d+|\D'.
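As a quick illustration of the grouping choice, here is a sketch on a made-up string with two consecutive spaces (the string and variable names are only for demonstration):

```python
import re

# Hypothetical input with a double space between the numbers.
s = "10  20 +="

# \s+ keeps each run of whitespace together as one item.
grouped = re.findall(r'\d+|\S|\s+', s)
print(grouped)     # ['10', '  ', '20', ' ', '+', '=']

# \D returns every non-digit character, whitespace included, one at a time.
ungrouped = re.findall(r'\d+|\D', s)
print(ungrouped)   # ['10', ' ', ' ', '20', ' ', '+', '=']

# Either way, joining the pieces gives back the original string.
print("".join(grouped) == s, "".join(ungrouped) == s)   # True True
```

Checking that the joined pieces equal the input is a handy way to confirm that the pattern did not drop any characters.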

Upvotes: 3
