Reputation: 23
I would like to match words and numbers and drop all special characters unless it's a period between numbers.
Specifically, I want the effect of \W+, except that instead of splitting 49.99 into 49 and 99, I want to keep it as 49.99.
For example I want
"millie's math house 3-7 (win/mac) now 49.99 only."
to be split into
['millie', 'math', 'house', '3', '7', 'win', 'mac', 'now', '49.99', 'only']
But right now, using just \W+, I get
['millie', 'math', 'house', '3', '7', 'win', 'mac', 'now', '49', '99', 'only']
How can I keep words that have periods in the middle, but get rid of special characters otherwise?
Thanks!
Upvotes: 2
Views: 2763
Reputation: 2502
Try the following
[^\w.]+
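Instead of matching all non-word characters, it matches every run of characters that are neither word characters nor periods. Note that a period at the end of a sentence is kept as well, so a token such as "only." still needs its trailing period stripped.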
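A minimal sketch of how this pattern behaves, assuming Python's `re` module (the question does not name a language, but its list output looks like Python). The variable names are only illustrative. Stripping leading and trailing periods from each token is an extra step, not part of the pattern itself:

```python
import re

text = "millie's math house 3-7 (win/mac) now 49.99 only."

# Split on runs of characters that are neither word characters nor periods.
raw_tokens = re.split(r"[^\w.]+", text)
# The sentence-final period survives the split: raw_tokens[-1] == 'only.'

# Extra step (an assumption, not part of the regex): trim periods at the
# edges of each token and drop any token that ends up empty.
tokens = [t.strip(".") for t in raw_tokens if t.strip(".")]

print(tokens)
# ['millie', 's', 'math', 'house', '3', '7', 'win', 'mac', 'now', '49.99', 'only']
```

The apostrophe still splits "millie's" into 'millie' and 's', just as plain \W+ does.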
Upvotes: 2
Reputation: 1867
If you know for a fact there will be a decimal every time in the middle, then this is viable:
(\d+\.\d+)
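A short sketch of why the period should be escaped, again assuming Python's `re` module. A bare `.` matches any character, so it also picks up "3-7". The last pattern is a hypothetical extension that is not part of this answer: it matches tokens directly instead of splitting, trying a decimal number first and falling back to a plain word.

```python
import re

text = "millie's math house 3-7 (win/mac) now 49.99 only."

# Unescaped dot: matches any character between the digit runs.
loose = re.findall(r"\d+.\d+", text)
print(loose)   # ['3-7', '49.99']

# Escaped dot: only a literal period between digits is accepted.
strict = re.findall(r"\d+\.\d+", text)
print(strict)  # ['49.99']

# Hypothetical extension: tokenize the whole string, keeping decimals intact.
tokens = re.findall(r"\d+(?:\.\d+)?|\w+", text)
print(tokens)
# ['millie', 's', 'math', 'house', '3', '7', 'win', 'mac', 'now', '49.99', 'only']
```

On its own, the decimal pattern only extracts the numbers. The combined pattern is one way to get words and decimals in a single pass.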
You can check the regular expression here:
Upvotes: 1