Reputation: 1432
I'm reading in some data in the form of a space-separated list of items. Each item has a name, which may be one or more words, and a quantity, which is either a single integer or a fraction of two integers.
Example:
'12 Spruce Log 4/5 Water 3 Orange 3/18 Oak Plank'
I want this split into the following list:
['12 Spruce Log', '4/5 Water', '3 Orange', '3/18 Oak Plank']
Here is my Python regex:
import re
re.findall(r'\d+(/\d+)?\D+', "12 Spruce Log 4/5 Water 3 Orange 3/18 Oak Plank")
This produces the following result, which is obviously not right:
['', '/5', '', '/18']
What is the proper regex here?
Upvotes: 1
Views: 197
Reputation: 3865
Here is what I came up with (PCRE-style syntax, not Python's re):
/(?:\d+\/\d+|\d+)\s(?:[[:word:]]+\s*){1,2}(?=\d|$)/g
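Python's re module does not support POSIX classes such as [[:word:]], so the pattern above cannot be pasted into re.findall unchanged. A rough port might look like the sketch below. It assumes names consist of ASCII letters only, so [A-Za-z] stands in for the POSIX class, and split_items_pcre_port is just an illustrative name.

```python
import re

# Port of the pattern above. [A-Za-z] replaces [[:word:]] (an assumption:
# item names contain ASCII letters only).
ITEM_RE = re.compile(r'(?:\d+/\d+|\d+)\s(?:[A-Za-z]+\s*){1,2}(?=\d|$)')

def split_items_pcre_port(text):
    # The name part picks up the space before the next quantity, so strip it.
    return [m.strip() for m in ITEM_RE.findall(text)]

items = split_items_pcre_port('12 Spruce Log 4/5 Water 3 Orange 3/18 Oak Plank')
print(items)
```

Note that {1,2} caps a name at two words, so an item with a three-word name would not be matched; raising the upper bound (or using + instead) would be needed for longer names.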
Upvotes: 0
Reputation: 71451
You can try this:
import re
s = '12 Spruce Log 4/5 Water 3 Orange 3/18 Oak Plank'
# Split on whitespace that follows a letter and precedes a digit.
new_s = re.split(r'(?<=[a-zA-Z])\s(?=\d)', s)
Output:
['12 Spruce Log', '4/5 Water', '3 Orange', '3/18 Oak Plank']
Or, using only re.findall:
# Each match ends just before the next quantity (or at the end of the string); rstrip drops the trailing space.
new_list = [i.rstrip() for i in re.findall(r'[\d/]+\s[a-zA-Z\s]+(?=\d|$)', s)]
Output:
['12 Spruce Log', '4/5 Water', '3 Orange', '3/18 Oak Plank']
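As for why the original attempt returned only fragments: when a pattern contains a capturing group, re.findall returns the group's contents instead of the whole match. A minimal sketch of that behaviour, and of the fix of making the group non-capturing with (?:...), follows. The variable names captured and whole are only illustrative.

```python
import re

s = '12 Spruce Log 4/5 Water 3 Orange 3/18 Oak Plank'

# With a capturing group, findall returns only what the group matched
# ('' where the optional group did not take part in the match).
captured = re.findall(r'\d+(/\d+)?\D+', s)

# With a non-capturing group, findall returns each whole match.
# strip() removes the space that \D+ picks up before the next quantity.
whole = [m.strip() for m in re.findall(r'\d+(?:/\d+)?\D+', s)]

print(captured)
print(whole)
```

Because the name is matched with \D+, this version assumes item names never contain digits.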
Upvotes: 4