Reputation: 107
I have the following code:
regularexpression = r'([-\w]*\w)? ?: ?([-"\#\w\s_]*\w?);'
outputfr = re.findall(regularexpression, inputdata, re.IGNORECASE)
return data
It's supposed to catch words, hyphens and other characters, ending in ";". So:
(hello-nine: hello, six, seven; hello-five: six eight)
would output as [('hello-nine', 'hello, six, seven'), ('hello-five', 'six eight')]
If final-number: "seventy", "sixty", "fifty", forty
is part of the user input (inputdata), regularexpression doesn't catch it. I'd want it to output as [('final-number', '"seventy", "sixty", "fifty", "forty")]
Why is this?
Upvotes: 0
Views: 75
Reputation: 12316
Your example inputs -> outputs are not consistent. In the first case, the comma-separated items are kept together but in the second they are separate list elements. Also, do you want to strip parentheses? quote marks? Clarify by giving actual values for inputdata and showing what exactly you want to return (including stripping quote marks, parentheses). The data variable is never assigned.
Using .split(";") might be a better starting point...
inputdata = "(hello-nine: hello, six, seven; hello-five: six eight)"
mylist = inputdata.split(";")
# here either use regexp or another split, depending on what you want...
subset = [x.split(":") for x in mylist]
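If the goal is a list of (key, value) tuples, the split-based idea can be taken one step further. The following is only a rough sketch: the helper name parse_pairs is made up for illustration, and it assumes the surrounding parentheses should be dropped and that whitespace around keys and values is not wanted.

```python
def parse_pairs(text):
    """Split 'key: value; key: value' text into (key, value) tuples."""
    pairs = []
    # Drop surrounding whitespace and parentheses, then cut at each ";".
    for chunk in text.strip().strip("()").split(";"):
        if ":" not in chunk:
            continue  # skip empty or malformed chunks
        # Split on the first ":" only, so values may contain colons.
        key, value = chunk.split(":", 1)
        pairs.append((key.strip(), value.strip()))
    return pairs


inputdata = "(hello-nine: hello, six, seven; hello-five: six eight)"
print(parse_pairs(inputdata))
# [('hello-nine', 'hello, six, seven'), ('hello-five', 'six eight')]
```

Unlike the regular expression in the question, this does not need a trailing ";" after the last entry, and commas or quote marks in the value need no special handling.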
Upvotes: 0
Reputation: 2826
In your regular expression, the second group:
([-"\#\w\s_]*\w?)
needs to be changed so that it will match commas:
([-"\#\w\s_,]*\w?)
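A quick way to see the difference is to run both patterns on the same text. This is a small sketch, not the asker's full program: the variable names and the sample string are made up for the comparison, and every entry in the sample ends in ";" because the pattern still requires one.

```python
import re

# Pattern from the question, and the same pattern with "," added to the
# character class of the second group.
original = r'([-\w]*\w)? ?: ?([-"\#\w\s_]*\w?);'
fixed = r'([-\w]*\w)? ?: ?([-"\#\w\s_,]*\w?);'

# Assumed sample input; each entry is terminated by ";".
sample = 'hello-nine: hello, six, seven; final-number: "seventy", "sixty", "fifty", forty;'

print(re.findall(original, sample, re.IGNORECASE))
# []
print(re.findall(fixed, sample, re.IGNORECASE))
# [('hello-nine', 'hello, six, seven'), ('final-number', '"seventy", "sixty", "fifty", forty')]
```

The original pattern finds nothing here because a comma sits between each ":" and the next ";", and a comma is not in the second group's character class. Note that an entry without a closing ";" will not match with either pattern.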
Upvotes: 3