Reputation: 2697
I have text like the following:
This is a first question and can go to multiple paragraphs.
Multiple lines. etc.
(1)First Option (2) Second Option (3) Third option (4) Fourth Option (5) None of these
8 × ? = 4888 ÷ 4
(1) 150.75 (2) 125.75 (3) 125.05 (4) 152.75 (5) None of these
(62.5 × 14 × 5) ÷ 25 + 41 =
(1) 4 (2) 5 (3) 9 (4) 8 (5) 6
(23 × 23 × 23 × 23 × 23 × 23)×
(1) 32 (2) 30 (3) 9 (4) 7 (5) 11
I would like to parse this into parts so that I can loop over the questions and, for each question, loop over its answers. The rule is that every question starts with an integer at the start of a line (^) followed by a dot. Each answer is prefixed by an integer from 1 to 5 in parentheses, e.g. (1).
I would like to be able to use the parsed data like this, for example:
for item in parsed_data:
    print(item.text)
    for answer in item.answers:
        print(answer.text)
How can I do this using Python regular expressions?
Upvotes: 0
Views: 82
Reputation: 17829
Honestly, you can just use re.split() for this:
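import re

# text is the variable holding your text
text = text.strip()
# split at each question number: digits plus a dot at the start of a line
# (anchoring with ^ keeps decimals such as 150.75 from being split)
questions = re.split(r'(?m)^\d+\.', text)
questions = [x.strip() for x in questions if x.strip()]
# split each question at its (1)...(5) answer markers
final = [re.split(r'\(\d+\)', x) for x in questions]
for part in final:
    question = part[0].strip()
    print(question)
    for answer in part[1:]:
        print(answer.strip())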
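If you want the item.text / item.answers shape from the question, here is a minimal sketch of one way to wrap the same split idea. The names Question, Answer and parse_questions are made up for illustration, and the "1." / "2." prefixes in the sample are my assumption based on the rule you described, since the pasted text doesn't show them. It also assumes a marker like (3) never appears inside the question text itself.

```python
import re
from collections import namedtuple

# Hypothetical containers, named only for this illustration.
Question = namedtuple("Question", ["text", "answers"])
Answer = namedtuple("Answer", ["number", "text"])

# A question starts with digits and a dot at the beginning of a line.
QUESTION_START = re.compile(r"^\d+\.\s*", re.MULTILINE)
# An answer marker is an integer in parentheses, e.g. (3).
OPTION = re.compile(r"\((\d+)\)\s*")


def parse_questions(text):
    parsed = []
    for chunk in QUESTION_START.split(text.strip()):
        chunk = chunk.strip()
        if not chunk:
            continue  # skip the empty piece before the first question
        # With one capturing group, split() returns
        # [question_text, num, answer, num, answer, ...]
        parts = OPTION.split(chunk)
        answers = [Answer(int(num), body.strip())
                   for num, body in zip(parts[1::2], parts[2::2])]
        parsed.append(Question(parts[0].strip(), answers))
    return parsed


# Assumed input format: the question numbers are added here by hand.
sample = """1. 8 × ? = 4888 ÷ 4
(1) 150.75 (2) 125.75 (3) 125.05 (4) 152.75 (5) None of these
2. (62.5 × 14 × 5) ÷ 25 + 41 =
(1) 4 (2) 5 (3) 9 (4) 8 (5) 6"""

for item in parse_questions(sample):
    print(item.text)
    for answer in item.answers:
        print(answer.number, answer.text)
```

Because the option pattern only matches parentheses that contain nothing but digits, something like (62.5 × 14 × 5) in the question text is left alone.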
Upvotes: 1