Reputation: 827
As an assignment I was faced with the question:
Given an input string similar to the below, craft a regular expression pattern to match and extract the date, time, and temperature in groups and return this pattern. Samples given below.
Date: 12/31/1999 Time: 11:59 p.m. Temperature: 44 F
Date: 01/01/2000 Time: 12:01 a.m. Temperature: 5.2 C
So I opened regex101 and created this pattern which tests correctly:
def q6(strng):
    import re
    pattern = '((?<=Date: )\d{1,2}\/\d{1,2}\/\d{4})|((?<=Time: )?\d{1,2}:\d{1,2} ?[pPaA].?[mM].?)|((?<=Temperature: )\d{1,3}.?\d{1,3} ?[CF])'
    print(re.findall(pattern, strng))
    return pattern

q6("Date: 12/31/1999 Time: 11:59 p.m. Temperature: 44 F")
q6("Date: 01/01/2000 Time: 12:01 a.m. Temperature: 5.2 C")
but in Python the pattern seems to give a flawed answer:
[('12/31/1999', '', ''), ('', '11:59 p.m.', ''), ('', '', '44 F')]
[('01/01/2000', '', ''), ('', '12:01 a.m.', ''), ('', '', '5.2 C')]
You can see the extra empty items in the returned tuples. This question will be graded by a program, and note that it asks for the pattern to be returned, not the result, so no trimming is possible.
Am I just using the wrong regex match function or what have I done wrong?
Upvotes: 1
Views: 870
Reputation: 9377
TL;DR: You already named a solution yourself. Quoting from the last sentence of your question: "match function" 😉️
From the docs for re.findall(pattern, string, flags=0):
Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result.
Changed in version 3.7: Non-empty matches can now start just after a previous empty match.
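The portion that fits your case is the sentence about groups: if the pattern has more than one group, the result is a list of tuples with one slot per group.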
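To see the documented behaviour in isolation, here is a minimal sketch. The shortened sample text and the two simplified patterns are my own, made up for illustration; they are not the pattern from the question.

```python
import re

# Shortened sample line (illustrative only).
text = "Date: 12/31/1999 Time: 11:59 p.m."

# No capturing groups: findall returns a list of whole-match strings.
no_groups = re.findall(r"\d{1,2}/\d{1,2}/\d{4}|\d{1,2}:\d{2}", text)
print(no_groups)   # ['12/31/1999', '11:59']

# Two alternated capturing groups: every match yields one tuple with a slot
# per group, and the group that did not take part is an empty string.
two_groups = re.findall(r"(\d{1,2}/\d{1,2}/\d{4})|(\d{1,2}:\d{2})", text)
print(two_groups)  # [('12/31/1999', ''), ('', '11:59')]
```

This is the same effect as in the question: with three alternated groups, each match fills exactly one of the three slots.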
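Simply split your task (text) by "and" to get these 3 requirements:
(1) craft a regular expression pattern to match
(2) extract the date, time, and temperature in groups
(3) return this pattern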
Broken down into sub-tasks, the task or problem becomes easily solvable. As a bonus, these steps will guide you to the solution.
This problem-solving strategy is known as divide and conquer.
Now try to solve step by step, starting with (1), then (2), finally (3).
First, (a) write a regex string (r''), (b) compile it to a pattern, and (c) match it against the input. Then the groups (all 3 parts put inside parentheses) can be extracted (all at once, but only if the given string matches the pattern).
Sorry that I haven't presented you the perfect solution. But you are very close. As far as I can see, the clues given will get you there.
I gave you a step-wise recipe plus keywords which you can use to search on Stack Overflow:
[python] regex extract groups
They are all in your given task:
Given an input string similar to the below, craft a regular expression pattern to match and extract the date, time, and temperature in groups and return this pattern. Samples given below.
Analyzing the problem, identifying keywords, and clarifying broad or vague specifications, so that you are able to research and collect ingredients, form 80% of designing software, whereas cooking and coding fill up the remaining 20%.
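For readers who want to see the recipe (regex string, compile, match, groups) end to end, here is one possible sketch. It is not the graded reference answer. The pattern and the helper name extract are my own assumptions, and it presumes the three fields always appear in this order, separated by single spaces.

```python
import re

# Hypothetical pattern: one capturing group each for date, time, temperature.
PATTERN = (
    r"Date: (\d{1,2}/\d{1,2}/\d{4}) "
    r"Time: (\d{1,2}:\d{2} ?[aApP]\.?[mM]\.?) "
    r"Temperature: (\d{1,3}(?:\.\d+)? ?[CF])"
)

def extract(line):
    """Return (date, time, temperature), or None if the line does not match."""
    compiled = re.compile(PATTERN)   # (b) compile the regex string
    m = compiled.match(line)         # (c) match from the start of the line
    return m.groups() if m else None # all groups at once, only on a match

print(extract("Date: 12/31/1999 Time: 11:59 p.m. Temperature: 44 F"))
# ('12/31/1999', '11:59 p.m.', '44 F')
print(extract("Date: 01/01/2000 Time: 12:01 a.m. Temperature: 5.2 C"))
# ('01/01/2000', '12:01 a.m.', '5.2 C')
```

Because the three groups sit in one sequence rather than in three alternatives, re.findall with this pattern would also return a single tuple per line with no empty slots.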
Upvotes: 1
Reputation: 158
You should add ?: to the parentheses which you don't want to capture:
(?:.....)|(?:....)|(?:...)
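A minimal sketch of what that looks like in practice. The pattern is adapted from the question, not copied: I escaped the dots and dropped the ? after the Time lookbehind, so treat the details as my assumptions. With no capturing groups left, re.findall returns plain strings instead of tuples.

```python
import re

# The question's three alternatives, each wrapped in a non-capturing group.
pattern = (
    r"(?:(?<=Date: )\d{1,2}/\d{1,2}/\d{4})"
    r"|(?:(?<=Time: )\d{1,2}:\d{1,2} ?[pPaA]\.?[mM]\.?)"
    r"|(?:(?<=Temperature: )\d{1,3}(?:\.\d+)? ?[CF])"
)

first = re.findall(pattern, "Date: 12/31/1999 Time: 11:59 p.m. Temperature: 44 F")
second = re.findall(pattern, "Date: 01/01/2000 Time: 12:01 a.m. Temperature: 5.2 C")
print(first)   # ['12/31/1999', '11:59 p.m.', '44 F']
print(second)  # ['01/01/2000', '12:01 a.m.', '5.2 C']
```

Note that this removes the empty items, but it also means the pattern no longer has any groups, which may or may not satisfy an assignment that asks for the values "in groups".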
Upvotes: 2