krzyhub

Reputation: 6540

Getting minutes from string via regex

I have a task to complete. It comes with a test file that contains this code:

import unittest
from Task302 import extract_minutes

class Task302Test(unittest.TestCase):
    """Tests for task 302"""

    def test_simple(self):
        """Simple test."""
        self.assertEqual(extract_minutes("9:13"), "13")
        self.assertEqual(extract_minutes("18:44"), "44")
        self.assertEqual(extract_minutes("23:59"), "59")
        self.assertEqual(extract_minutes("0:00"), "00")
        self.assertEqual(extract_minutes("25:14"), "<NONE>")
        self.assertEqual(extract_minutes("9:61"), "<NONE>")
        self.assertEqual(extract_minutes("x9:13y"), "<NONE>")

I have written this code:

import re

def extract_minutes(string):
    pattern = '[0-1]*[0-9]+|2[0-3]:([0-5][0-9])'
    r = re.compile(pattern)
    m = r.search(string)
    if m:
        return m.group(1)
    else:
        return "<NONE>"

Please explain what is wrong with my code and how to fix it.

Upvotes: 0

Views: 101

Answers (1)

Avinash Raj

Reputation: 174696

The | operator should apply only to the hours. Because alternation has the lowest precedence, your regex [0-1]*[0-9]+|2[0-3]:([0-5][0-9]) is split into two separate alternatives: [0-1]*[0-9]+ (the hours from 0 to 19, with no colon or minutes) and 2[0-3]:([0-5][0-9]) (the hours from 20 to 23 plus the minutes). Wrap the hour alternatives in a non-capturing group, (?:...), so that the colon and the minutes follow either of them. I also suggest using ? instead of *, because * matches the previous token zero or more times, whereas ? (when it is not acting as a non-greedy modifier) matches it zero or one time. You must also remove the + after the character class [0-9], because + matches the previous token one or more times, which would allow any number of hour digits.
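To make the precedence problem concrete, here is a small sketch that runs the question's original pattern against one of the test inputs. The variable names are only for illustration. It shows that the left alternative wins, so the capturing group for the minutes never takes part in the match.

```python
import re

# The question's original pattern: '|' splits the WHOLE pattern in two.
original = re.compile(r'[0-1]*[0-9]+|2[0-3]:([0-5][0-9])')

m = original.search("18:44")

# The left alternative matches the hour digits on its own,
# so group 1 (the minutes) did not participate in the match.
print(m.group(0))  # 18
print(m.group(1))  # None
```

This is why the original function returns None instead of the minutes: a match is found, but m.group(1) is empty.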

pattern = r'\b(?:[0-1]?[0-9]|2[0-3]):([0-5][0-9])\b'

\b is a word boundary: it matches at a position between a word character and a non-word character (or at the start or end of the string next to a word character). Without the word boundaries, the pattern would also match inside the string x9:13y.
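Putting it together, a minimal sketch of the whole function with the pattern above might look like this. The module-level name MINUTES_RE is my own choice, not something from the task.

```python
import re

# Pattern from this answer, compiled once. MINUTES_RE is a hypothetical name.
MINUTES_RE = re.compile(r'\b(?:[0-1]?[0-9]|2[0-3]):([0-5][0-9])\b')

def extract_minutes(string):
    """Return the minutes of an H:MM or HH:MM time, or "<NONE>" if none is found."""
    m = MINUTES_RE.search(string)
    # group(1) is the two-digit minutes part captured after the colon.
    return m.group(1) if m else "<NONE>"

print(extract_minutes("9:13"))    # 13
print(extract_minutes("25:14"))   # <NONE>
print(extract_minutes("x9:13y"))  # <NONE>
```

Note that search() still finds a time embedded in a longer string, as long as it is delimited by non-word characters, so "at 9:13 today" also yields "13".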


Upvotes: 2
