anarchy

Reputation: 5184

Filtering a list of strings using regex

I have a list of strings that looks like this:

strlist = [
    'list/category/22',
    'list/category/22561',
    'list/category/3361b',
    'list/category/22?=1512',
    'list/category/216?=591jf1!',
    'list/other/1671',
    'list/1y9jj9/1yj32y',
    'list/category/91121/91251',
    'list/category/0027',
]

I want to use a regex to find the strings in this list that consist of list/category/ followed by an integer of any length and nothing else: no letters or symbols may follow the integer.

So in my example, the output should look like this:

list/category/22
list/category/22561
list/category/0027

I used the following code:

import re

newlist = []
for i in strlist:
    if re.match('list/category/[0-9]+[0-9]', i):
        newlist.append(i)
        print(i)

but this is my output:

list/category/22
list/category/22561
list/category/3361b
list/category/22?=1512
list/category/216?=591jf1!
list/category/91121/91251
list/category/0027

How do I fix my regex? Also, is there a way to do this in one line using a filter or match call instead of a for loop?

Upvotes: 1

Views: 1922

Answers (1)

user7571182

You can try the below regex:

^list\/category\/\d+$

Explanation of the above regex:

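^ - Matches the start of the test string.

list\/category\/ - Matches the literal text list/category/. Escaping the slashes is optional in Python; a plain / works as well.

\d+ - Matches one or more digits.

$ - Matches the end of the test string. This is the part your regex missed: re.match only anchors at the start of the string, so without $ any characters may follow the digits.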
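
To make the effect of the end anchor concrete, here is a minimal sketch that runs the pattern from the question and the anchored pattern over three of the sample strings. The variable names are illustrative and not part of the original code.

```python
import re

# Three of the sample strings from the question
samples = [
    "list/category/22",
    "list/category/3361b",
    "list/category/91121/91251",
]

unanchored = re.compile(r"list/category/[0-9]+[0-9]")  # pattern from the question
anchored = re.compile(r"^list/category/\d+$")          # pattern with an end anchor

# re.match anchors only at the start, so trailing characters are ignored
unanchored_hits = [s for s in samples if unanchored.match(s)]

# With $ the digits must run up to the end of the string
anchored_hits = [s for s in samples if anchored.match(s)]

print(unanchored_hits)  # all three strings
print(anchored_hits)    # ['list/category/22']
```

Note that [0-9]+[0-9] also requires at least two digits, so the original pattern would reject a single-digit id such as list/category/7, whereas \d+ accepts it.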

A demo of the above regex is available here.

IMPLEMENTATION IN PYTHON

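import re

# re.MULTILINE makes ^ and $ match at the start and end of each line,
# so every newline-separated entry is tested separately.
pattern = re.compile(r"^list\/category\/\d+$", re.MULTILINE)
matches = pattern.findall("list/category/22\n"
                          "list/category/22561\n"
                          "list/category/3361b\n"
                          "list/category/22?=1512\n"
                          "list/category/216?=591jf1!\n"
                          "list/other/1671\n"
                          "list/1y9jj9/1yj32y\n"
                          "list/category/91121/91251\n"
                          "list/category/0027")
print(matches)  # ['list/category/22', 'list/category/22561', 'list/category/0027']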

You can find the sample run of the above implementation here.
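
The question also asks for a one-line alternative to the for loop. The following is a minimal sketch, assuming the strings stay in the original strlist rather than being joined into one string. It uses re.fullmatch (available since Python 3.4), which requires the whole string to match, so neither ^ nor $ is needed. It also avoids a subtlety of $, which in Python also matches just before a trailing newline.

```python
import re

strlist = [
    'list/category/22',
    'list/category/22561',
    'list/category/3361b',
    'list/category/22?=1512',
    'list/category/216?=591jf1!',
    'list/other/1671',
    'list/1y9jj9/1yj32y',
    'list/category/91121/91251',
    'list/category/0027',
]

pattern = re.compile(r"list/category/\d+")

# One line with a list comprehension
newlist = [s for s in strlist if pattern.fullmatch(s)]

# The same thing with filter(); fullmatch returns None for non-matches
filtered = list(filter(pattern.fullmatch, strlist))

print(newlist)  # ['list/category/22', 'list/category/22561', 'list/category/0027']
```

Both forms keep the original order of strlist. The list comprehension is usually considered the more idiomatic of the two in Python.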

Upvotes: 3
