Reputation: 81
a='a/20191101/6ca9ae66ebfe2c3040eb3df6148a5a4d.pdf'
b='a/20170310/402006fceb4cbad4c3bf6a89002249dd.pdf'
c='a/20161125/237ebceb094d5f92bbce65e2526339e0.pdf'
d='a/20180629/b990c7c01736f6cee10169140bb304b7.pdf'
The strings above are parts of PDF URLs. I want to match them with a regex, but so far I can only match the first part:
r'[a-z][/]\d+[/]'
I don't know how to write the last part, which is a random mix of digits and lowercase letters. Please help me!
Upvotes: 1
Views: 857
Reputation: 1179
Just use a character class that covers uppercase letters, lowercase letters, and digits. Or, if you want to capture anything before .pdf, you can use .*[.]pdf. Use regex101.com to test your regex. See these examples:
[a-z]/\d+/[A-Za-z0-9]*[.]pdf
[a-z]/\d+/.*[.]pdf
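To see how the two patterns differ, here is a minimal Python sketch (the question's `r'...'` strings suggest Python, but that is an assumption). The string `odd` is a made-up input, not from the question, chosen to show that the greedy `.*` version accepts more than the character-class version does.

```python
import re

# The two patterns from this answer.
strict = re.compile(r'[a-z]/\d+/[A-Za-z0-9]*[.]pdf')
loose = re.compile(r'[a-z]/\d+/.*[.]pdf')

# One of the strings from the question.
a = 'a/20191101/6ca9ae66ebfe2c3040eb3df6148a5a4d.pdf'

# Hypothetical input with extra slashes and a second ".pdf".
odd = 'a/20191101/x/y.pdf?next=z.pdf'

print(bool(strict.fullmatch(a)))  # both patterns accept the real string
print(bool(loose.fullmatch(a)))

print(strict.search(odd))         # None: "/" is not in the character class
print(loose.search(odd).group())  # greedy .* runs to the last ".pdf"
```

If the inputs are always shaped like the four strings in the question, either pattern should do; the character-class version is simply harder to fool.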
Upvotes: 0
Reputation: 819
The last portion is lowercase hexadecimal, so we just need a character class containing the digits and the letters a-f, followed by .pdf at the end (remember to escape the period, as it's a special character).
[a-z][/]\d+[/][\da-f]+\.pdf
I don't fully understand your use of brackets in cases where they're unnecessary: / is not a special character in a regex, so [/] matches exactly the same thing as /. This version removes them and also works.
[a-z]/\d+/[\da-f]+\.pdf
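As a quick check, here is a small Python sketch that runs this pattern against the four strings from the question. The helper name `is_pdf_path` and the `example.com` URL are made up for illustration; only the pattern and the four paths come from the thread.

```python
import re

# Pattern from this answer: letter, slash, digits, slash, lowercase hex, ".pdf".
PDF_PATH = re.compile(r'[a-z]/\d+/[\da-f]+\.pdf')

paths = [
    'a/20191101/6ca9ae66ebfe2c3040eb3df6148a5a4d.pdf',
    'a/20170310/402006fceb4cbad4c3bf6a89002249dd.pdf',
    'a/20161125/237ebceb094d5f92bbce65e2526339e0.pdf',
    'a/20180629/b990c7c01736f6cee10169140bb304b7.pdf',
]

def is_pdf_path(s):
    """Return True if the whole string has the expected shape (hypothetical helper)."""
    return PDF_PATH.fullmatch(s) is not None

print(all(is_pdf_path(p) for p in paths))

# To pull the path out of a longer URL, use search() instead of fullmatch().
# The host below is a placeholder, not taken from the question.
url = 'https://example.com/files/' + paths[0]
found = PDF_PATH.search(url)
print(found.group() if found else None)
```

`fullmatch` is used for validation so that trailing junk after `.pdf` is rejected; `search` is the better fit when the path is embedded in a larger string.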
Upvotes: 1