Python - Regex - Match anything except

Question

I'm trying to get my regular expression to work but can't figure out what I'm doing wrong. I am trying to find any file that is NOT in a specific format. All of the files are supposed to be named as dates in the format MM-DD-YY.pdf (e.g. 05-13-17.pdf), and I want to find any files that are not named that way.

I can write a regex that finds the correctly named files:

(\d\d-\d\d-\d\d\.pdf)

I tried using a negative lookahead, so it looked like this:

(?!\d\d-\d\d-\d\d\.pdf)

That stops matching the date-named files, but it doesn't return the files that don't fit the format either.

I also tried adding .* after the group, but then it matches the whole list:

(?!\d\d-\d\d-\d\d\.pdf).*

I'm searching through a small list right now for testing:

05-17-17.pdf Test.pdf 05-48-2017.pdf 03-14-17.pdf

Is there a way to accomplish what I'm looking for?

Thanks!

Ajax1234 · Accepted Answer

You can try this:

import re
s = "Test.docx 04-05-2017.docx 04-04-17.pdf secondtest.pdf"

# Match letter-only names, or date-like names with a four-digit year
new_data = re.findall(r"[a-zA-Z]+\.[a-zA-Z]+|\d{1,}-\d{1,}-\d{4}\.[a-zA-Z]+", s)
print(new_data)

Output:

['Test.docx', '04-05-2017.docx', 'secondtest.pdf']
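Note that the pattern above lists the name shapes to keep rather than negating the date format, so a name like Test1.pdf would slip through. The unanchored lookahead from the question fails for a different reason: the regex engine is free to start the match one character later (at "5-17-17.pdf"), where the lookahead no longer applies, and .* then consumes the rest. A minimal sketch of two alternatives, assuming the file names are available as separate strings (here obtained by splitting the question's test list on whitespace); the variable names are only illustrative:

```python
import re

# Test list from the question, split into individual file names.
names = "05-17-17.pdf Test.pdf 05-48-2017.pdf 03-14-17.pdf".split()

# Expected MM-DD-YY.pdf shape. This only checks digits and separators,
# not whether the month or day is a valid calendar value.
date_name = re.compile(r"\d\d-\d\d-\d\d\.pdf")

# Option 1: keep every name that is not a full match for the pattern.
not_dates = [n for n in names if not date_name.fullmatch(n)]
print(not_dates)  # ['Test.pdf', '05-48-2017.pdf']

# Option 2: one regex, with the lookahead anchored at both ends so the
# engine cannot skip ahead and the whole name has to fit the date shape
# for it to be rejected.
anchored = re.compile(r"^(?!\d\d-\d\d-\d\d\.pdf$).*$")
also_not_dates = [n for n in names if anchored.match(n)]
print(also_not_dates)  # ['Test.pdf', '05-48-2017.pdf']
```

Option 1 is probably the easier one to read and maintain; Option 2 is closer to what the question was attempting and is handy when only a single pattern can be supplied.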
