user3461464

Reputation: 35

How to search for an exact word in Python?

I wrote this code to search for an exact word (%PDF-1.1) in a text:

import re
x = "%PDF-1.1 pdf file contains four parts one of them the header part which looks like "
s = re.compile("%PDF-\d\.\d[\b\s]") 
match = re.search("%PDF-\d\.\d[\b\s]",x)
if match:
    print match.group()
else:
    print "its not found"

The problem is that for "s%PDF-1.1" it returns %PDF-1.1, which is wrong, and for x = "pdf file contains four parts one of them the header part which looks like %PDF-1.1" it returns nothing.

How can I search for the exact word?

Upvotes: 0

Views: 867

Answers (1)

Ruben Bermudez

Reputation: 2323

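At the moment, you are searching for "%PDF-X.X" (where X is a digit) followed by one more character, without caring about what comes before it. That trailing character must be whitespace or a backspace, because \b inside [...] means backspace rather than a word boundary, so nothing matches when the word is at the very end of the string. If you want to match this word only at the beginning or end of the string, or as a standalone word (I assume with whitespace before and after it), you can try this: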

import re
x = "%PDF-1.1 pdf file contains four parts one of them the header part which looks like "
y = "pdf file contains four parts one of them the header part which looks like %PDF-1.1"
s = re.compile(r"(^|\s)(?P<myword>%PDF-\d\.\d)($|\s)")
match = s.search(x)
if match:
    print match.group("myword")
else:
    print "its not found"

match = s.search(y)
if match:
    print match.group("myword")
else:
    print "its not found"

# %PDF-1.1
# %PDF-1.1
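
To check the same pattern against the "s%PDF-1.1" case from the question as well, it can be wrapped in a small helper. This is only a sketch: the name find_pdf_header is made up for illustration, and the pattern is written as a raw string so the backslashes reach the regex engine unchanged.

```python
import re

# Same pattern as above: the word must be preceded by the start of the
# string or whitespace, and followed by the end of the string or whitespace.
PDF_HEADER = re.compile(r"(^|\s)(?P<myword>%PDF-\d\.\d)($|\s)")

def find_pdf_header(text):
    """Return the matched word (e.g. '%PDF-1.1'), or None if it is absent
    or glued to other non-whitespace characters."""
    match = PDF_HEADER.search(text)
    return match.group("myword") if match else None

print(find_pdf_header("%PDF-1.1 pdf file contains four parts"))
print(find_pdf_header("the header part which looks like %PDF-1.1"))
print(find_pdf_header("s%PDF-1.1"))
```

The first two calls print %PDF-1.1 and the last one prints None, because the "s" directly in front of the word is neither the start of the string nor whitespace.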

If you also want the word to be found when it is followed by a symbol, you can use something like this, which allows it to be followed by anything that is not a letter or a digit:

s = re.compile(r"(^|\s)(?P<myword>%PDF-\d\.\d)($|\s|[^a-zA-Z0-9])")
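
As an alternative to the groups used above (a different technique, not the code from this answer), lookarounds can check the neighbouring characters without making them part of the match. A minimal sketch, with find_header as a hypothetical helper name:

```python
import re

# (?<!\S)          the previous character, if there is one, must be whitespace
# (?![A-Za-z0-9])  the next character, if there is one, must not be a letter or digit
PATTERN = re.compile(r"(?<!\S)%PDF-\d\.\d(?![A-Za-z0-9])")

def find_header(text):
    """Return the matched word, or None when it is not a standalone word."""
    match = PATTERN.search(text)
    return match.group() if match else None

print(find_header("looks like %PDF-1.1, as the header"))
print(find_header("s%PDF-1.1"))
print(find_header("%PDF-1.12"))
```

Here match.group() is the word itself, so no named group is needed. The first call prints %PDF-1.1; the other two print None, since the word is preceded by "s" in one case and followed by a digit in the other.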

Upvotes: 1
