python re.findall vs re.sub

Question

Please explain why I get different results when using re.findall and re.sub.

The string I am parsing:

GRANT USAGE ON *.* TO 'testuser'@'10.10.10.10' IDENTIFIED BY PASSWORD '*A78AF560CD6F8FEA4DC8205299927B6CB1B1F56A'

Code:

import re

S="GRANT USAGE ON *.* TO 'testuser'@'10.10.10.10' IDENTIFIED BY PASSWORD '*A78AF560CD6F8FEA4DC8205299927B6CB1B1F56A'"

U=re.compile(r'.* TO \'(.*?)\'@.*')
H=re.compile(r'.*\'@\'(.*?)\'.*')

print(U.findall(S))
print(H.findall(S))

So I get what I want:

['testuser']  
['10.10.10.10']

Now I want to change the IP address and the user, so I tried to use re.sub.

Code:

import re
S="GRANT USAGE ON *.* TO 'testuser'@'10.10.10.10' IDENTIFIED BY PASSWORD '*A78AF560CD6F8FEA4DC8205299927B6CB1B1F56A'"

U=re.compile(r'.* TO \'(.*?)\'@.*')
H=re.compile(r'.*\'@\'(.*?)\'.*')

HOST=H.sub('another_ip',S) 
USER=U.sub('another_user',S)
print(HOST)
print(USER)

But I just get this:

another_ip
another_user

alecxe · Accepted Answer

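With re.sub() you need to target specifically the part of the string you want to substitute. re.sub() replaces everything that the regular expression matched (strictly speaking, every leftmost non-overlapping occurrence of the pattern), not just the captured group. Your patterns start and end with .*, so the match covers the complete string and the whole string gets replaced. re.findall(), by contrast, returns only the contents of the group when the pattern contains exactly one group, which is why it appeared to do what you wanted. Instead, you can match the user and the IP address specifically, for example: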

>>> re.sub(r"'(\w+)'@'(\d+\.\d+\.\d+\.\d+)'", "'another_user'@'another_ip'", S)
"GRANT USAGE ON *.* TO 'another_user'@'another_ip' IDENTIFIED BY PASSWORD '*A78AF560CD6F8FEA4DC8205299927B6CB1B1F56A'"
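If you would rather change the user and the host separately, as in the question, one possible approach is to capture the text around the value and put it back with \g<1> and \g<2> in the replacement. The sketch below first shows that the original pattern's match spans the whole string, then does the two substitutions. The names USER_RE and HOST_RE are made up for this illustration, and the patterns assume the quoted values contain no single quotes.

```python
import re

S = ("GRANT USAGE ON *.* TO 'testuser'@'10.10.10.10' "
     "IDENTIFIED BY PASSWORD '*A78AF560CD6F8FEA4DC8205299927B6CB1B1F56A'")

# Original pattern from the question.
U = re.compile(r'.* TO \'(.*?)\'@.*')

# findall() returns only the captured group when the pattern has one group...
found = U.findall(S)

# ...but the match itself spans the whole string, and sub() replaces the match.
whole_match = U.search(S).group(0)
covers_everything = (whole_match == S)

# Hypothetical patterns: capture the delimiters around the value and
# re-insert them in the replacement, so only the value changes.
# Assumes the user and host values contain no single quotes.
USER_RE = re.compile(r"(TO ')[^']*('@)")
HOST_RE = re.compile(r"(@')[^']*(')")

new_user = USER_RE.sub(r"\g<1>another_user\g<2>", S)
new_host = HOST_RE.sub(r"\g<1>another_ip\g<2>", S)

print(found)
print(covers_everything)
print(new_user)
print(new_host)
```

Using \g<1> rather than \1 avoids ambiguity when the replacement text that follows the group reference starts with a digit.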
