Reputation: 11
I'm trying to do a search and replace (for multiple chars) in the following string:
VAR=%2FlkdMu9zkpE8w7UKDOtkkHhJlYZ6CaEaxqmsA%2B7G3e8%3D&
One or more of these characters: %3D, %2F, %2B, %23, can be found anywhere (beginning, middle, or end of the string) and ideally, I'd like to search for all of them at once (using one regex) and replace them with = or / or + or # respectively, then return the final string.
Example 1:
VAR=%2FlkdMu9zkpE8w7UKDOtkkHhJlYZ6CaEaxqmsA%2B7G3e8%3D&
Should return
VAR=/lkdMu9zkpE8w7UKDOtkkHhJlYZ6CaEaxqmsA+7G3e8=&
Example 2:
VAR=s2P0n6I%2Flonpj6uCKvYn8PCjp%2F4PUE2TPsltCdmA%3DRQPY%3D&
Should return
VAR=s2P0n6I/lonpj6uCKvYn8PCjp/4PUE2TPsltCdmA=RQPY=&
Upvotes: 1
Views: 2579
Reputation: 10574
This can't be done with a regex because there's no way to write any kind of conditional inside of a regex. Regular expressions can only answer the question "Does this string match this pattern?" and not perform the operation "If this string matches this pattern, replace part of it with this. If it matches this pattern, replace it with this. etc..."
Upvotes: -1
Reputation: 3913
I'm also not convinced you need regex, but there's a better way to do url-decode with regex. Basically you need that every string in the pattern of %XX will be converted into the char it represents. This can be done with re.sub()
like so:
>>> VAR="%2FlkdMu9zkpE8w7UKDOtkkHhJlYZ6CaEaxqmsA%2B7G3e8%3D&"
>>> re.sub(r'%..', lambda x: chr(int(x.group()[1:], 16)), VAR)
'/lkdMu9zkpE8w7UKDOtkkHhJlYZ6CaEaxqmsA+7G3e8=&'
Enjoy.
Upvotes: 2
Reputation: 13510
There is no need for a regex to do that in your example. A simple replace method will do:
def rep(s):
for pat, txt in [['%2F','/'], ['%2B','+'], ['%3D','='], ['%23','#']]:
s = s.replace(pat, txt)
return s
Upvotes: 2
Reputation: 7249
var = "VAR=s2P0n6I%2Flonpj6uCKvYn8PCjp%2F4PUE2TPsltCdmA%3DRQPY%3D&"
var = var.replace("%2F", "/")
var = var.replace("%2B", "+")
var = var.replace("%3D", "=")
but you got same result with urllib2.unquote
import urllib2
var = "VAR=s2P0n6I%2Flonpj6uCKvYn8PCjp%2F4PUE2TPsltCdmA%3DRQPY%3D&"
var = urllib2.unquote(var)
Upvotes: 1
Reputation: 80031
I'm not convinced you need regex for this, but it's fairly easy to do with Python:
x = 'VAR=%2FlkdMu9zkpE8w7UKDOtkkHhJlYZ6CaEaxqmsA%2B7G3e8%3D&'
import re
MAPPING = {
'%3D': '=',
'%2F': '/',
'%2B': '+',
'%23': '#',
}
def replace(match):
return MAPPING[match.group(0)]
print x
print re.sub('%[A-Z0-9]{2}', replace, x)
Output:
VAR=%2FlkdMu9zkpE8w7UKDOtkkHhJlYZ6CaEaxqmsA%2B7G3e8%3D&
VAR=/lkdMu9zkpE8w7UKDOtkkHhJlYZ6CaEaxqmsA+7G3e8=&
Upvotes: 2