My PM
My PM

Reputation: 11

RegEx For Multiple Search & Replace

I'm trying to do a search and replace (for multiple chars) in the following string:

VAR=%2FlkdMu9zkpE8w7UKDOtkkHhJlYZ6CaEaxqmsA%2B7G3e8%3D&

One or more of these characters: %3D, %2F, %2B, %23, can be found anywhere (beginning, middle, or end of the string) and ideally, I'd like to search for all of them at once (using one regex) and replace them with = or / or + or # respectively, then return the final string.

Example 1:

VAR=%2FlkdMu9zkpE8w7UKDOtkkHhJlYZ6CaEaxqmsA%2B7G3e8%3D&

Should return

VAR=/lkdMu9zkpE8w7UKDOtkkHhJlYZ6CaEaxqmsA+7G3e8=&

Example 2:

VAR=s2P0n6I%2Flonpj6uCKvYn8PCjp%2F4PUE2TPsltCdmA%3DRQPY%3D&

Should return

VAR=s2P0n6I/lonpj6uCKvYn8PCjp/4PUE2TPsltCdmA=RQPY=&

Upvotes: 1

Views: 2579

Answers (5)

Patrick Collins
Patrick Collins

Reputation: 10574

This can't be done with a regex because there's no way to write any kind of conditional inside of a regex. Regular expressions can only answer the question "Does this string match this pattern?" and not perform the operation "If this string matches this pattern, replace part of it with this. If it matches this pattern, replace it with this. etc..."

Upvotes: -1

Ofir Israel
Ofir Israel

Reputation: 3913

I'm also not convinced you need regex, but there's a better way to do url-decode with regex. Basically you need that every string in the pattern of %XX will be converted into the char it represents. This can be done with re.sub() like so:

>>> VAR="%2FlkdMu9zkpE8w7UKDOtkkHhJlYZ6CaEaxqmsA%2B7G3e8%3D&"
>>> re.sub(r'%..', lambda x: chr(int(x.group()[1:], 16)), VAR)
'/lkdMu9zkpE8w7UKDOtkkHhJlYZ6CaEaxqmsA+7G3e8=&'

Enjoy.

Upvotes: 2

Israel Unterman
Israel Unterman

Reputation: 13510

There is no need for a regex to do that in your example. A simple replace method will do:

def rep(s):
    for pat, txt in [['%2F','/'], ['%2B','+'], ['%3D','='], ['%23','#']]:
        s = s.replace(pat, txt)
    return s

Upvotes: 2

Sérgio
Sérgio

Reputation: 7249

var = "VAR=s2P0n6I%2Flonpj6uCKvYn8PCjp%2F4PUE2TPsltCdmA%3DRQPY%3D&"
var = var.replace("%2F", "/")
var = var.replace("%2B", "+")
var = var.replace("%3D", "=")

but you got same result with urllib2.unquote

import urllib2
var = "VAR=s2P0n6I%2Flonpj6uCKvYn8PCjp%2F4PUE2TPsltCdmA%3DRQPY%3D&"
var = urllib2.unquote(var)

Upvotes: 1

Wolph
Wolph

Reputation: 80031

I'm not convinced you need regex for this, but it's fairly easy to do with Python:

x = 'VAR=%2FlkdMu9zkpE8w7UKDOtkkHhJlYZ6CaEaxqmsA%2B7G3e8%3D&'

import re

MAPPING = { 
    '%3D': '=',
    '%2F': '/',
    '%2B': '+',
    '%23': '#',
}

def replace(match):
    return MAPPING[match.group(0)]

print x
print re.sub('%[A-Z0-9]{2}', replace, x)

Output:

VAR=%2FlkdMu9zkpE8w7UKDOtkkHhJlYZ6CaEaxqmsA%2B7G3e8%3D&
VAR=/lkdMu9zkpE8w7UKDOtkkHhJlYZ6CaEaxqmsA+7G3e8=&

Upvotes: 2

Related Questions