boulderz

Reputation: 133

Python regular expression sub

I am using this sub:

def camelize(key):
    print re.sub(r"[a-z0-9]_[a-z0-9]", underscoreToCamel, key)

It calls this function:

def underscoreToCamel(match):
    return match.group()[0] + match.group()[2].upper()

When I call camelize('sales_proj_3_months_ago') it returns 'salesProj3_monthsAgo' instead of 'salesProj3MonthsAgo'. However, if I call camelize('sales_proj_30_days_ago') it returns 'salesProj30DaysAgo' as expected.

So there is a problem with my regex substitution when there is only one character in between underscores. How can I write my regex substitution to account for these cases?

Upvotes: 0

Views: 50

Answers (2)

Moon Cheesez

Reputation: 2701

Your pattern matches the following substrings of 'sales_proj_3_months_ago':

s_p
j_3
s_a
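
The matches listed above can be reproduced with re.findall, which scans left to right and returns only non-overlapping matches. This is a minimal sketch using the pattern from the question; the variable names are illustrative only.

```python
import re

# Pattern from the question: one character on each side of the underscore.
pattern = r"[a-z0-9]_[a-z0-9]"
key = "sales_proj_3_months_ago"

# Each match consumes its characters, so the "3" taken by "j_3"
# is no longer available to start a "3_m" match.
matches = re.findall(pattern, key)
print(matches)  # ['s_p', 'j_3', 's_a']
```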

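As you can see, 3_m is never matched: the 3 was already consumed by the previous match, j_3, and re.sub only replaces non-overlapping matches. Since the character before the underscore is returned unchanged anyway, you can match just the underscore and the character after it: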

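import re

def camelize(key):
    # Match only the underscore and the character that follows it.
    print(re.sub(r"_[a-z0-9]", underscoreToCamel, key))

def underscoreToCamel(match):
    # Drop the underscore and uppercase the following character.
    return match.group()[1].upper()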

Sample Outputs:

>>> camelize("sales_proj_3_months_ago")
salesProj3MonthsAgo
>>> camelize('sales_proj_30_days_ago')
salesProj30DaysAgo

Upvotes: 0

kennytm

Reputation: 523304

You could use look-behind so that each match does not overlap with the previous one.

import re

def camelize(key):
    return re.sub(r'(?<=[a-z0-9])_[a-z0-9]', lambda m: m.group()[1].upper(), key)
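
The look-behind checks the preceding character without consuming it, so it is still available to the next match. One place where this may differ from matching "_[a-z0-9]" alone is a key with a leading underscore. The sketch below compares the two; the function names camelize_simple and camelize_lookbehind and the '_private_key' input are hypothetical, chosen only for illustration.

```python
import re

def camelize_simple(key):
    # Replaces every underscore followed by a lowercase letter or digit.
    return re.sub(r"_[a-z0-9]", lambda m: m.group()[1].upper(), key)

def camelize_lookbehind(key):
    # Same, but only when the underscore is preceded by a lowercase
    # letter or digit; the look-behind does not consume that character.
    return re.sub(r"(?<=[a-z0-9])_[a-z0-9]", lambda m: m.group()[1].upper(), key)

print(camelize_simple("sales_proj_3_months_ago"))      # salesProj3MonthsAgo
print(camelize_lookbehind("sales_proj_3_months_ago"))  # salesProj3MonthsAgo

# A leading underscore is handled differently by the two patterns.
print(camelize_simple("_private_key"))      # PrivateKey
print(camelize_lookbehind("_private_key"))  # _privateKey
```

Which behavior is preferable depends on whether leading underscores in the keys are meaningful.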

Upvotes: 1
