Reputation: 15
Apologies if a similar question has already been answered. Here is the regular-expression replacement I am struggling with.
My input text contains this sample:
" var1 || literal "
I need a regular-expression substitution that turns it into something like this:
" concat var1 , literal "
Essentially, || should be replaced with a comma, and the first element should be prefixed with "concat". There may be multiple such occurrences in a given input, so I need to replace all of them.
Here is where I am stuck. I can build the regex pattern, but I am not sure what to substitute it with.
re.sub(r'\s{1}[a-zA-Z0-9_]+\s*\|\|\s*[a-zA-Z0-9_]+\s*', '??????', input_string)
I am not sure whether this can be done in a single Python statement.
My alternative is to loop through the string, find each occurrence, and replace it individually without using a regular expression.
Thanks in advance. Radha
Upvotes: 1
Views: 3279
Reputation: 522626
You can handle this requirement using re.sub with a callback function:
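import re

sql = "select var1 || literal || var2 from yourTable"
def to_concat(expr):
    # expr is the matched text; turn each || into a comma and wrap it in concat()
    return "concat(" + re.sub(r'\s*\|\|', ',', expr) + ")"
# Match a whole chain of operands joined by || and rewrite it via the callback
sql_out = re.sub(r'\S+(?:\s+\|\|\s+\S+)+', lambda x: to_concat(x.group()), sql)
print(sql + "\n" + sql_out)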
This prints:
select var1 || literal || var2 from yourTable
select concat(var1, literal, var2) from yourTable
The idea here is to first match the entire expression involving the ANSI || concatenation operator. We then pass the matched text to a callback function, which replaces every || with a comma and wraps the result in a call to concat.
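One thing to watch: \S+ also matches punctuation, so an operand followed directly by a comma (as in "a || b, c") would drag that comma into the match. A minimal sketch of a stricter variant follows, assuming operands are plain identifiers made of word characters. The names CHAIN and rewrite_concat are hypothetical, and this does not handle quoted string literals or parenthesised expressions.

```python
import re

# Assumption: every operand is a run of word characters (\w+).
# This is narrower than the \S+ pattern and is only an illustration.
CHAIN = re.compile(r'\w+(?:\s*\|\|\s*\w+)+')

def to_concat(match):
    # re.sub passes a match object to the callback; group() is the whole chain.
    operands = re.split(r'\s*\|\|\s*', match.group())
    return "concat(" + ", ".join(operands) + ")"

def rewrite_concat(sql):
    # Rewrite every ||-chain found in the input string.
    return CHAIN.sub(to_concat, sql)

print(rewrite_concat("select a || b, c || d || e from t"))
# select concat(a, b), concat(c, d, e) from t
```

Passing the function itself to sub (rather than a lambda) works because re.sub calls the replacement callable with the match object for each non-overlapping match.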
Upvotes: 2
Reputation: 2455
With the Python re module, you can substitute by regex by wrapping terms of the pattern in parentheses (capture groups), then referring to them in the replacement as \1, \2, etc., in the order the groups appear.
re.sub(r'\s{1}([a-zA-Z0-9_]+)\s*\|\|\s*([a-zA-Z0-9_]+)\s*', r'concat \1 , \2', input_string)
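To see the backreferences in action, here is a small runnable sketch on the sample input from the question. It uses a simplified pattern of my own (an assumption, not the exact one above): it drops the leading \s{1} and trailing \s* so the whitespace around the expression is left in place. The helper name concat_pairs is hypothetical.

```python
import re

def concat_pairs(text):
    # \1 is the left operand, \2 the right operand.
    # Assumption: operands consist only of word characters.
    return re.sub(r'(\w+)\s*\|\|\s*(\w+)', r'concat \1 , \2', text)

print(repr(concat_pairs(" var1 || literal ")))
# ' concat var1 , literal '

# A chain of three operands is only partly rewritten, because each
# match covers exactly two operands:
print(repr(concat_pairs("a || b || c")))
# 'concat a , b || c'
```

So this form fits the two-operand case in the question; for longer chains a callback that handles the whole chain at once is probably the better fit.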
Upvotes: 0