user930916
user930916

Reputation: 421

Regex to format comma-separated list from 1-per-line to n-per-line

I have a SQL query in a string as follows:

SELECT COL1,
       COL2,
       COL3,
       COL4,
       COL5,
       COL6,
       COL7,
       COL8,
       COL9,
       COL10,
       COL11
  FROM MY_TABLE

And I want to use regex to reformat it to

SELECT COL1, COL2, COL3, COL4,
       COL5, COL6, COL7, COL8,
       COL9, COL10, COL11
  FROM MY_TABLE

(the second line ideally has the indentation intact from before)

so that we reduce the number of lines the query can take especially when we have hundreds of columns. Is this possible? I would be implementing this regex in python if that helps! (A pythonic answer is also acceptable, though we should assume there's multiple Nested queries as well)

Upvotes: 0

Views: 66

Answers (1)

Tim Biegeleisen
Tim Biegeleisen

Reputation: 521457

This took a painfully long time to come up with, but we can try using re.sub here with a callback funtion:

inp = """SELECT COL1,
       COL2,
       COL3,
       COL4,
       COL5,
       COL6,
       COL7,
       COL8,
       COL9,
       COL10,
       COL11
  FROM MY_TABLE"""
output = re.sub(r'\w+(?:,?\n\s*(?!FROM\b)\w+){0,3}', lambda m: ' '.join(m.group().split()), inp)
print(output)

This prints:

SELECT COL1, COL2, COL3, COL4,
       COL5, COL6, COL7, COL8,
       COL9, COL10, COL11
  FROM MY_TABLE

The regex pattern used here says to match:

\w+                a term in the select clause
(?:
    ,?             followed by optional comma
    \n\s*          newline and optional whitespace
    (?!FROM\b)\w+  another term which is NOT 'FROM' (i.e. stop at FROM)
){0,3}             zero to three more select terms (so group of at most 4)

Upvotes: 1

Related Questions