S420L

Reputation: 117

Python regex to find any word within a string that has an open parenthesis

I'm trying to process some SQL code to find the parts of a select statement that would need to be grouped farther down in a query. For example:

In the string "Select person, age, name, sum(count distinct arrests) from..."

I would want "sum(count" returned, because it's the only part of this string that has white space on either side and includes an open parenthesis.

I have been trying different things but am struggling.

I've tried re.compile(r'\W.*[)]') and am getting either way too much back or nothing at all.

Upvotes: 2

Views: 162

Answers (3)

A l w a y s S u n n y

Reputation: 38502

How about a non-regex way, using split() and a list comprehension?

# Split on spaces, then take the first token that contains "("
some_list = "Select person, age, name, sum(count distinct arrests) from...".split(' ')
matching = [s for s in some_list if "(" in s][0]
print(matching) # sum(count


some_list = "COUNT(DISTINCT(case when etc...)".split(' ')
matching = [s for s in some_list if "(" in s][0]
print(matching) # COUNT(DISTINCT(case

WORKING DEMO: https://rextester.com/ZKJU83182
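One caveat with the `[0]` indexing above: it raises an IndexError when no token contains a parenthesis. A minimal sketch of a safer variant follows, assuming you would rather get a default value back. The helper name `first_token_with_paren` is hypothetical and not part of the original answer.

```python
def first_token_with_paren(sql, default=None):
    """Return the first whitespace-separated token containing '(', else default."""
    # split() with no argument splits on any run of whitespace,
    # so tabs, newlines and repeated spaces are handled as well.
    return next((tok for tok in sql.split() if "(" in tok), default)


query = "Select person, age, name, sum(count distinct arrests) from..."
print(first_token_with_paren(query))                   # sum(count
print(first_token_with_paren("Select person from t"))  # None
```

Using next() with a default stops at the first hit instead of building the whole list first.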

Upvotes: 0

The fourth bird

Reputation: 163362

If the match can also occur at the start of the string, you could use lookarounds to assert that what is directly to the left and to the right is not a non-whitespace character \S, and use a repeating group (?:...)+ to match the word(word part one or more times.

(?<!\S)(?:\w+\(\w+)+(?!\S)


That will match COUNT(DISTINCT(case and sum(count
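As a rough illustration, the pattern can be tried in Python with re.findall, which returns every match rather than only the first. The variable names `pattern` and `samples` are just for this sketch. The two sample strings are the ones used elsewhere on this page.

```python
import re

# Whitespace (or a string boundary) on both sides,
# with one or more word(word groups in between.
pattern = re.compile(r"(?<!\S)(?:\w+\(\w+)+(?!\S)")

samples = [
    "Select person, age, name, sum(count distinct arrests) from...",
    "COUNT(DISTINCT(case when etc...)",
]

for text in samples:
    # The group is non-capturing, so findall returns the full matches.
    print(pattern.findall(text))
# ['sum(count']
# ['COUNT(DISTINCT(case']
```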

Upvotes: 0

Rakesh

Reputation: 82765

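Use the pattern (\w+\(\w+)\s+, which captures a word, an open parenthesis, and another word, as long as whitespace follows.

Example: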

import re

s = "Select person, age, name, sum(count distinct arrests) from..."
print(re.search(r"(\w+\(\w+)\s+", s).group(1))

Output:

sum(count

Upvotes: 1
