kavya8

Reputation: 149

Split string with multiple words using regex in python

Suppose I have an expression:

exp="\"OLS\".\"ORDER_ITEMS\".\"QUANTITY\"  <50 and  \"OLS\".\"PRODUCTS\".\"PRODUCT_NAME\"  = 'Kingston' or \"OLS\".\"ORDER_ITEMS\".\"QUANTITY\"  <20"

I want to split the expression on the words "and" and "or" so that my result will be:

exp=['"OLS"."ORDER_ITEMS"."QUANTITY"  <50', '"OLS"."PRODUCTS"."PRODUCT_NAME"  = \'Kingston\'', '"OLS"."ORDER_ITEMS"."QUANTITY"  <20']

This is what I have tried:

import re
res=re.split('and|or|',exp)

But this splits the string at every character. How can I make it split on whole words instead?

Upvotes: 0

Views: 385

Answers (3)

Mehul Gupta

Reputation: 1939

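import itertools
# Split on 'and', split each piece on 'or', then flatten the nested lists
exp=itertools.chain.from_iterable(y.split('or') for y in exp.split('and'))
# Strip surrounding whitespace from each piece
exp=[x.strip() for x in exp]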

Explanation: first split on 'and', then split each resulting element on 'or'. This creates a list of lists. Use itertools to flatten it into a single list, then strip the extra spaces from each element.

Upvotes: 1

Jiří Baum

Reputation: 6930

Your regex has three alternatives: "and", "or" or the empty string: and|or|

Omit the trailing | to split just by those two words.

import re
res=re.split('and|or', exp)
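
To see the effect of the empty alternative, here is a small comparison on a made-up string (the name sample and its contents are only for illustration). It assumes Python 3.7 or later, where re.split also splits on empty matches:

```python
import re

# Hypothetical short input, just to keep the output readable
sample = "a <1 and b <2"

# The trailing | adds an empty alternative, so the pattern also matches
# between characters and the string is cut up almost everywhere
with_empty = re.split('and|or|', sample)

# Without the trailing |, only the literal text "and" / "or" is a separator
without_empty = re.split('and|or', sample)

print(with_empty)
print(without_empty)
```

The second call returns two pieces, still carrying the spaces that surrounded "and".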

Note that this will not work reliably; it'll split on any instance of "and", even when it's in quotes or part of a word. You could make it split only on full words using \b, but that will still split on a product name like 'Black and Decker'. If you need it to be reliable and general, you'll have to parse the string using the full syntax (probably using an off-the-shelf parser, if it's standard SQL or similar).
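
As a rough sketch of the two options mentioned above: the first split uses \b so that only whole words count, and the second skips over quoted text so that a value like 'Black and Decker' stays intact. The names TOKEN and split_conditions are made up for this example, and the quote handling is a simplification, not a real SQL parser (it knows nothing about parentheses, BETWEEN ... AND, comments, and so on).

```python
import re

exp = "\"OLS\".\"ORDER_ITEMS\".\"QUANTITY\"  <50 and  \"OLS\".\"PRODUCTS\".\"PRODUCT_NAME\"  = 'Kingston' or \"OLS\".\"ORDER_ITEMS\".\"QUANTITY\"  <20"

# Whole-word split: \b stops "and"/"or" from matching inside longer words,
# and \s* swallows the whitespace around the keyword.
word_split = re.split(r"\s*\b(?:and|or)\b\s*", exp)

# Quote-aware split (hypothetical helper): match either a quoted chunk or a
# keyword. Quoted chunks are listed first, so any "and"/"or" inside them is
# consumed as part of the quote and never seen as a separator.
TOKEN = re.compile(r"""'[^']*'|"[^"]*"|\b(and|or)\b""", re.IGNORECASE)

def split_conditions(text):
    parts, start = [], 0
    for m in TOKEN.finditer(text):
        if m.group(1):  # a keyword found outside any quotes
            parts.append(text[start:m.start()].strip())
            start = m.end()
    parts.append(text[start:].strip())
    return parts

tricky = "name = 'Black and Decker' or qty < 20"
print(word_split)
print(re.split(r"\s*\b(?:and|or)\b\s*", tricky))  # still cuts inside the quotes
print(split_conditions(tricky))                   # keeps the quoted name whole
```

On the expression from the question both approaches give the same three conditions; they only differ once a keyword appears inside a quoted value.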

Upvotes: 1

Julien

Reputation: 15071

You can do it in 2 steps: [ss for s in exp.split(" and ") for ss in s.split(' or ')]

Upvotes: 0
