Meera

Reputation: 33

Regex in Python: splitting strings

I have a string like this:

SELECT [Orders$].[Category] AS [Category],&#13,&#10,  [Orders$].[City] AS [City],&#13,&#10,  [Orders$].[Country] AS [Country],&#13,&#10,  [Orders$].[Customer ID] AS [Customer ID],&#13,&#10,  [Orders$].[Customer Name] AS [Customer Name],&#13,&#10,  [Orders$].[Discount] AS [Discount],&#13,&#10,  [Orders$].[Profit] AS [Profit],&#13,&#10,  [Orders$].[Quantity] AS [Quantity],&#13,&#10,  [Orders$].[Region] AS [Region],&#13,&#10,  [Orders$].[State] AS [State],&#13,&#10,  [People$].[Person] AS [Person],&#13,&#10,  [People$].[Region] AS [Region (People)]&#13,&#10,FROM [Orders$]&#13,&#10,  INNER JOIN [People$] ON [Orders$].[Region] = [People$].[Region]

I want to extract only Category and City dynamically, without hardcoding those words. What kind of pattern should I use? I will store the two values in an array that a downstream program loops over.

I tried splitting the text:

colName = re.split(r"\W+", result)

['SELECT',
 'Orders',
 'Category',
 'AS',
 'Category',
 '13',
 '10',
 'Orders',
 'City',
 'AS',
 'City',
 '13',
 '10',

This gave me every token in the string, and now I do not know how to proceed. Can someone help?

Thanks

Upvotes: 1

Views: 205

Answers (2)

Barmar

Reputation: 780724

Don't use split; use re.findall().

matches = re.findall(r'\bAS\s+\[(.+?)\]', yourString)

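Because the pattern has a single capture group, re.findall() returns a list of the captured strings rather than match objects, so matches already holds the column names. The two you want are matches[0] and matches[1].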
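As a minimal sketch of that approach, the snippet below runs the same pattern on a shortened copy of the string from the question. The variable names `query`, `aliases`, and `first_two` are only illustrative, and the three-column string is an assumption made to keep the example short.

```python
import re

# Shortened copy of the query string from the question (assumption: three
# columns are enough to illustrate the pattern).
query = (
    "SELECT [Orders$].[Category] AS [Category],&#13,&#10,"
    "  [Orders$].[City] AS [City],&#13,&#10,"
    "  [Orders$].[Country] AS [Country]&#13,&#10,"
    "FROM [Orders$]"
)

# With one capture group, findall returns the captured text of each match,
# i.e. whatever sits inside the brackets that follow "AS".
aliases = re.findall(r'\bAS\s+\[(.+?)\]', query)
print(aliases)  # ['Category', 'City', 'Country']

# Keep only the first two column names for the downstream loop.
first_two = aliases[:2]
print(first_two)  # ['Category', 'City']
```

Slicing with `aliases[:2]` avoids hardcoding the words themselves, although it still assumes the two wanted columns come first in the SELECT list.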

Upvotes: 1

Lucecpkn

Reputation: 1131

I'm not sure I understand your question correctly, but it seems you can simply continue with:

>>> category = colName[2]
>>> city = colName[8]

You can print to check:

>>> print(category, city)
Category City
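To see where those index positions come from, here is a small sketch that repeats the split on a shortened copy of the string. The name `result` follows the question; the two-column string is an assumption for brevity.

```python
import re

# Shortened copy of the string from the question (assumption: two columns).
result = (
    "SELECT [Orders$].[Category] AS [Category],&#13,&#10,"
    "  [Orders$].[City] AS [City],&#13,&#10,"
)

# Splitting on runs of non-word characters leaves the bare tokens,
# including the '13' and '10' left over from the &#13 and &#10 entities.
colName = re.split(r"\W+", result)

# Each column contributes six tokens (table, name, AS, alias, '13', '10'),
# so the first two column names land at positions 2 and 8.
category = colName[2]
city = colName[8]
print(category, city)  # Category City
```

Note that these positions depend on the exact layout of the string, so they would shift if a column name contained a space or the query were formatted differently.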

Upvotes: 0
