Alice Everett

Reputation: 375

How to ignore quotes while splitting strings in Python

I want to split the following string:

  lin=' <abc<hd <> "abc\"d\" ef" '

into

 [<abc<hd <>,  "abc\"d\" ef"]

However, when I split the string using re.findall(r'"(.*?)"', lin, 0), I get

['abc', ' ef']

Can someone please show me how to split the string this way in Python?

Upvotes: 0

Views: 1139

Answers (4)

edi_allen

Reputation: 1872

Here is a solution using a regular expression.

import re
line = ' <abc<hd <> "abc\"d\" ef" ' 

match = list(re.findall(r'(<[^>]+>)\s+("(?:\"|[^"])+")', line)[0])

print(match)
#['<abc<hd <>', '"abc"d" ef"']

Another way to do it:

# Split on whitespace only if it is followed by a quote.
print(re.split(r'\s+(?=")', line.strip()))
#['<abc<hd <>', '"abc"d" ef"']
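To illustrate why the lookahead matters, here is a small sketch comparing it with a pattern that consumes the quote. The variable names with_lookahead and consumed are only for illustration and are not part of the answer above. Because (?=") is zero-width, the quote is checked but not removed, so it stays attached to the second piece.

```python
import re

line = ' <abc<hd <> "abc\"d\" ef" '
stripped = line.strip()

# Lookahead: the quote is only peeked at, so it remains in the result.
with_lookahead = re.split(r'\s+(?=")', stripped)

# No lookahead: the quote is part of the separator and is dropped.
# maxsplit=1 keeps the comparison to a single split point.
consumed = re.split(r'\s+"', stripped, maxsplit=1)

print(with_lookahead)  # ['<abc<hd <>', '"abc"d" ef"']
print(consumed)        # ['<abc<hd <>', 'abc"d" ef"']
```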

Upvotes: 4

thefourtheye

Reputation: 239653

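Yet another regex solution:

import re
lin=' <abc<hd <> "abc\"d\" ef" '
# Raw string so that \s reaches the regex engine unchanged.
# The greedy (.*) runs up to the last double quote in the string.
matching = re.match(r'\s+(.*?)\s+("(.*)")', lin)
print([matching.group(1), matching.group(2)])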

Output

['<abc<hd <>', '"abc"d" ef"']
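The difference from the attempt in the question comes down to greedy versus non-greedy matching, which the following sketch tries to show. The names lazy and greedy are illustrative only. Note also that \" inside a single-quoted Python literal is just a plain ", so the string contains no backslashes at all.

```python
import re

lin = ' <abc<hd <> "abc\"d\" ef" '

# Non-greedy: each match stops at the nearest closing quote.
lazy = re.findall(r'"(.*?)"', lin)

# Greedy: the match extends to the last quote in the string.
greedy = re.findall(r'"(.*)"', lin)

print(lazy)    # ['abc', ' ef']
print(greedy)  # ['abc"d" ef']
```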

Upvotes: 1

TerryA

Reputation: 60024

Firstly, you have some extra whitespace at the beginning and end of your string; lin.strip() will remove that.

You can then use str.split() to split at the first ' "' (a space followed by a double quote):

>>> lin.strip().split(' "', 1)
['<abc<hd <>', 'abc"d" ef"']

The 1 passed as the second argument tells Python to split only once, so the string is not split at every later ". Note that the separator itself, including the opening ", is dropped from the result.
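As a rough sketch of what the limit changes, the snippet below splits on a bare " with and without maxsplit, then puts the opening quote back to get the output asked for in the question. The names every_quote, first_only and parts are illustrative, and re-attaching the quote is an addition here, not part of the answer above.

```python
lin = ' <abc<hd <> "abc\"d\" ef" '
stripped = lin.strip()

# No limit: the string is cut at every double quote.
every_quote = stripped.split('"')
print(every_quote)  # ['<abc<hd <> ', 'abc', 'd', ' ef', '']

# maxsplit=1: only the first ' "' is used as a split point.
first_only = stripped.split(' "', 1)

# The separator was consumed, so restore the opening quote.
parts = [first_only[0], '"' + first_only[1]]
print(parts)  # ['<abc<hd <>', '"abc"d" ef"']
```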

Upvotes: 3

user2665694

Reputation:

>>> lin=' <abc<hd <> "abc\"d\" ef" '
>>> lin.split('"', 1)
[' <abc<hd <> ', 'abc"d" ef" ']

Upvotes: 0
