Techsearch

Reputation: 57

regex find double or single quotes in start and end of each word

I am trying to find out how many words or groups of words are enclosed in either single or double quotes.

I tested the regex pattern below for double quotes. However, it still returns a match even when a word starts with one double quote and ends with two. I expect each word or group of words to be enclosed in exactly one pair of double quotes; any extra quotes should be detected, because I have to remove them.

import re

f = '"country id""   "state id"'

print(re.findall(r'^".*["\s"][a-z].*"$', f))

Upvotes: 2

Views: 503

Answers (1)

Wiktor Stribiżew

Reputation: 627607

You can use

^\s*"[^"]*"(?:\s*"[^"]*")*\s*$

See the regex demo. Details:

  • ^ - start of string
  • \s* - zero or more (here, leading) whitespaces
  • "[^"]*" - a ", zero or more chars other than ", and then a "
  • (?:\s*"[^"]*")* - zero or more sequences of zero or more whitespaces and then substrings between " chars having no other " inside them
  • \s* - zero or more (here, trailing) whitespaces
  • $ - end of string.
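As a rough illustration, the pattern above could be used in Python along these lines. The helper name is_well_quoted and the "good" sample string are assumptions made for this sketch; only the pattern and the "bad" string come from the post.

```python
import re

# Pattern from the answer: the whole string must consist only of
# "..." chunks (no inner double quote) separated by optional whitespace.
QUOTED_ONLY = re.compile(r'^\s*"[^"]*"(?:\s*"[^"]*")*\s*$')


def is_well_quoted(text):
    """Hypothetical helper: True if text is made up solely of "..." chunks."""
    return QUOTED_ONLY.match(text) is not None


good = '"country id"   "state id"'    # assumed sample without stray quotes
bad = '"country id""   "state id"'    # string from the question (extra quote)

print(is_well_quoted(good))  # True
print(is_well_quoted(bad))   # False
```

A False result only says that the string contains something other than cleanly quoted chunks; it does not locate the extra quote.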

If there are escape sequences, you will need to amend it to

^\s*"[^"\\]*(?:\\.[^"\\]*)*"(?:\s*"[^"\\]*(?:\\.[^"\\]*)*")*\s*$

See this regex demo.

Here, "[^"\\]*(?:\\.[^"\\]*)*" is used instead of "[^"]*" to match

  • " - a " char
  • [^"\\]* - zero or more chars other than " and \
  • (?:\\.[^"\\]*)* - zero or more sequences of any escaped char (other than a line break char) and then zero or more chars other than " and \
  • " - a " char
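A minimal sketch of the escape-aware variant in Python, assuming backslash is the escape character. The helper name and the sample strings with escaped quotes are made up for illustration.

```python
import re

# Escape-aware pattern from the answer: inside a "..." chunk, a backslash
# followed by any char (except a line break) is consumed as one unit.
ESCAPED_OK = re.compile(
    r'^\s*"[^"\\]*(?:\\.[^"\\]*)*"(?:\s*"[^"\\]*(?:\\.[^"\\]*)*")*\s*$'
)


def is_well_quoted_with_escapes(text):
    """Hypothetical helper: True if text is only "..." chunks, escapes allowed."""
    return ESCAPED_OK.match(text) is not None


with_escapes = r'"say \"hi\""   "state id"'   # assumed sample, escaped inner quotes
unterminated = r'"say \"hi\"'                 # assumed sample, closing quote missing

print(is_well_quoted_with_escapes(with_escapes))   # True
print(is_well_quoted_with_escapes(unterminated))   # False
```

Raw string literals keep the backslashes in both the pattern and the samples exactly as written.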

Upvotes: 1
