Shades88

Reputation: 8360

A regex to detect string not enclosed in double quotes

I have a string like this:

"quick" "brown" fox jumps "over" "the" lazy dog

I need a regex that detects the words not enclosed in double quotes. After some random tries I found this: ("([^"]+)"). It detects a string enclosed in double quotes, but I want the opposite. I can't come up with it, even after trying to invert the regex above. I am quite weak at regex. Please help.

Upvotes: 13

Views: 9665

Answers (3)

Igor Chubin

Reputation: 64563

Use lookahead/lookbehind assertions:

(?<![\S"])([^"\s]+)(?![\S"])

Example:

>>> import re
>>> a='"quick" "brown" fox jumps "over" "the" lazy dog'
>>> print(re.findall(r'(?<![\S"])([^"\s]+)(?![\S"])', a))
['fox', 'jumps', 'lazy', 'dog']

The key here is the lookahead/lookbehind assertions. They let you require (or forbid) a certain symbol next to the expression without making it part of the match itself. For example:

(?<![\S"])abc

That is a negative lookbehind. It means you want abc, but only where it is not preceded by a character from [\S"]: there must be neither a non-space character (so abc sits at the beginning of a word) nor a " right before it.

The same works in the other direction:

abc(?![\S"])

That is a negative lookahead. It means you want abc, but only where it is not followed by a character from [\S"].
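To see why the pattern uses both assertions, here is a small sketch that runs each half on its own. The two-word quoted phrase and the names behind_only, ahead_only and both are made up for illustration; they are not part of the question.

```python
import re

# Hypothetical input: a quoted two-word phrase, then an unquoted word.
text = '"quick brown" fox'

behind_only = r'(?<![\S"])[^"\s]+'      # only checks what precedes the word
ahead_only = r'[^"\s]+(?![\S"])'        # only checks what follows the word
both = r'(?<![\S"])[^"\s]+(?![\S"])'    # checks both sides

print(re.findall(behind_only, text))  # ['brown', 'fox']
print(re.findall(ahead_only, text))   # ['quick', 'fox']
print(re.findall(both, text))         # ['fox']

# Caveat: the assertions only inspect the neighbouring character, so an
# interior word of a longer quoted phrase still slips through.
longer = '"the quick brown" fox'
print(re.findall(both, longer))       # ['quick', 'fox']
```

So the combined pattern seems reliable when each quoted string is at most two words long, as in the question; longer quoted phrases would probably need a different approach.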

There are four different assertions of this type in general:

(?=pattern)
    is a positive look-ahead assertion
(?!pattern)
    is a negative look-ahead assertion
(?<=pattern)
    is a positive look-behind assertion
(?<!pattern)
    is a negative look-behind assertion 
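As a rough illustration of the four assertion types, the sketch below applies one of each to the sentence from the question. The patterns use \w+ and \b for brevity and are illustrative only; they are not the pattern recommended above.

```python
import re

text = '"quick" "brown" fox jumps "over" "the" lazy dog'

# (?=pattern): words that ARE followed by a double quote
followed = re.findall(r'\w+(?=")', text)
# (?!pattern): whole words that are NOT followed by a double quote
not_followed = re.findall(r'\b\w+\b(?!")', text)
# (?<=pattern): words that ARE preceded by a double quote
preceded = re.findall(r'(?<=")\w+', text)
# (?<!pattern): whole words that are NOT preceded by a double quote
not_preceded = re.findall(r'(?<!")\b\w+', text)

print(followed)      # ['quick', 'brown', 'over', 'the']
print(not_followed)  # ['fox', 'jumps', 'lazy', 'dog']
print(preceded)      # ['quick', 'brown', 'over', 'the']
print(not_preceded)  # ['fox', 'jumps', 'lazy', 'dog']
```

The \b anchors in the negative variants keep the engine from backtracking into a shorter piece of a quoted word (for example "quic") that would otherwise satisfy the assertion.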

Upvotes: 33

Ria

Reputation: 10347

Use this regex:

\s+(?<myword>([^\"\s]+)*)\s+

This should work; read the group named myword. Otherwise you need to trim the result string.
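The (?<myword>...) syntax is not accepted by Python's re module, so the sketch below is a port that assumes the group can be written as (?P<myword>...) and leaves the rest of the pattern unchanged. It is only meant to show how the named group is read and what it returns on the sentence from the question.

```python
import re

text = '"quick" "brown" fox jumps "over" "the" lazy dog'

# Assumed Python spelling of the named group; otherwise the same pattern.
pattern = re.compile(r'\s+(?P<myword>([^"\s]+)*)\s+')

words = [m.group('myword') for m in pattern.finditer(text)]
print(words)  # ['fox', 'lazy']
```

Because each match also consumes the whitespace on both sides, a word that directly follows another match ("jumps") or sits at the end of the string ("dog") is not found with this pattern as written.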

Upvotes: 0

Vilius Gaidelis

Reputation: 450

Remove the first quote from the string

Upvotes: -3
