Prashanth Benny

Reputation: 1609

Ignoring apostrophe while capturing contents in single quotes REGEX

The issue for me here is to capture the content inside single quotes (like 'xyz').
But the apostrophe, which is the same symbol as a single quote ('), is getting in the way!

The regex I've written is: /(\w\'\w)(*SKIP)(*F)|(\'[^\']*\')/

The example I have used is: Hello ma'am 'This is Prashanth's book.'

What needs to be captured is: 'This is Prashanth's book.'

But what's captured is: 'This is Prashanth'!

Here is the link to what I tried on an online regex tester

Any help is greatly appreciated. Thank you!

Upvotes: 0

Views: 117

Answers (2)

Pushpesh Kumar Rajwanshi

Reputation: 18357

You can't use [^\'] to capture text that itself contains a ', and in your example, This is Prashanth's book. contains a ' character within the text. The negated class stops at the first ' it meets, which is the apostrophe in Prashanth's. You need to modify your regex to use .*? instead of [^\']* and can write it like this:

(\w'\w)(*SKIP)(*F)|('.*?'\B)

Demo with your updated regex

Also, you don't need to escape a single quote ' as that has no special meaning in regex.
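
For readers without a PCRE engine, here is a minimal sketch of the same idea in Python. As far as I know, Python's built-in re module does not support (*SKIP)(*F), so this swaps in a leading \B to reject a quote that directly follows a word character. That is an approximation of the pattern above, not an exact equivalent (for example, it treats a plural possessive like dogs' differently). The name find_quoted is only illustrative.

```python
import re

# Assumption: Python's re module stands in for PCRE here.
# \B'   - an opening quote that does NOT directly follow a word character
# .*?   - as little text as possible
# '\B   - a closing quote that is NOT directly followed by a word character
QUOTED = re.compile(r"\B'.*?'\B")


def find_quoted(text):
    """Return every single-quoted span in text, quotes included."""
    return QUOTED.findall(text)


if __name__ == "__main__":
    print(find_quoted("Hello ma'am 'This is Prashanth's book.'"))
```

On the example string this should skip the apostrophes in ma'am and Prashanth's and return the whole quoted sentence, including its surrounding quotes.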

From your example, it is not clear whether you want the captured match to include the surrounding ' characters or not. In case you don't want the ' to be part of the match, you can use a lookaround-based regex like this:

(?<=\B').*?(?='\B)

Explanation of regex:

  • (?<=\B') - This positive lookbehind ensures that the match is preceded by a single quote, and \B ensures that this quote is not itself preceded by a word character
  • .*? - Matches the text in a non-greedy manner
  • (?='\B) - This positive lookahead ensures the matched text is followed by a single quote, and \B ensures it doesn't accept a quote that is immediately followed by a word character. E.g. it won't treat the ' in 's as the ending quote
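
The lookaround pattern explained above uses only a fixed-width lookbehind, so it should also run unchanged under Python's re module. A small sketch, where the function name find_inner is just an illustrative choice:

```python
import re

# Same lookaround pattern as in the answer; only the host language differs.
# (?<=\B')  - preceded by a quote that does not follow a word character
# .*?       - shortest possible run of text
# (?='\B)   - followed by a quote that is not followed by a word character
INNER = re.compile(r"(?<=\B').*?(?='\B)")


def find_inner(text):
    """Return the text between single quotes, without the quotes."""
    return INNER.findall(text)


if __name__ == "__main__":
    print(find_inner("Hello ma'am 'This is Prashanth's book.'"))
```

Because the quotes are only asserted and never consumed, the result is the bare sentence This is Prashanth's book. without its delimiters.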

Demo

Upvotes: 1

Gurmanjot Singh

Reputation: 10360

For the string you have provided, you can use the regex:

\B'\K(?:(?!'\B).)+

Click for Demo

Explanation:

  • \B - a non-word boundary
  • ' - matches a '
  • \K - forget everything matched so far
  • (?:(?!'\B).)+ - matches 1+ occurrences of any character (except newline), checking before each one that it is not a ' followed by a non-word boundary; the match therefore stops just before the closing quote
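
\K is a PCRE feature and, to my knowledge, is not available in Python's built-in re module. A rough equivalent is to put the part after \K in a capturing group and read that group instead. The sketch below assumes that substitution; find_after_quote is a hypothetical helper name.

```python
import re

# Assumption: a capturing group replaces \K, which Python's re lacks.
# \B'              - opening quote not preceded by a word character
# ((?:(?!'\B).)+)  - one or more characters, each checked so that it is
#                    not a quote followed by a non-word boundary
TEMPERED = re.compile(r"\B'((?:(?!'\B).)+)")


def find_after_quote(text):
    """Return the captured text that follows each opening quote."""
    return TEMPERED.findall(text)


if __name__ == "__main__":
    print(find_after_quote("Hello ma'am 'This is Prashanth's book.'"))
```

With a single capturing group, findall returns the group contents rather than the whole match, which plays the role that \K plays in the original pattern.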

Upvotes: 1
