narayan nyaupane

Reputation: 313

How can we select the string between two single quotes, skipping over an apostrophe inside it, using a regex in JavaScript?

Example Text: 'builder's margin' means the percentage stated in Item 8 of Schedule 1;

I have a regex that selects the words between two single quotes: /'(.*?[^\\])'/g. It does not work when I try to extract builder's margin from the example text: because of the apostrophe, it only selects up to builder. Is there any way to skip the apostrophe and select up to margin?

Upvotes: 1

Views: 412

Answers (3)

georg

Reputation: 214959

Try

'(.+?)'\B

which basically means: a ', then some content, then a ' that is followed by a non-word char or the end of the string.

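const a = `normal 'quoted' string`
const b = `'builder's margin' means the percentage`
const c = `this 'we'll handle'`

// ' + lazy content + ' that is not followed by a word character
const re = /'(.+?)'\B/

console.log(a.match(re)[1]) // quoted
console.log(b.match(re)[1]) // builder's margin
console.log(c.match(re)[1]) // we'll handle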

Note that this doesn't handle trailing apostrophes as in 'builders' margins' are...
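To pull every quoted span out of a longer text, the same pattern can be combined with the g flag and matchAll. This is only a minimal sketch: the helper name extractQuoted and the sample sentences are made up for illustration, and the second call shows the trailing-apostrophe limitation mentioned above.

```javascript
// Same pattern as above, plus the g flag that matchAll requires.
const quoted = /'(.+?)'\B/g;

// Hypothetical helper: returns the text captured by group 1 for every match.
function extractQuoted(text) {
  return Array.from(text.matchAll(quoted), (m) => m[1]);
}

// Two quoted terms, the first one containing an apostrophe.
console.log(extractQuoted("'builder's margin' means the percentage; 'owner' means the person"));
// → ["builder's margin", "owner"]

// Trailing apostrophe: the ' after "builders" is followed by a space,
// so it is taken as the closing quote.
console.log(extractQuoted("'builders' margins' are"));
// → ["builders"]
```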

Upvotes: 2

Wiktor Stribiżew

Reputation: 626903

If you do not need to support escape sequences (i.e. if your string is not a string literal presented as plain text), you can use

(?!\b'\b)'([^']*(?:\b'\b[^']*)*)(?!\b'\b)'

Else, if the text you have can contain escape sequences, you can use

(?!\b'\b)'([^'\\]*(?:(?:\\.|\b'\b)[^'\\]*)*)(?!\b'\b)'

See the regex demo #1 and regex demo #2.

NOTE: in the second pattern, replace . with [^] to match any char, including line break chars.

Details

  • (?!\b'\b)' - the ' char that is not enclosed with word chars on both ends
  • ([^']*(?:\b'\b[^']*)*) - Capturing group 1:
    • [^']* - zero or more chars other than '
    • (?:\b'\b[^']*)* - zero or more occurrences of a ' that is enclosed with word chars on both ends and then zero or more chars other than '
  • ([^'\\]*(?:(?:\\.|\b'\b)[^'\\]*)*) - Group 1 (in the second pattern):
    • [^'\\]* - zero or more chars other than ' and \
    • (?:(?:\\.|\b'\b)[^'\\]*)* - zero or more sequences of a \ + any one char or a ' that is enclosed with word chars and then zero or more chars other than ' and \
  • (?!\b'\b)' - the ' char that is not enclosed with word chars on both ends
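As a rough illustration of how the two patterns could be used from JavaScript, here is a small sketch. The helper name captures and the second sample string are assumptions made up for this example; only the two regexes come from the answer.

```javascript
// Pattern 1: no escape-sequence support.
const plain = /(?!\b'\b)'([^']*(?:\b'\b[^']*)*)(?!\b'\b)'/g;
// Pattern 2: additionally skips any backslash-escaped character.
const withEscapes = /(?!\b'\b)'([^'\\]*(?:(?:\\.|\b'\b)[^'\\]*)*)(?!\b'\b)'/g;

// Hypothetical helper: collects Group 1 of every match.
function captures(re, text) {
  return Array.from(text.matchAll(re), (m) => m[1]);
}

const legal = "'builder's margin' means the percentage stated in Item 8 of Schedule 1;";
// String.raw keeps the backslashes, so the text really contains \' sequences.
const escaped = String.raw`'say \'hi\' now' end`;

console.log(captures(plain, legal));         // → ["builder's margin"]
console.log(captures(withEscapes, escaped)); // one capture: say \'hi\' now
```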

Upvotes: 1

anubhava

Reputation: 785246

Assuming an apostrophe inside the quoted text always has a word character right after it (and the closing quote does not), you can try this regex:

'(?:\\.|'(?=\w)|[^'])*'

RegEx Demo

RegEx Details:

  • ': Start opening '
  • (?:: Start non-capture group
    • \\.: Match \ and an escaped character
    • |: OR
    • '(?=\w): Match ' if it has a word character immediately after it
    • |: OR
    • [^']: Match any character that is not '
  • )*: End non-capture group. Repeat this group 0 or more times
  • ': Closing '
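Since this pattern has no capturing group, the match includes the surrounding quotes. One possible way to use it from JavaScript is sketched below; the helper name quotedSpans and the second sample string are assumptions for illustration, and the quotes are simply sliced off each match.

```javascript
// The pattern from above with the g flag, so matchAll can return every match.
const quotePattern = /'(?:\\.|'(?=\w)|[^'])*'/g;

// Hypothetical helper: returns each match without its opening and closing '.
function quotedSpans(text) {
  return Array.from(text.matchAll(quotePattern), (m) => m[0].slice(1, -1));
}

console.log(quotedSpans("'builder's margin' means the percentage stated in Item 8 of Schedule 1;"));
// → ["builder's margin"]

console.log(quotedSpans("this 'we'll handle' and 'quoted' too"));
// → ["we'll handle", "quoted"]
```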

Upvotes: 1
