dragonfly02

Reputation: 3669

remove double single quotes but not single quote

How can I remove paired single quotes, but not a lone single quote (an apostrophe), from a string in Ruby? For instance, going from That's 'large' to That's large.

Upvotes: 0

Views: 203

Answers (3)

Gurmanjot Singh

Reputation: 10360

Try this regex:

\B'((?:(?!'\B)[\s\S])*)'

Replace each match with \1


Code(Result):

re = /\B'((?:(?!'\B)[\s\S])*)'/m
str = 'That\'s \'large\'
The 69\'ers\' drummer won\'t like this.
He said, \'it\'s clear this does not work\'. It does not fit the \'contractual obligations\''
subst = '\\1'

result = str.gsub(re, subst)

# Print the result of the substitution
puts result

Explanation:

  • \B - matches a non-word boundary
  • ((?:(?!'\B)[\s\S])*) - matches 0+ occurrences of any character [\s\S], as long as that character is not a ' followed by a non-word boundary. This is captured in Group 1.
  • ' - matches the closing '
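To see the three parts of the pattern at work, here is a minimal sketch that runs the same regex over the sample sentences one at a time. The constant and method names are made up for illustration and are not part of the answer.

```ruby
# Regex from the answer above; the /m flag is kept as in the original.
QUOTE_PAIR = /\B'((?:(?!'\B)[\s\S])*)'/m

# Hypothetical helper name, introduced only for this sketch.
def strip_quote_pairs(text)
  # '\1' puts back the captured text without the surrounding quotes.
  text.gsub(QUOTE_PAIR, '\1')
end

puts strip_quote_pairs("That's 'large'")
puts strip_quote_pairs("The 69'ers' drummer won't like this.")
puts strip_quote_pairs("He said, 'it's clear this does not work'.")
```

The apostrophes in That's, 69'ers' and won't all follow a word character, so the leading \B fails there and they can never open a match.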

Upvotes: 4

Schwern

Reputation: 164689

This is one of those quagmires, like parsing XML or HTML, that can't be done with a regex, but you can sorta pretend like it's mostly going to work. You can tweak it forever and not get it right.

You could look for balanced quotes, that is only quotes in pairs, but this doesn't help. Is That's 'large' to be stripped as Thats large' or That's large?

Instead you need to give it an understanding of English grammar and when a ' is an apostrophe versus a quote. Something simple that knows the basics of contractions and possessives. Contractions: don't, won't, I'll. Possessives: Joe's and s'. And maybe you can knock up a regex to skip those.

But it rapidly gets complicated. KO'd. Or what if you wish to indicate a particular pronunciation: fo'c's'le. Or someone's name O'Doole.

What you might be able to get away with is stripping a pair of quotes that starts at the beginning of a word and ends at the end of a word. It's clear he said, 'this isn't a contraction'. Matching the quote in front of this and the quote at the end of contraction is probably maybe safe.

# Use a negative lookbehind so the opening quote is not preceded by a
# word character, and a negative lookahead so the closing quote is not
# followed by one.
# Use a non-greedy match to catch multiple pairs of quotes.
re = /(?<!\w)'(.*?)'(?!\w)/
sentence = "That's 'large'"  # sample input
sentence.gsub(re, '\1')

This works in a lot of cases.

That's 'large' -> That's large
Eat at Joe's -> Eat at Joe's
I'll be Jane's -> I'll be Jane's
Jones' three cats' toys. -> Jones' three cats' toys.
It's clear he said, 'this isn't a contraction'. -> It's clear he said, this isn't a contraction.
'scare quotes' -> scare quotes
The 69'ers' drummer -> The 69'ers' drummer
Was She's success greater, or King Solomon's Mines's? -> Was She's success greater, or King Solomon's Mines's?
The 69'er's drummer and their 'contractual obligations'. -> The 69'er's drummer and their contractual obligations.
He said, 'it's clear this doesn't work'. -> He said, it's clear this doesn't work.

But not always.

His 'n' Hers's first track is called 'Joyriders'. -> His n Hers's first track is called Joyriders.
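If you want to check cases like these yourself, a small script along the following lines should do. It is only a sketch: the constant and method names are invented here, and the sentences are a handful of the ones listed above.

```ruby
# Lookaround regex from the answer above.
WORD_EDGE_QUOTES = /(?<!\w)'(.*?)'(?!\w)/

# Hypothetical helper name, used only in this sketch.
def strip_word_edge_quotes(sentence)
  sentence.gsub(WORD_EDGE_QUOTES, '\1')
end

samples = [
  "That's 'large'",
  "Jones' three cats' toys.",
  "It's clear he said, 'this isn't a contraction'.",
  "His 'n' Hers's first track is called 'Joyriders'."
]

# Print each sentence next to its stripped form.
samples.each do |sentence|
  puts "#{sentence} -> #{strip_word_edge_quotes(sentence)}"
end
```

The last sample is the failure case: the quotes around n sit at word edges just like real quotation marks, so the regex has no way to tell them apart.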

Like I said, this is one of those problems that looks simple but is extremely complicated, and you can never get it quite right. It can suck down a lot of time. I'd recommend ditching the requirement if possible.

Upvotes: 2

James Hibbard

Reputation: 17735

A slight variation: if the single quotes only occur around word characters (a-z, A-Z, 0-9, or the underscore _), you can use this:

phrase = "That's 'large' and not 'small', but it's still 'amazing'."
phrase.gsub(/'(\w*)'/, '\1')
=> "That's large and not small, but it's still amazing."

But as Schwern says, if you're trying to do anything other than a bit of simple text manipulation, you'll soon find yourself bogged down by edge cases.
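A quick sketch of where the \w* version stops working, using inputs borrowed from the other answers. The constant and method names are just placeholders for this example.

```ruby
# Pattern from the answer above: quotes wrapping only word characters.
WORD_ONLY_QUOTES = /'(\w*)'/

# Hypothetical helper name, used only in this sketch.
def strip_word_only_quotes(phrase)
  phrase.gsub(WORD_ONLY_QUOTES, '\1')
end

# A single quoted word is unwrapped.
puts strip_word_only_quotes("That's 'large'")
# Left alone: the space between the words is not a word character.
puts strip_word_only_quotes("'scare quotes'")
# Unwrapped as well, even though these quotes belong to the name.
puts strip_word_only_quotes("His 'n' Hers")
```

So the simpler pattern is safe around apostrophes inside words, but it skips any quoted phrase of more than one word.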

Upvotes: 0
