Reputation: 3685
I'm looking for a regex to match:
ciao: c'iao 'ciao'
with:
ciao #every word excluding non-word character
c'iao #including apostrophes
ciao #excluding the quotes ''
So far I've been able to match the first 2 requirements with:
/[\w']+/
but I'm struggling to extract a word between single quotes (without including the quotes). Note that I won't have a case where a word with an apostrophe is enclosed in quotes (like 'c'iao').
I've seen many similar Q&A but couldn't find any suiting my needs; Extra points for an answer that includes a brief explanation :)
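For reference, here is a minimal sketch of what the pattern above returns on the sample string. The variable names `input` and `current` are only illustrative. It shows the remaining problem: the quoted word keeps its quotes.

```ruby
input = "ciao: c'iao 'ciao'"

# The character class treats every apostrophe as part of a word,
# so the surrounding quotes end up inside the last match.
current = input.scan(/[\w']+/)
p current
# => ["ciao", "c'iao", "'ciao'"]   (wanted: ["ciao", "c'iao", "ciao"])
```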
Upvotes: 3
Views: 1717
Reputation: 110685
Considering that words can begin or end with an apostrophe, or contain multiple apostrophes, I suggest first splitting on whitespace, then removing pairs of single quotes that enclose words.
str = "'Twas because Bo didn't like Bess' or y'all's 'attitude'"
str.split.map { |s| s =~ /\A'.+'\z/ ? s[1..-2] : s }
#=> ["'Twas", "because", "Bo", "didn't", "like", "Bess'", "or", "y'all's", "attitude"]
The first step produces
arr = str.split
#=> ["'Twas", "because", "Bo", "didn't", "like", "Bess'", "or", "y'all's", "'attitude'"]
The regex matches elements of arr that begin and end with a single quote.
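As a rough sketch, the two steps can be wrapped in a small helper and run on the string from the question. The method name `strip_enclosing_quotes` is made up for illustration. Note that because the split is on whitespace only, punctuation such as the colon in "ciao:" stays attached to its word.

```ruby
# Hypothetical helper: split on whitespace, then drop one pair of
# enclosing single quotes from any token that starts and ends with one.
def strip_enclosing_quotes(str)
  str.split.map { |s| s =~ /\A'.+'\z/ ? s[1..-2] : s }
end

result = strip_enclosing_quotes("ciao: c'iao 'ciao'")
p result
# => ["ciao:", "c'iao", "ciao"]
```

If trailing punctuation should go as well, it would need an extra cleanup step; that is outside what this approach does on its own.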
Upvotes: 0
Reputation: 626851
You can use the following expression:
/\w+(?:'\w+)*/
See the Rubular demo
The expression matches:
- \w+ - 1 or more word chars
- (?:'\w+)* - zero or more sequences ((?:...) is a non-capturing group that groups a sequence of subpatterns, here quantified with the * quantifier, which matches 0 or more occurrences) of:
  - ' - an apostrophe
  - \w+ - 1 or more word chars.

See a short Ruby demo here:
"ciao: c'iao 'ciao'".scan(/\w+(?:'\w+)*/)
# => ["ciao", "c'iao", "ciao"]
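Since every match has to start and end with a word char, an apostrophe is only kept when it sits between word chars. As a quick sketch of what that means in practice, here is the same pattern on the sentence from the other answer (the names `word_re`, `sentence` and `words` are only illustrative). Leading and trailing apostrophes, as in 'Twas and Bess', are dropped along with the enclosing quotes.

```ruby
word_re  = /\w+(?:'\w+)*/
sentence = "'Twas because Bo didn't like Bess' or y'all's 'attitude'"

# Inner apostrophes (didn't, y'all's) survive; apostrophes at the
# edge of a word are never part of a match.
words = sentence.scan(word_re)
p words
# => ["Twas", "because", "Bo", "didn't", "like", "Bess", "or", "y'all's", "attitude"]
```

That is fine for the input in the question, but worth keeping in mind if words like 'Twas need to keep their apostrophe.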
Upvotes: 5