Reputation: 36522
I want to replace every single non-whitespace character that stands alone in a string with a space.
I have tried this:
string = 'This is a test string'
string.gsub(/(\W|\A).(\W|\z)/, ' ')
=> "This is test string"
This works great, but if there are two consecutive single characters, it only replaces the first.
string = 'This is a x test string'
string.gsub(/(\W|\A).(\W|\z)/, ' ')
=> "This is x test string"
I am not sure which regex principle I am missing here that I need to make this work. Any ideas?
Upvotes: 1
Views: 1485
Reputation: 13921
And here is a non-regexp version:
string = 'This is x a test string'
single_character = -> x { x.size == 1 }
p string.split(' ').reject(&single_character).join(' ') #=> "This is test string"
Upvotes: 1
Reputation: 8332
If I understand you correctly, you want to remove single instances of non-whitespace. Try replacing
\s\S(?!\S)|(?<!\S)\S\s
with nothing (the empty string "").
See an example at regex101.
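As a minimal Ruby sketch of this suggestion (the variable names are illustrative, not from the question), the pattern can be passed to gsub with an empty replacement. Each match consumes only one neighbouring whitespace character, so consecutive single characters are all removed:

```ruby
# Pattern from the answer above: a lone non-whitespace character plus ONE
# adjacent whitespace character (the one before it, or else the one after it).
lone_char = /\s\S(?!\S)|(?<!\S)\S\s/

string = 'This is a x test string'
cleaned = string.gsub(lone_char, '')
puts cleaned # prints: This is test string
```

Note that this removes the characters rather than replacing them with a space, so no extra whitespace is left behind.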
Upvotes: 0
Reputation:
The regex principle in use here is the word boundary.
Try \b[A-Za-z]\b
Regex101 Demo
This will work most of the time, except when a letter sits next to a non-word character. For example, in a@ the a is treated as a single character because there is a word boundary between a and @, like this: a|@.
In that case you can use look-around assertions instead, which require whitespace on both sides of the letter for it to qualify as a single character.
Regex: (?<=\s)[A-Za-z](?=\s)
Regex101 Demo
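To make the difference concrete, here is a small Ruby sketch comparing the two patterns on a made-up sample string (the string and variable names are assumptions for illustration):

```ruby
sample = 'mail a@ to b now'

# \b sees a boundary between "a" and "@", so the "a" in "a@" counts as a match.
with_boundary = sample.scan(/\b[A-Za-z]\b/)

# The lookarounds demand whitespace on both sides, so only "b" qualifies.
with_lookaround = sample.scan(/(?<=\s)[A-Za-z](?=\s)/)

p with_boundary   # prints ["a", "b"]
p with_lookaround # prints ["b"]
```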
Update #1:
For any non-whitespace character, use \S or [^\s] in the search pattern.
The regex will be (?<=\s)[^\s](?=\s)
or (?<=\s)\S(?=\s)
Regex101 Demo
Update #2:
To also match at the beginning or at the end of the string, add ^ and $ to the lookaround assertions.
Regex: (?<=^|\s)[^\s](?=\s|$)
Regex101 Demo
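Note: Use \A and \z instead of ^ and $ if the latter don't work. In Ruby, ^ and $ match at the start and end of each line, while \A and \z match only at the start and end of the whole string.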
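Applied to the question's Ruby code, this might look like the sketch below. The helper name blank_single_chars is hypothetical, and the final squeeze step is an added assumption: replacing each lone character with a space leaves runs of spaces that the question's expected output does not show. Because the lookarounds consume no characters, the space between two consecutive single characters can serve both matches.

```ruby
# Hypothetical helper (not from the answer): blank out every lone
# non-whitespace character, then collapse the resulting runs of spaces.
def blank_single_chars(text)
  text.gsub(/(?<=\A|\s)\S(?=\s|\z)/, ' ').squeeze(' ')
end

puts blank_single_chars('This is a x test string') # prints: This is test string
```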
Upvotes: 5
Reputation: 8769
You can use the word boundary \b like this:
string = 'This is a x y z test string'
string.gsub(/\b\w\b/, ' ').gsub(/\s{2,}/, ' ')
=> "This is test string"
Other characters can be included with a character class like this: [\w\-], or you can match any non-space character like this: (?<=\s)\S(?=\s)
Upvotes: 0
Reputation: 2267
You can use a positive lookbehind (or lookahead). Then the space before (or after, with a lookahead) won't be included in the match, so it stays available for the next single character, and you replace with the empty string.
string = 'This is a x test string'
string.gsub(/(?<=\W|\A).(\W|\z)/, '')
=> "This is test string"
I'd restrict the character matched in between to a \w, and maybe move to Unicode-aware character classes.
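One possible reading of that suggestion, sketched in Ruby: \w only covers ASCII letters, digits and underscore, so a lone accented letter is not matched, whereas the Unicode property \p{L} (any letter) and its negation \P{L} do handle it. The sample string and variable names are assumptions for illustration.

```ruby
# "\u00e9" is the letter e with an acute accent; \w does not match it in Ruby,
# but the Unicode property \p{L} (any letter) does. \P{L} is its negation.
string = "tu as \u00e9 un test"

ascii_only    = string.gsub(/(?<=\W|\A)\w(\W|\z)/, '')
unicode_aware = string.gsub(/(?<=\P{L}|\A)\p{L}(\P{L}|\z)/, '')

puts ascii_only == string # prints: true (the lone accented letter is left in place)
puts unicode_aware        # prints: tu as un test
```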
Upvotes: 0