Reputation: 5882
I have the following regular expression from Amazon Web Services (AWS), which the Instance Name is required to match:
^([\p{L}\p{Z}\p{N}_.:/=+-@]*)$
However, I am unsure of an efficient way to find the characters that this pattern does not allow and replace each of them with a single space character.
For example, the string "Hello (World)" should become "Hello  World " (each parenthesis replaced with a space). Parentheses are just one of many characters that the pattern does not allow.
The only way I've been able to do this is by using the following code:
first_test_string.split('').each do |char|
  if char[/^([\p{L}\p{Z}\p{N}_.:\/=+-@]*)$/] == nil
    second_test_string = second_test_string.gsub(char, " ")
  end
end
When using this code, I get the following result:
irb(main):037:0> first_test_string = "Hello (World)"
=> "Hello (World)"
irb(main):038:0> second_test_string = first_test_string
=> "Hello (World)"
irb(main):039:0>
irb(main):040:0> first_test_string.split('').each do |char|
irb(main):041:1* if char[/^([\p{L}\p{Z}\p{N}_.:\/=+-@]*)$/] == nil
irb(main):042:2> second_test_string = second_test_string.gsub(char, " ")
irb(main):043:2> end
irb(main):044:1> end
=> ["H", "e", "l", "l", "o", " ", "(", "W", "o", "r", "l", "d", ")"]
irb(main):045:0> first_test_string
=> "Hello (World)"
irb(main):046:0> second_test_string
=> "Hello  World "
irb(main):047:0>
Is there another way to do this, one that is less hacky? I was hoping for a solution where I could provide a single regex and simply replace every character that it does not match.
Upvotes: 0
Views: 28
Reputation: 165198
Use String#gsub and negate the character class of acceptable characters with [^...].
2.6.5 :014 > "Hello (World)".gsub(%r{[^\p{L}\p{Z}\p{N}_.:/=+\-@]}, " ")
=> "Hello  World "
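Note that I've also escaped the -, because inside a character class [+-@] is interpreted as the range of characters from + to @ rather than the three literal characters. For example, a comma lies between + and @, so the unescaped version treats it as valid.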
2.6.5 :004 > "Hello, World".gsub(%r{[^\p{L}\p{Z}\p{N}_.:/=+-@]+}, " ")
=> "Hello, World"
2.6.5 :005 > "Hello, World".gsub(%r{[^\p{L}\p{Z}\p{N}_.:/=+\-@]+}, " ")
=> "Hello  World"
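To see exactly which characters the unescaped range lets through, one option is to compare the two character classes over printable ASCII. This is only a quick sketch: the constant names LOOSE and STRICT are made up for illustration, and it assumes Ruby 2.4 or later for String#match?.

```ruby
# LOOSE keeps the unescaped "+-@", which Ruby reads as a range (0x2B through 0x40).
# STRICT escapes the hyphen so that +, - and @ are three literal characters.
LOOSE  = %r{[\p{L}\p{Z}\p{N}_.:/=+-@]}
STRICT = %r{[\p{L}\p{Z}\p{N}_.:/=+\-@]}

# Every printable ASCII character, from space through tilde.
printable_ascii = (0x20..0x7E).map(&:chr)

# Characters accepted only because of the accidental range.
extra = printable_ascii.select { |c| c.match?(LOOSE) && !c.match?(STRICT) }

puts extra.join  # prints ,;<>?
```

The other members of the range (digits, ".", "/", ":" and "=") are already allowed elsewhere in the class, so only these five characters change behaviour.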
Add a + after the character class if you want each run of consecutive invalid characters to be replaced with a single space.
2.6.5 :024 > "((Hello~(World)))".gsub(%r{[^\p{L}\p{Z}\p{N}_.:/=+\-@]}, " ")
=> "  Hello  World   "
2.6.5 :025 > "((Hello~(World)))".gsub(%r{[^\p{L}\p{Z}\p{N}_.:/=+\-@]+}, " ")
=> " Hello World "
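If the leftover leading, trailing, or doubled spaces are unwanted, the substitution could be wrapped in a small helper. This is a sketch rather than anything AWS prescribes: the names sanitize_instance_name and INVALID_NAME_CHARS are hypothetical, and the squeeze and strip steps are an extra cleanup choice that the original question did not ask for.

```ruby
# Hypothetical constant: one or more characters outside the allowed set.
INVALID_NAME_CHARS = %r{[^\p{L}\p{Z}\p{N}_.:/=+\-@]+}

# Hypothetical helper. Replaces each run of invalid characters with one space,
# then collapses adjacent spaces and trims whitespace from both ends.
def sanitize_instance_name(name)
  name.gsub(INVALID_NAME_CHARS, " ").squeeze(" ").strip
end

puts sanitize_instance_name("Hello (World)")      # prints Hello World
puts sanitize_instance_name("((Hello~(World)))")  # prints Hello World
```

Drop the squeeze and strip calls if the output must keep the same spacing as the input.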
Upvotes: 1