Reputation: 6231
Suppose I treat the £
character as dangerous, and I want to be able to protect and unprotect any string that contains it.
Example 1:
"Foobar £ foobar foobar foobar." # => dangerous string
"Foobar \£ foobar foobar foobar." # => protected string
Example 2:
"Foobar £ foobar £££££££foobar foobar." # => dangerous string
"Foobar \£ foobar \£\£\£\£\£\£\£foobar foobar." # => protected string
Example 3:
"Foobar \£ foobar \\£££££££foobar foobar." # => dangerous string
"Foobar \£ foobar \\\£\£\£\£\£\£\£foobar foobar." # => protected string
Is there an easy way, with Ruby, to escape (and unescape) a given character (such as £
in my example) from a string?
Edit: here is an explanation of the behavior behind this question.
First of all, thanks for your answers. I have a Rails app with a Tweet
model having a content
field. Example of tweet:
tweet = Tweet.create(content: "Hello @bob")
Inside the model, there's a serialization process that converts the string like this:
dump('Hello @bob') # => '["Hello £", 42]'
# ... where 42 is the id of bob username
Then, I'm able to deserialize and display the tweet like this:
load('["Hello £", 42]') # => 'Hello @bob'
In the same way, it's also possible to do so with more than one username:
dump('Hello @bob and @joe!') # => '["Hello £ and £!", 42, 185]'
load('["Hello £ and £!", 42, 185]') # => 'Hello @bob and @joe!'
That's the goal :)
But this find-and-replace could be hard to perform with something like:
tweet = Tweet.create(content: "£ Hello @bob")
because here we also have to escape the £
char. And I think your solution is good for this. So the result becomes:
dump('£ Hello @bob') # => '["\£ Hello £", 42]'
load('["\£ Hello £", 42]') # => '£ Hello @bob'
Just perfect. <3 <3
Now, if there is this:
tweet = Tweet.create(content: "\£ Hello @bob")
I think we first should escape every \
, and then escape every £
, like:
dump('\£ Hello @bob') # => '["\\£ Hello £", 42]'
load('["\\£ Hello £", 42]') # => '£ Hello @bob'
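The two-step idea above (escape every backslash first, then every £) can be sketched roughly as follows. The helper names `protect` and `unprotect` are hypothetical and are not part of the model's real `dump`/`load` code; this only illustrates the escaping layer, under the assumption that a backslash followed by any character stands for that character.

```ruby
# Hypothetical helpers for illustration only.
def protect(str)
  # Double every backslash first, then prefix every £ with a backslash.
  # Blocks are used so the replacement text is taken literally.
  str.gsub("\\") { "\\\\" }.gsub("£") { "\\£" }
end

def unprotect(str)
  # A backslash followed by any character stands for that character.
  str.gsub(/\\(.)/m) { $1 }
end

original  = "\\£ Hello £"        # the characters: \£ Hello £
protected = protect(original)    # the characters: \\\£ Hello \£
restored  = unprotect(protected) # same characters as original
```

With this scheme every £ left in the protected string is preceded by an odd number of backslashes, so a bare £ can safely be reserved as the username placeholder.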
However, how can we handle this case:
tweet = Tweet.create(content: "\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\£ Hello @bob")
...where tweet.content.gsub(/(?<!\\)(?=(?:\\\\)*£)/, "\\")
does not seem to work.
Upvotes: 1
Views: 1664
Reputation: 34385
If you are using Ruby 1.9, which has lookbehind, then FailedDev's answer should work quite well. If you are using Ruby 1.8, which does not have lookbehind (I think), a different approach may work. Give this a try:
text.gsub!(/(\\.)|£/m) do
  if ($1 != nil)   # If a backslash-escaped character matched,
    $1             # replace it with itself.
  else             # Otherwise escape the
    "\\£"          # unescaped £.
  end
end
Note that I am not a Ruby programmer and this snippet is untested (in particular I'm not sure if the: if ($1 != nil)
statement usage is correct - it may need to be: if ($1 != "")
or if ($1)
), but I do know that this general technique (using code in place of a simple replacement string) works. I recently used this same technique for my JavaScript solution to a similar question which was looking to find unescaped asterisks.
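A runnable sketch of the block-based technique described above, which needs no lookbehind. The method name `escape_pounds` is made up for illustration. The pattern consumes each backslash together with the character after it, so only a £ that is not already escaped reaches the second alternative.

```ruby
# Hypothetical helper for illustration only.
def escape_pounds(text)
  # Alternative 1 matches a backslash plus the next character (kept as-is).
  # Alternative 2 matches a bare £, which gets a backslash in front.
  text.gsub(/(\\.)|£/m) { $1 || "\\£" }
end

escape_pounds("a £ b")    # the characters: a \£ b
escape_pounds("a \\£ b")  # already escaped, returned unchanged
escape_pounds("\\\\£")    # two backslashes then £ becomes three backslashes then £
```

In Ruby, `$1` is `nil` when the first group did not take part in the match, so `$1 || "\\£"` covers both branches.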
Upvotes: 1
Reputation: 26930
Hopefully your version of Ruby supports lookbehinds. If it doesn't, my solution will not work for you.
Escape characters :
str = str.gsub(/(?<!\\)(?=(?:\\\\)*£)/, "\\")
Un-escape characters :
str = str.gsub(/(?<!\\)((?:\\\\)*)\\£/, "\\1£")
Both regexes work regardless of the number of backslashes. They complement each other.
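As a rough check of how the two patterns behave together, here is a small sketch that wraps them in methods. The constant and method names are invented for illustration, and blocks are used for the replacements to avoid any backslash interpretation in replacement strings. It assumes a Ruby version with lookbehind support (1.9 or later).

```ruby
# Hypothetical names for illustration only.
ESCAPE_RE   = /(?<!\\)(?=(?:\\\\)*£)/
UNESCAPE_RE = /(?<!\\)((?:\\\\)*)\\£/

def escape(str)
  # Zero-width match: a backslash is inserted at each matched position.
  str.gsub(ESCAPE_RE) { "\\" }
end

def unescape(str)
  # Keep the captured pairs of backslashes and drop the one before £.
  str.gsub(UNESCAPE_RE) { "#{$1}£" }
end

escape("Foobar £ foobar")     # the characters: Foobar \£ foobar
escape("£££")                 # the characters: \£\£\£
unescape("Foobar \\£ foobar") # the characters: Foobar £ foobar
```

Note that `escape` leaves a £ preceded by an odd number of backslashes untouched, because it treats that £ as already escaped.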
Escape explanation :
"
(?<! # Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind)
\\ # Match the character “\” literally
)
(?= # Assert that the regex below can be matched, starting at this position (positive lookahead)
(?: # Match the regular expression below
\\ # Match the character “\” literally
\\ # Match the character “\” literally
)* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
£ # Match the character “£” literally
)
"
Note that I am matching a position only. No text is consumed at all. Once I have pinpointed the position I want, I insert a \.
Explanation of unescape :
"
(?<! # Assert that it is impossible to match the regex below with the match ending at this position (negative lookbehind)
\\ # Match the character “\” literally
)
( # Match the regular expression below and capture its match into backreference number 1
(?: # Match the regular expression below
\\ # Match the character “\” literally
\\ # Match the character “\” literally
)* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
)
\\ # Match the character “\” literally
£ # Match the character “£” literally
"
Here I capture all the backslashes except the last one, and I replace the whole match with the captured backslashes followed by the special character. Tricky stuff :)
Upvotes: 2
Reputation: 143114
I'm not sure if this is what you want, but I think you can do a simple find-and-replace:
str = str.gsub("£", "\\£") # to escape
str = str.gsub("\\£", "£") # to unescape
Note that I changed \
to \\
because you have to escape the backslash in a double-quoted string.
Edit: I think what you want is a regex that matches an odd number of backslashes:
str = str.gsub(/(^|[^\\])((?:\\\\)*)\\£/, "\\1\\2£")
That performs the following transformations:
"£" #=> "£"
"\\£" #=> "£"
"\\\\£" #=> "\\\\£"
"\\\\\\£" #=> "\\\\£"
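The transformations listed above can be reproduced with a small sketch like this one. The method name `unescape_pound` is invented for illustration.

```ruby
# Hypothetical helper for illustration only.
def unescape_pound(str)
  # Group 1: start of line or one non-backslash character.
  # Group 2: zero or more pairs of backslashes, kept in the output.
  # The single backslash right before £ is dropped.
  str.gsub(/(^|[^\\])((?:\\\\)*)\\£/, "\\1\\2£")
end

unescape_pound("\\£")      # one backslash: removed
unescape_pound("\\\\£")    # two backslashes: unchanged
unescape_pound("\\\\\\£")  # three backslashes: two remain
```

One caveat with this pattern: because group 1 consumes a character, two escapes placed back to back (the characters `\£\£`) are only partly unescaped, since the first £ is already consumed and cannot serve as the leading character for the second match. A lookbehind such as `(?<!\\)` does not consume anything and avoids this.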
Upvotes: 0