Reputation: 9466
My ultimate goal is to write a JavaScript function that escapes all regex metacharacters for Erlang's regex engine, because I want to construct a Mango $regex query for CouchDB 2 from my HTML5 application using PouchDB and pouchdb-find. I want to search for a substring in a field of the objects in my database, without going to the trouble of setting up couchdb-lucene if I can avoid it.
While writing this escaping function, I found that Elixir already has one:
{:ok, pattern} = :re.compile(~S"[.^$*+?()\[\]{}\\\|\s#-]", [:unicode])
@escape_pattern pattern
@spec escape(String.t) :: String.t
def escape(string) when is_binary(string) do
  :re.replace(string, @escape_pattern, "\\\\&", [:global, {:return, :binary}])
end
I am trying to figure out how to translate this expression to JavaScript, and in the process I am trying to understand Elixir's and Erlang's regular expression syntax, which I understand to be based on PCRE.
Escaping the [ and ] characters makes enough sense, since they are inside a bracketed expression themselves. So does escaping \, since it is the escape character.
But why are \| and \s being escaped?
Upvotes: 0
Views: 1173
Reputation: 9466
As Lucas Trzesniewski and Dogbert have deduced in the comments, the backslash in \| is not needed: inside a bracketed expression, | is already literal, although it still has to appear in the class so that it gets escaped. \s is in the class because, when the regex has the x flag, any unescaped whitespace in the pattern is ignored. Escaping whitespace therefore produces a pattern that matches the same text whether or not the x flag is present:
{"a b" =~ ~r/a b/, "a b" =~ ~r/a b/x, "a b" =~ ~r/a\ b/x} #=> {true, false, true}
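The point about \| can be checked in JavaScript, whose character-class rules agree with PCRE on this detail. This is only an illustrative sketch; the variable names are made up for the example.

```javascript
// Inside a character class, | is an ordinary character, so [|] and [\|]
// match exactly the same thing.
const plain = /[|]/;
const escaped = /[\|]/;

console.log(plain.test('a|b'), escaped.test('a|b')); // true true
console.log(plain.test('ab'), escaped.test('ab'));   // false false

// Outside a class, an unescaped | means alternation, which is why an
// escaping function still has to emit \| for a literal pipe.
const alternation = /^a|b$/;
const literalPipe = /^a\|b$/;

console.log(alternation.test('a'));   // true: the "^a" alternative matches
console.log(literalPipe.test('a'));   // false: only the text "a|b" matches
console.log(literalPipe.test('a|b')); // true
```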
Here's the escaping function I ended up with:
function escapeRegex(string) {
  // | is included so that a literal pipe is not treated as alternation.
  return string.replace(/[.^$*+?()|\[\]{}\\\s#-]/g, '\\$&');
}
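As a rough sketch of how the escaped string could feed the substring search described in the question, the snippet below builds a Mango selector with $regex. It repeats an escaping function that includes | in the character class so the block is self-contained. The helper name substringQuery, the field name title, and the search term are hypothetical and only serve the illustration.

```javascript
// Escapes regex metacharacters (and whitespace) so the input is matched literally.
function escapeRegex(string) {
  return string.replace(/[.^$*+?()|\[\]{}\\\s#-]/g, '\\$&');
}

// Hypothetical helper: builds a find() query that matches documents whose
// `field` contains `term` as a literal substring.
function substringQuery(field, term) {
  return { selector: { [field]: { $regex: escapeRegex(term) } } };
}

const query = substringQuery('title', 'C++ (draft)');
console.log(JSON.stringify(query));

// With a PouchDB instance `db` (assumed here) and the pouchdb-find plugin loaded:
//   db.find(query).then(result => console.log(result.docs));
```

Because the pattern is unanchored, it matches the term anywhere in the field, which is what a substring search needs.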
Upvotes: 0