Reputation: 153531
For example, given the string abcdefg. *, how can I create a regexp such as [abcdefg\. *] that matches any single character from the string? The problem is that the string may contain special characters such as . (a period).
Upvotes: 2
Views: 414
Reputation:
A simple and robust solution is to use the built-in regexp-opt function, which takes a list of fixed strings and returns an efficient regexp that matches any one of them. All you then need to do is split your original string into one-character strings:
(regexp-opt
 (mapcar #'char-to-string
         (string-to-list "abcdefg. *"))) ; => "[ *.a-g]"
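As a rough sketch, the snippet above could be wrapped in a small helper and tried out with string-match-p. The name my-charset-regexp is made up for this illustration and is not part of Emacs:

```elisp
;; Hypothetical helper; `my-charset-regexp' is an illustrative name only.
(defun my-charset-regexp (string)
  "Return a regexp matching any single character that occurs in STRING."
  (regexp-opt (mapcar #'char-to-string (string-to-list string))))

;; `string-match-p' returns the index of the first match, or nil.
(string-match-p (my-charset-regexp "abcdefg. *") "xyz.") ; => 3 (the ".")
(string-match-p (my-charset-regexp "abcdefg. *") "xyz")  ; => nil
```

Because regexp-opt builds the character set itself, the caller never has to think about where ], ^ or - end up inside the brackets.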
Upvotes: 6
Reputation:
(require 'cl-lib)

(defun partition (string test &rest more-tests)
  "Split the characters of STRING into buckets, returned as a list of strings.
Bucket 0 holds characters that satisfy no test; bucket N holds characters
whose first satisfied test is the Nth of TEST and MORE-TESTS."
  (cl-loop with hash = (make-hash-table)
           for c across string do
           (cl-loop for f in (cons test more-tests)
                    for i from 1 do
                    (when (funcall f c)
                      (setf (gethash i hash) (cons c (gethash i hash)))
                      (cl-return))
                    finally (setf (gethash 0 hash) (cons c (gethash 0 hash))))
           ;; Return every bucket in order, even empty ones, so that the
           ;; caller can destructure the result reliably.
           finally (cl-return
                    (cl-loop for i from 0 to (1+ (length more-tests))
                             collect (cl-coerce (gethash i hash) 'string)))))

(defun regexp-quote-charclass (input)
  (cl-destructuring-bind (safe dangerous)
      (partition input (lambda (x) (member x '(?\\ ?\] ?^ ?- ?:))))
    (concat "[" (cl-remove-duplicates safe)
            (let ((dangerous (cl-coerce (cl-remove-duplicates dangerous) 'list))
                  (printed safe))
              (with-output-to-string
                (when (member ?\\ dangerous)
                  (setf printed t)
                  (princ "\\\\"))
                (when (member ?: dangerous)
                  (setf printed t)
                  (princ "\\:"))
                (when (member ?\] dangerous)
                  (setf printed t)
                  (princ "\\]"))
                (when (member ?^ dangerous)
                  (if printed (princ "^") (princ "\\^")))
                (when (member ?\- dangerous) (princ "-"))))
            "]")))
This seems like it would do the job. Also, to the best of my knowledge, you don't need to escape characters that are special only outside a character class, such as ?[ or ?$. However, I've added ?: because in a very rare case it could be confused with things like [:alpha:] (you cannot obtain this exact string through this function, but I'm not sure how Emacs will parse the [: combination, so I included it just to be sure).
Upvotes: 1
Reputation: 781769
Use the regexp-quote function.
(setq regexp (concat "[" (regexp-quote string) "]"))
Note that most regexp special characters have no special meaning inside square brackets, so they don't need to be quoted there (any backslashes that regexp-quote adds are simply treated as members of the set). Here is the Emacs documentation on including certain special characters inside a character set:
Note that the usual regexp special characters are not special inside a character set. A completely different set of special characters exists inside character sets: ']', '-' and '^'.
To include a ']' in a character set, you must make it the first character. For example, '[]a]' matches ']' or 'a'. To include a '-', write '-' as the first or last character of the set, or put it after a range. Thus, '[]-]' matches both ']' and '-'.
To include '^' in a set, put it anywhere but at the beginning of the set. (At the beginning, it complements the set--see below.)
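The placement rules quoted above could be sketched roughly as follows. This is only an illustration, not a built-in: my-charset-from-string is a hypothetical name, the handling of a lone ^ (falling back to a quoted \^) is my own assumption, and regexp-opt already takes care of all of this for you.

```elisp
(require 'cl-lib)

;; Hypothetical helper illustrating the placement rules; not an Emacs built-in.
(defun my-charset-from-string (string)
  "Return a regexp matching any single character that occurs in STRING."
  (let* ((chars (delete-dups (string-to-list string)))
         (plain (cl-remove-if (lambda (c) (memq c '(?\] ?^ ?-))) chars))
         ;; `]' must be the first character of the set.
         (head (concat (if (memq ?\] chars) "]" "") plain))
         (caret (memq ?^ chars))
         (dash (memq ?- chars)))
    (cond
     ((null chars) (error "Cannot build a character set from an empty string"))
     ;; A lone `^' cannot be written as a set, so quote it instead (assumption).
     ((and caret (string= head "") (not dash)) "\\^")
     ;; Only `^' and `-': put `-' first so that `^' is not at the beginning.
     ((and caret (string= head "")) "[-^]")
     ;; General case: `^' anywhere but first, `-' last.
     (t (concat "[" head (if caret "^" "") (if dash "-" "") "]")))))

(my-charset-from-string "abcdefg. *") ; => "[abcdefg. *]"
(my-charset-from-string "a]b^-")      ; => "[]ab^-]"
```

Since a backslash is not an escape character inside a character set, position is the only quoting mechanism available here, which is why the sketch reorders characters instead of escaping them.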
Upvotes: 4