RNA

Reputation: 153531

How can I programmatically create a regexp that matches any single character of a given string in Elisp?

For example, given the string abcdefg. *, how can I create a regexp such as [abcdefg\. *] that matches each character in the string? The problem is that the string may contain special characters such as the dot.

Upvotes: 2

Views: 414

Answers (3)

user725091

A simple and robust solution is to use the built-in regexp-opt function, which takes a list of fixed strings and returns an efficient regex to match any one of them. Then all you need to do is split your original string into one-character segments:

(regexp-opt
 (mapcar #'char-to-string
         (string-to-list "abcdefg. *"))) ; => "[ *.a-g]"
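
To try the result against actual text, the same construction can be wrapped in a small helper. This is only a sketch: the name my-char-set-regexp is made up for illustration and is not part of Emacs, and the input string is assumed to be non-empty.

```elisp
(require 'regexp-opt)

(defun my-char-set-regexp (string)
  "Return a regexp matching any single character of STRING.
Hypothetical helper; STRING is assumed to be non-empty."
  (regexp-opt (mapcar #'char-to-string (string-to-list string))))

(let ((re (my-char-set-regexp "abcdefg. *")))
  (list (string-match-p re "x.y")    ; the literal dot is found at index 1
        (string-match-p re "xyz")))  ; no character of the set occurs
;; => (1 nil)
```

Because regexp-opt does the quoting itself, the dot and the asterisk are matched literally without any manual escaping.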

Upvotes: 6

user797257

(require 'cl-lib)

(defun partition (string test &rest more-tests)
  "Split STRING into one string per test, preceded by the unmatched rest.
Each character goes to the first of TEST and MORE-TESTS it satisfies."
  (cl-loop with tests = (cons test more-tests)
           with buckets = (make-vector (1+ (length tests)) nil)
           for c across string
           for i = (cl-position-if (lambda (f) (funcall f c)) tests)
           do (push c (aref buckets (if i (1+ i) 0)))
           finally return (mapcar (lambda (chars) (concat (reverse chars)))
                                  buckets)))

(defun regexp-quote-charclass (input)
  "Return a regexp matching any single character of non-empty INPUT.
Backslash is not special inside a character set, so `]', `^' and `-'
are made literal by position: `]' first, `^' not first, `-' last."
  (cl-destructuring-bind (safe dangerous)
      (partition input (lambda (c) (memq c '(?\] ?^ ?-))))
    (let ((head (concat (and (cl-find ?\] dangerous) "]")
                        (cl-remove-duplicates safe)))
          (caret (cl-find ?^ dangerous))
          (dash (cl-find ?- dangerous)))
      (cond ((not (string= head ""))
             (concat "[" head (and caret "^") (and dash "-") "]"))
            ((and caret dash) "[-^]")
            (caret "\\^")               ; a lone `^' cannot go in a set
            (dash "[-]")
            (t (error "INPUT must not be empty"))))))

This seems to do the job. To the best of my knowledge, characters that are special only outside a character set, such as ?[ or ?$, need no treatment. Backslash is not special inside a set either, so ?\], ?^ and ?- are made literal by their position rather than by escaping. I also considered ?:, since a sequence like [:alpha:] names a character class inside a set, but that exact string cannot come out of this function: it needs two colons, and duplicates are removed.

Upvotes: 1

Barmar

Reputation: 781769

Use the regexp-quote function.

(setq regexp (concat "[" (regexp-quote string) "]"))

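Note that most regexp characters have no special meaning inside square brackets, so they do not need to be quoted there. In fact, the backslashes that regexp-quote inserts are themselves literal inside a set, so the resulting set also matches a backslash. Here is the Emacs documentation on including certain special characters inside a character set: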

Note that the usual regexp special characters are not special inside a character set. A completely different set of special characters exists inside character sets: ']', '-' and '^'.

To include a ']' in a character set, you must make it the first character. For example, '[]a]' matches ']' or 'a'. To include a '-', write '-' as the first or last character of the set, or put it after a range. Thus, '[]-]' matches both ']' and '-'.

To include '^' in a set, put it anywhere but at the beginning of the set. (At the beginning, it complements the set--see below.)
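
The placement rules quoted above can be checked directly with string-match-p. The following is a small sketch: the first two sets are the examples from the quoted text, while the others are illustrative sets chosen here, not taken from the manual.

```elisp
;; `]' placed first is literal: the set matches `]' or `a'.
(string-match-p "[]a]" "x]")    ; => 1
;; `]' first and `-' last: both are literal.
(string-match-p "[]-]" "ab-")   ; => 2
;; `^' anywhere but first is literal.
(string-match-p "[a^]" "x^")    ; => 1
;; `^' first complements the set: the first non-`a' character matches.
(string-match-p "[^a]" "aab")   ; => 2
;; Backslash is not special inside a set, so it is matched literally.
(string-match-p "[\\.]" "a\\b") ; => 1
```

The last form shows why escaping with a backslash does not help inside a set: the backslash simply becomes one more member.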

Upvotes: 4
