Eleno

Reputation: 3016

Emacs Lisp: matching a repeated pattern in a compact manner?

Let's suppose I have an RGB string (format: #<2 hex digits><2 hex digits><2 hex digits>) like this:

"#00BBCC"

and I'd like to match and capture its <2 hex digits> elements in a more compact manner than by using the obvious:

"#\\([[:xdigit:]]\\{2\\}\\)\\([[:xdigit:]]\\{2\\}\\)\\([[:xdigit:]]\\{2\\}\\)"

I've tried:

"#\\([[:xdigit:]]\\{2\\}\\)\\{3\\}"

and:

"#\\(\\([[:xdigit:]]\\{2\\}\\)\\{3\\}\\)"

But the most either of them captured was a single <2 hex digits> element.

Any idea? Thank you.

Upvotes: 6

Views: 1278

Answers (3)

Sean

Reputation: 29790

Several years after my original response, Emacs has a much nicer way to do this: the rx pattern of the pcase macro, which binds each captured pair to a variable.

(defun match-hex-digits (str)
  (pcase str
    ((rx "#" (let r (= 2 xdigit)) (let g (= 2 xdigit)) (let b (= 2 xdigit)))
     (list r g b))))

Upvotes: 0

Thomas

Reputation: 17422

If you want to capture R,G,B in different subgroups, so that you can extract them using (match-string group), you need to have three different parentheses groups in your regexp at some point.

\(...\)\(...\)\(...\)

Otherwise, if you use a repeat pattern such as

\(...\)\{3\}

you have only one group, and after the match it will only contain the value of the last match. So, say, if you have something along the lines of

\([[:xdigit:]]\{2\}\)\{3\}

it will match a string like "A0B1C2", but (match-string 1) will only contain the contents of the last match, i.e. "C2", because the regexp defines only one group.
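To see this behaviour for yourself, here is a minimal sketch. The function name my-last-group-demo is made up for illustration; it only uses string-match and match-string.

```elisp
;; Hypothetical helper: returns the whole match and the contents of
;; group 1 for a regexp that repeats a single group three times.
(defun my-last-group-demo (str)
  "Return (WHOLE-MATCH GROUP-1) for STR, or nil if STR does not match."
  (when (string-match "#\\([[:xdigit:]]\\{2\\}\\)\\{3\\}" str)
    ;; Group 1 is overwritten on every repetition, so only the text
    ;; of the final repetition is left after the match.
    (list (match-string 0 str)
          (match-string 1 str))))

(my-last-group-demo "#A0B1C2")
```

The whole match covers "#A0B1C2", but group 1 holds only the last pair, "C2".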

Thus you basically have two options: use a compact regexp, such as your third one, and do some extra substring processing to extract the hex numbers, as Sean suggests; or use a more complex regexp, such as your first one, which lets you access the three sub-matches more conveniently.

If you're mostly worried about code readability, you could always do something like

(let ((hex2 "\\([[:xdigit:]]\\{2\\}\\)"))
  (concat "#" hex2 hex2 hex2))

to construct such a more complex regexp in a somewhat less redundant way, as per tripleee's suggestion.
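Put together, that approach might look like the sketch below. The name my-rgb-components is hypothetical, and the \` and \' anchors are my own addition (an assumption that the whole string should be the color and nothing else); drop them if you want to find a color inside a longer string.

```elisp
;; Hypothetical helper built on the concat idea above.
(defun my-rgb-components (str)
  "Return the three hex pairs of STR as a list of strings, or nil."
  (let* ((hex2 "\\([[:xdigit:]]\\{2\\}\\)")
         ;; Three separate groups, so each pair gets its own number.
         ;; The \\` and \\' anchors are an added assumption.
         (re (concat "\\`#" hex2 hex2 hex2 "\\'")))
    (when (string-match re str)
      (list (match-string 1 str)
            (match-string 2 str)
            (match-string 3 str)))))

(my-rgb-components "#00BBCC")
```

With the question's example string, this should give ("00" "BB" "CC").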

Upvotes: 3

Sean

Reputation: 29790

You can make the regexp shorter at the expense of some extra code:

(defun match-hex-digits (str)
  (when (string-match "#[[:xdigit:]]\\{6\\}" str)
    (list (substring (match-string 0 str) 1 3)
          (substring (match-string 0 str) 3 5)
          (substring (match-string 0 str) 5 7))))
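If you end up wanting the channel values as numbers rather than strings, the same single-match idea can be wrapped up as in the sketch below. This goes slightly beyond what was asked, and my-hex-color-to-rgb is a made-up name; it relies only on substring and string-to-number with base 16.

```elisp
;; Hypothetical variant of the function above that returns integers.
(defun my-hex-color-to-rgb (str)
  "Return the R, G and B values found in STR as a list of integers, or nil."
  (when (string-match "#[[:xdigit:]]\\{6\\}" str)
    (let ((hex (match-string 0 str)))
      ;; The pairs start at offsets 1, 3 and 5, right after the "#".
      (mapcar (lambda (start)
                (string-to-number (substring hex start (+ start 2)) 16))
              '(1 3 5)))))

(my-hex-color-to-rgb "#00BBCC")
```

For "#00BBCC" this should return (0 187 204).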

Upvotes: 6
