RandomBits

Reputation: 4292

Using syntax-table information in elisp for tokenization

I would like to use elisp to tokenize the following:

variable := "The symbol \" delimits strings"; (* Comments go here *)

as:

<variable> <:=> <The symbol \" delimits strings> <;>

based on the information from the buffer's syntax-table.

I have the syntax table set up appropriately and am currently using the following function, which works correctly except for string constants. (It returns either the token or nil if point is not at an identifier or one of the operators in the regex.)

(defun forward-token ()
  (forward-comment (point-max))
  (cond
   ((looking-at  (regexp-opt '("=" ":=" "," ";")))
    (goto-char (match-end 0))
    (match-string-no-properties 0))
   (t (buffer-substring-no-properties
       (point)
       (progn (skip-syntax-forward "w_")
              (point))))))

I am an elisp novice, so any pointers are appreciated.

Upvotes: 1

Views: 237

Answers (1)

sds

Reputation: 60054

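I don't think your use of skip-syntax-forward is correct for strings: with "w_" it only moves over word and symbol constituents, so it does not advance past the opening quote of a string. I think you need to add a cond clause like this, placed before the final t clause: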

((looking-at "\"")
 (let* ((here (point)) (there (scan-sexps here 1)))
   (goto-char there)
   (buffer-substring-no-properties
    (1+ here) (1- there))))

to handle string literals.
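
For reference, here is a minimal sketch of how the combined function might look once that clause is in place. The syntax table in the question is not shown, so my-demo-syntax-table below is an assumption (Pascal-style (* ... *) comments, " as string delimiter, \ as escape), and the names my-forward-token and my-demo-tokens are hypothetical, chosen only for this example.

```elisp
;; Assumed syntax table: the asker's real table is not shown in the question.
(defvar my-demo-syntax-table
  (let ((st (make-syntax-table)))
    (modify-syntax-entry ?\" "\"" st)    ; " delimits strings
    (modify-syntax-entry ?\\ "\\" st)    ; \ escapes the next character
    (modify-syntax-entry ?\( "()1" st)   ; first char of the (* opener
    (modify-syntax-entry ?\) ")(4" st)   ; second char of the *) closer
    (modify-syntax-entry ?* ". 23" st)   ; second char of (*, first of *)
    st)
  "Hypothetical syntax table used only for this example.")

(defun my-forward-token ()
  "Skip whitespace and comments, then return the next token as a string.
Return the empty string if point is not at an operator, string or identifier."
  (forward-comment (point-max))
  (cond
   ;; Operators and separators.
   ((looking-at (regexp-opt '("=" ":=" "," ";")))
    (goto-char (match-end 0))
    (match-string-no-properties 0))
   ;; String literal: let scan-sexps find the closing delimiter,
   ;; then return the contents without the surrounding quotes.
   ((looking-at "\"")
    (let* ((here (point))
           (there (scan-sexps here 1)))
      (goto-char there)
      (buffer-substring-no-properties (1+ here) (1- there))))
   ;; Identifier: word and symbol constituents.
   (t (buffer-substring-no-properties
       (point)
       (progn (skip-syntax-forward "w_")
              (point))))))

(defun my-demo-tokens (text)
  "Tokenize TEXT in a temporary buffer and return the list of tokens."
  (with-temp-buffer
    (set-syntax-table my-demo-syntax-table)
    (insert text)
    (goto-char (point-min))
    (let (tokens token)
      ;; Stop at the first empty token (end of buffer or unknown character).
      (while (progn (setq token (my-forward-token))
                    (not (string= token "")))
        (push token tokens))
      (nreverse tokens))))
```

Under those assumptions, calling my-demo-tokens on the line from the question should yield the four tokens variable, :=, the string contents (with the backslash escape left in place), and ;. Note that scan-sexps signals an error on an unterminated string, so real code may want to wrap that clause in condition-case.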

Upvotes: 1
