Reputation: 33365
Lisp s-expressions are a concise and flexible way to represent code as an abstract syntax tree. Relative to the more specialized data structures used by compilers for other languages, however, they have one drawback: it is difficult to keep track of the file and line number corresponding to any particular point in the code. At least some Lisps end up just punting on the problem; in the event of an error, they report source location only as far as the function name, not the file and line number.
Some dialects of Scheme have solved the problem by representing code not with ordinary cons cells, but with syntax objects, which are isomorphic to cons cells but can also carry additional information such as source location.
Has any implementation of Common Lisp solved this problem? If so, how?
Upvotes: 5
Views: 376
Reputation: 139251
The Common Lisp standard says very little about these things. It mentions, for example, that the function ed may take a function name and then open the editor on the corresponding source code. But no mechanism is specified, and this feature is provided entirely by the development environment, possibly in combination with the Lisp system.
A typical way to deal with that is to compile a file and have the compiler record the source location of each object defined (a function, a variable, a class, ...). The source location could, for example, be placed on the property list of the symbol (the name of the thing defined), or recorded in some other place. The actual source code, as a list structure, can also be associated with a Lisp symbol. See the function FUNCTION-LAMBDA-EXPRESSION.
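As a minimal sketch of the property-list approach described above, a wrapper around defun can store the defining form and the file being compiled or loaded under a symbol property. The macro name defun-recording and the property name source-location are hypothetical, chosen only for illustration; real implementations use their own internal mechanisms.

```lisp
;; Hypothetical helper: DEFUN-RECORDING and the SOURCE-LOCATION property
;; are illustrative names, not part of the standard or of any implementation.
(defmacro defun-recording (name lambda-list &body body)
  "Define NAME like DEFUN and remember where and how it was defined."
  ;; The file is captured at macroexpansion time, so under COMPILE-FILE it is
  ;; the source file rather than the compiled file that is loaded later.
  (let ((file (or *compile-file-truename* *load-truename*))
        (form `(defun ,name ,lambda-list ,@body)))
    `(progn
       (setf (get ',name 'source-location)
             (list :file ,(and file (namestring file))
                   :form ',form))
       ,form)))

(defun-recording add-one (x) (1+ x))

;; Retrieve the recorded information from the symbol's property list.
(get 'add-one 'source-location)
```

When the definition is typed at a listener, both truename variables are NIL, so :file is NIL and only the form is recorded. The standard function FUNCTION-LAMBDA-EXPRESSION can be tried as well, but an implementation is allowed to return NIL for it.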
Some implementations do more sophisticated source location recording. For example, LispWorks can locate the specific part of a function that is currently executing. It also notes when a definition comes from an editor or a Listener. See Dspecs: Tools for Handling Definitions. The debugger can then, for example, find where the code of a certain stack frame is located in the source.
SBCL also has a feature to locate source code.
Notice also that the actual 'source code' in Common Lisp is not always text in a file, but the s-expression that was read. eval and compile, two standard functions, don't take strings or filenames as arguments. They use the actual expressions:
CL-USER 26 > (compile 'foo (lambda (x) (1+ x)))
FOO
NIL
NIL
CL-USER 27 > (foo 41)
42
S-expressions as code are not bound to any particular textual formatting. They can be reformatted by the pretty printer function pprint, which may take the available width into account to generate a layout.
So noting the structure may be useful, while recording source lines would be less useful.
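To illustrate why line numbers are not a stable property of an s-expression, the sketch below prints the same form under two different right margins. The helper name pretty-string and the sample form are made up for this example; pprint, *print-pretty* and *print-right-margin* are standard.

```lisp
;; A sample form; any list structure would do.
(defparameter *form*
  '(defun example (x y)
     (if (> x y) (list x y (+ x y)) (list y x (- y x)))))

;; Hypothetical helper for this example.
(defun pretty-string (form margin)
  "Return FORM pretty-printed with *PRINT-RIGHT-MARGIN* bound to MARGIN."
  (let ((*print-pretty* t)
        (*print-right-margin* margin))
    (with-output-to-string (out)
      (pprint form out))))

;; The same form, laid out for a wide and for a narrow window.
(pretty-string *form* 80)
(pretty-string *form* 30)
```

The exact line breaks depend on the implementation's pretty printer, but the narrow layout typically spans more lines than the wide one, while reading either string back yields a form that is EQUAL to the original.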
Upvotes: 3
Reputation: 38789
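My understanding is that whatever data Scheme stores in the AST is data that can be associated with expressions in a CL environment. For example, a reader in the Scheme style could return each list wrapped together with its start and end positions: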
(defun my-simple-scheme-reader (stream)
  "Read one token from STREAM: a digit, :SPACE, :CLOSING-PAREN, or a list
annotated with its start and end positions."
  (let ((char (read-char stream)))
    (or (position char "0123456789")    ; a digit character yields its value
        (and (member char '(#\newline #\space #\tab)) :space)
        (case char
          (#\) :closing-paren)
          (#\( (loop
                 ;; position just after the opening parenthesis
                 with beg = (file-position stream)
                 for x = (my-simple-scheme-reader stream)
                 until (eq x :closing-paren)
                 unless (eq x :space)
                   collect x into items
                 finally (return (list :beg beg
                                       :end (file-position stream)
                                       :items items))))))))
For example:
(with-input-from-string (in "(0(1 2 3) 4 5 (6 7))")
  (my-simple-scheme-reader in))
returns:
(:BEG 1 :END 20 :ITEMS
(0 (:BEG 3 :END 9 :ITEMS (1 2 3)) 4 5 (:BEG 15 :END 19 :ITEMS (6 7))))
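An annotated tree of this shape carries the same structure as the plain list plus the positions, so the ordinary list can be recovered by dropping the annotations. The following is a small sketch; strip-locations is a hypothetical helper, and it assumes that every list node has the (:beg ... :end ... :items ...) shape produced by the reader above.

```lisp
;; Hypothetical helper: assumes nodes look like (:beg B :end E :items LIST),
;; as returned by MY-SIMPLE-SCHEME-READER, and that atoms are plain digits.
(defun strip-locations (node)
  "Convert an annotated NODE back into ordinary nested lists."
  (if (and (consp node) (eq (first node) :beg))
      (mapcar #'strip-locations (getf node :items))
      node))

(strip-locations
 '(:beg 1 :end 20 :items
   (0 (:beg 3 :end 9 :items (1 2 3)) 4 5 (:beg 15 :end 19 :items (6 7)))))
;; => (0 (1 2 3) 4 5 (6 7))
```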
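The enriched tree plays the role of syntax objects. In Common Lisp, the same information can instead be kept outside the tree, in a hash table keyed by the cons cells themselves, so that the reader returns ordinary lists: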
(defun make-environment ()
  (make-hash-table :test #'eq))

(defun my-simple-lisp-reader (stream environment)
  (let ((char (read-char stream)))
    (or (position char "0123456789")
        (and (member char '(#\newline #\space #\tab)) :space)
        (case char
          (#\) :closing-paren)
          (#\( (loop
                 with beg = (file-position stream)
                 for x = (my-simple-lisp-reader stream environment)
                 until (eq x :closing-paren)
                 unless (eq x :space)
                   collect x into items
                 finally
                    (setf (gethash items environment)
                          (list :beg beg :end (file-position stream)))
                    (return items)))))))
Test:
(let ((env (make-environment)))
  (with-input-from-string (in "(0(1 2 3) 4 5 (6 7))")
    (values
     (my-simple-lisp-reader in env)
     env)))
Returns two values:
(0 (1 2 3) 4 5 (6 7))
#<HASH-TABLE :TEST EQ :COUNT 3 {1010524CD3}>
Given a cons cell, you can trace it back to its original position. You can add more precise information if you want to. Once you evaluate a defun, for example, the source information can be attached to the function object, or stored as a symbol property, which means the information is garbage collected on redefinitions.
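The lookup from a cons cell back to its position relies on object identity, which the following sketch makes explicit. The names *locations*, note-location and form-location are hypothetical, and the positions are entered by hand rather than produced by a reader.

```lisp
;; Side table keyed by identity (EQ), as in MAKE-ENVIRONMENT above.
(defvar *locations* (make-hash-table :test #'eq))

;; Hypothetical helpers for this example.
(defun note-location (form beg end)
  "Record that FORM spans the positions BEG to END and return FORM."
  (setf (gethash form *locations*) (list :beg beg :end end))
  form)

(defun form-location (form)
  "Return the recorded location of FORM, or NIL if none is known."
  (values (gethash form *locations*)))

(defparameter *inner* (note-location (list 1 2 3) 3 9))
(defparameter *outer* (note-location (list 0 *inner* 4) 1 12))

(form-location (second *outer*))     ; the very same cons => (:BEG 3 :END 9)
(form-location (copy-list *inner*))  ; EQUAL but not EQ   => NIL
```

One consequence of keying on identity is that any step that builds fresh list structure, such as copying a form or a macro returning newly consed code, yields conses with no recorded location unless the information is carried over explicitly.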
Note that in both cases there is no source file to keep track of, unless the system is able to trace the input string back to the source file from which the reader was called.
Upvotes: 2