Reputation: 59
I have a Common Lisp program that reads data from an S-expression file. Originally the program was going to be written in C, with a CSV file parsed into an in-memory struct, but I switched to Common Lisp and S-expressions to remove abstractions between programmer and user. If I have a struct defined in program.lisp as such:
(defstruct flashcard
  front
  back)
...and a file data.lisp:
(:virology
(
(:card
:front "To what sort of cells does Epstain-Barr virus attach?"
:back "B-cells"
)
(:card
:front "For how long does the virus of herpes simplex persist in tissues?"
:back "lifetime"
)
(:card
:front "What T-cell receptors are recognised by HIV?"
:back CD4
)
)
...with several lists within it much like ":virology".
How would I read data.lisp into memory so that each individual card can be accessed by program.lisp?
For context, I'm looking to iterate over each card in a queue based on filters supplied by the user (i.e. they list :virology :biochemistry :homeostasis etc.), and quiz the user on each card. I can't see how this specific use case would affect how I load the file into memory, though.
(NOTE: As it currently stands, a simple struct with just two fields would probably be better suited to a CSV file. However, as I develop the program I intend to add metadata etc., so storing the data as Lisp S-expressions that are then read by program.lisp is far more useful in the long run.)
Upvotes: 1
Views: 772
Reputation: 38789
Assuming you start your Lisp implementation (e.g. sbcl, ecl) in the directory where data.lisp is, this should produce a tree of values (here the * is the REPL prompt, and what follows the form is the value that was read):
* (with-open-file (in "data.lisp")
    (read in))
(:VIROLOGY
((:CARD :FRONT "To what sort of cells does Epstain-Barr virus attach?" :BACK
"B-cells")
(:CARD :FRONT
"For how long does the virus of herpes simplex persist in tissues?" :BACK
"lifetime")
(:CARD :FRONT "What T-cell receptors are recognised by HIV?" :BACK CD4)))
Typically you would wrap that in a function:
(defun cards (&optional (file "data.lisp"))
  (with-open-file (in file) (read in)))
This assumes that if you have multiple topics, you write them all inside a single list, as follows:
(:virology (...) :biology (....) :physics (...))
If instead you write multiple lists, as follows:
(:virology (...))
(:biology (...))
(:physics (...))
Then the above cards function needs to loop:
(defun cards (&optional (file "data.lisp"))
  (with-open-file (in file)
    (loop
      for item = (read in nil in)
      until (eq item in)
      collect item)))
There is a trick above: you want to collect values until there are none left, but you don't want read to signal an error when the file ends. That's why read takes nil as its second argument (no error); the third argument is the value it should return when it reaches end of file. Here that value is the stream object in itself: this gives a unique value that cannot possibly be produced by read (unlike, say, nil). This is not really necessary in your case, where you could use nil instead, but in general it is a good practice to follow.
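To see why the end-of-file marker matters, here is a minimal sketch that reads from a string instead of a file. The helper names read-all-forms and read-all-forms-from-string are made up for this illustration, and the sample input deliberately contains a literal nil:

```lisp
(defun read-all-forms (stream)
  "Collect every form READ can produce from STREAM, in order."
  (loop for form = (read stream nil stream) ; STREAM itself marks end of file
        until (eq form stream)
        collect form))

(defun read-all-forms-from-string (string)
  "Convenience wrapper (hypothetical) so the loop can be tried at the REPL."
  (with-input-from-string (in string)
    (read-all-forms in)))
```

Calling (read-all-forms-from-string "(:a 1) nil (:b 2)") returns ((:A 1) NIL (:B 2)). Had nil been used as the end-of-file value, the loop would have stopped at the literal nil and returned only ((:A 1)).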
Whichever way you choose, you'll have to adapt how you query your values. Let's assume you use the loop above, so your data is a single list of entries, as collected by the loop:
((:virology (...))
(:biology (...))
(:physics (...)))
This is shaped like an association list, where each element is a cons cell whose car is a key and whose cdr is a value. If you call (assoc :virology (cards)), you'll have:
(:VIROLOGY
((:CARD :FRONT "To what sort of cells does Epstain-Barr virus attach?" :BACK
"B-cells")
(:CARD :FRONT
"For how long does the virus of herpes simplex persist in tissues?" :BACK
"lifetime")
(:CARD :FRONT "What T-cell receptors are recognised by HIV?" :BACK CD4)))
The cdr of that is a list with a single value, namely a list of cards.
You could simplify your data format so that you use this format:
(:virology (:card ...) (:card ...) (:card ...))
Instead of:
(:virology ((:card ...) (:card ...) (:card ...)))
That removes one level of nesting, and instead of a list containing one list of cards, you can directly access the list of cards as the cdr of your entries. So let's assume you edit data.lisp to remove one level of nesting, then:
(defun find-cards (cards key)
  (cdr (assoc key cards)))
* (find-cards (cards) :virology)
((:CARD :FRONT "To what sort of cells does Epstain-Barr virus attach?" :BACK
"B-cells")
(:CARD :FRONT
"For how long does the virus of herpes simplex persist in tissues?" :BACK
"lifetime")
(:CARD :FRONT "What T-cell receptors are recognised by HIV?" :BACK CD4))
So far, so good.
To recap, we have an association list mapping keys to lists of cards.
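Since the question mentions filtering by several topics, here is a rough sketch of how the association list could feed that. The function name cards-for-topics is hypothetical, and it assumes the flattened format where the cdr of each entry is the list of cards:

```lisp
(defun cards-for-topics (entries topics)
  "Return one list holding the cards of every topic in TOPICS, in order.
Topics that are missing from ENTRIES contribute nothing."
  (loop for topic in topics
        append (cdr (assoc topic entries))))
```

For example, (cards-for-topics (cards) '(:virology :physics)) would give the virology cards followed by the physics cards, which you could then walk through as your queue.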
If you ever wanted to use a hash table instead, you could populate one yourself, or use a library like alexandria. To do so you should probably first set up Quicklisp:
* (ql:quickload :alexandria)
To load "alexandria":
Load 1 ASDF system:
alexandria
; Loading "alexandria"
(:ALEXANDRIA)
Then, you can call:
* (alexandria:alist-hash-table (cards))
#<HASH-TABLE :TEST EQL :COUNT 1 {1015268CE3}>
If you reach this step, you can have a look at how hash tables work, for example in chapter 11, "Collections", of Peter Seibel's Practical Common Lisp.
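If you would rather populate the table yourself, a minimal sketch using only standard functions could look like this. The name entries-to-table is made up, and it again assumes each entry is a topic keyword followed by its cards:

```lisp
(defun entries-to-table (entries)
  "Build a hash table mapping each topic keyword in ENTRIES to its cards."
  (let ((table (make-hash-table :test #'eq))) ; keywords can be compared with EQ
    (loop for (topic . cards) in entries
          do (setf (gethash topic table) cards))
    table))
```

You would then look up a topic with (gethash :virology (entries-to-table (cards))). Note that if a topic appears twice in the file, this version keeps the last occurrence.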
In any case, whether you access the cards with find-cards or with GETHASH, you'll have a list of cards in a certain format. If you want to convert them to instances of structures, you first need to define a way to convert a card from the list format to a structure.
Each card is stored in a list that starts with :card and whose rest is a property list of values. A property list is a flat succession of keys and values:
(:a 0 :b 1 :c 2)
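For a single lookup in such a list, the standard accessor is GETF. A tiny sketch, with *example-plist* being just a name chosen for this illustration:

```lisp
(defparameter *example-plist* (list :a 0 :b 1 :c 2))

(getf *example-plist* :b)        ; => 1
(getf *example-plist* :z :none)  ; => :NONE, the default, because :z is absent
```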
Fortunately, you can use DESTRUCTURING-BIND to match against a known format (for more complex formats, there are pattern-matching libraries):
(defun parse-card (list)
  ;; This is the expected format, the list starts with `:card`, so
  ;; I add an assertion here.
  (assert (eq :card (first list)))
  ;; The rest of the list is a property list, let's bind front and back
  ;; to the values associated with keys :front and :back
  (destructuring-bind (&key front back) (rest list)
    ;; this is a function generated by "defstruct"
    (make-flashcard :front front :back back)))
For example:
* (parse-card '(:card :front 0 :back 1))
#S(FLASHCARD :FRONT 0 :BACK 1)
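One thing to keep in mind if you add metadata later: with a lambda list of (&key front back), DESTRUCTURING-BIND signals an error when the card contains any other key. A possible variant, sketched here under the name parse-card-lenient (hypothetical, as is the :tags key in the usage note), ignores keys it does not know about by adding &allow-other-keys:

```lisp
;; Same structure as in the question, repeated so the sketch is self-contained.
(defstruct flashcard
  front
  back)

(defun parse-card-lenient (list)
  "Like PARSE-CARD, but tolerate extra keys in the property list."
  (assert (eq :card (first list)))
  (destructuring-bind (&key front back &allow-other-keys) (rest list)
    (make-flashcard :front front :back back)))
```

With this, (parse-card-lenient '(:card :front "q" :back "a" :tags (:exam))) still builds a flashcard; the unknown :tags entry is simply skipped until you add a slot for it.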
Once this works, you can use mapcar to convert a list of cards:
* (mapcar #'parse-card (find-cards (cards) :virology))
(#S(FLASHCARD
:FRONT "To what sort of cells does Epstain-Barr virus attach?"
:BACK "B-cells")
#S(FLASHCARD
:FRONT "For how long does the virus of herpes simplex persist in tissues?"
:BACK "lifetime")
#S(FLASHCARD :FRONT "What T-cell receptors are recognised by HIV?" :BACK CD4))
I know structures are easy to define, but they are not easy to change in a live system: if you want to add a new slot, you typically need to restart your Lisp (structures are intended to be compiled efficiently, like in statically typed languages, whereas classes are more dynamic).
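If that dynamism matters to you, a class-based equivalent might look like the sketch below. The class name flashcard-object and the accessors card-front and card-back are invented for this example so that they don't clash with the structure above:

```lisp
(defclass flashcard-object ()
  ((front :initarg :front :accessor card-front)
   (back  :initarg :back  :accessor card-back)))

(defun make-flashcard-object (front back)
  "Small constructor so call sites look similar to the DEFSTRUCT version."
  (make-instance 'flashcard-object :front front :back back))
```

Re-evaluating the DEFCLASS form with an extra slot is allowed in a running image, and existing instances are updated to the new definition.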
When the data you read contains a symbol, like CD4, it is interned in the package that is current (the value of *package*) when read is called. This can pollute your packages and/or cause difficulties. You may prefer to use strings instead.
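If you do want to keep bare symbols in the data, one possible way to contain them is to bind *package* around the call to read. The sketch below, with the made-up name read-card-data, reads everything into the KEYWORD package, so CD4 comes back as :CD4; it also turns off *read-eval* as a general precaution when reading data files:

```lisp
(defun read-card-data (stream)
  "Read one form from STREAM, interning any bare symbols as keywords."
  (with-standard-io-syntax
    (let ((*read-eval* nil)                       ; refuse #. forms in data
          (*package* (find-package :keyword)))    ; CD4 is read as :CD4
      (read stream))))
```

For instance, reading the text (:card :back CD4) through this function yields (:CARD :BACK :CD4), and nothing is interned in your own package.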
This is not a complete solution but you should have different tools now to progress, depending on where you want to go.
Upvotes: 8
Reputation: 18375
To read Lisp forms from a file, see uiop:read-file-form[s]. The singular one reads one form (like a giant alist); the plural one reads all the forms in the file.
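As a rough sketch of how these could replace a hand-written loop, assuming UIOP is available (it ships with ASDF); the wrapper names load-decks and load-first-deck are made up:

```lisp
;; Make sure ASDF (and therefore UIOP) is loaded before the UIOP: symbols are read.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (require :asdf))

(defun load-decks (&optional (file "data.lisp"))
  "Return every top-level form in FILE as a list."
  (uiop:read-file-forms file))

(defun load-first-deck (&optional (file "data.lisp"))
  "Return only the first top-level form in FILE."
  (uiop:read-file-form file))
```

The first fits a file with several top-level lists, the second a file holding one big list.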
If you want to save data from a program to a file, you can simply write it out, for example with format <file> "~S" (~S prints the data so that it can be read back, ~A does not), but with a couple of precautions:
(let ((*print-pretty* nil)  ;; no pretty-printer line breaks or indentation
      (*print-length* nil)) ;; don't abbreviate long lists with "..."
  … write to file …)
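A minimal sketch of what the writing side could look like. The function name write-forms is hypothetical, and binding *print-level* as well is an extra precaution added here so that deeply nested lists are not abbreviated either:

```lisp
(defun write-forms (forms stream)
  "Print each form in FORMS to STREAM on its own line so READ can load it back."
  (let ((*print-pretty* nil)
        (*print-length* nil)
        (*print-level* nil))   ; added assumption: never cut off nested lists
    (dolist (form forms)
      (format stream "~S~%" form))))
```

To save to a file you would wrap the call, e.g. (with-open-file (out "data.lisp" :direction :output :if-exists :supersede) (write-forms decks out)), where decks stands for whatever list of entries your program holds.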
UIOP also has read-file-line[s] and read-file-string.
Upvotes: 3