Bithov Vinu

Reputation: 59

Reading data stored in S-expressions into memory in another Common Lisp program

I have a Common Lisp program that reads data from an S-expression file. Originally the program was going to be written in C, with a CSV file parsed into an in-memory struct, but I switched to Common Lisp and S-expressions to remove abstractions between programmer and user. Suppose I have a struct defined in program.lisp like this:

(defstruct flashcard
  front
  back
)

...and a file data.lisp:

(:virology
  (
     (:card
        :front "To what sort of cells does Epstain-Barr virus attach?"
        :back "B-cells"      
     )
     (:card
        :front "For how long does the virus of herpes simplex persist in tissues?"
        :back "lifetime"
     )
     (:card
        :front "What T-cell receptors are recognised by HIV?"
        :back CD4
     )
  )

...with several lists within it much like ":virology".

How would I read data.lisp into memory so that each individual card can be accessed by program.lisp?

For context, I'm looking to iterate over each card in a queue built from filters supplied by the user (i.e. they list :virology :biochemistry :homeostasis etc.), and quiz the user on each card. I can't see how this specific use case would affect how I load the file into memory, though.

(NOTE: As it currently stands, a simple struct with just two fields is probably better suited to a CSV file. However, as I develop the program I intend to add metadata etc., so storing the data as Lisp S-expressions that are then read by program.lisp is far more useful in the long run.)

Upvotes: 1

Views: 772

Answers (2)

coredump

Reputation: 38789

Reading a file

Assuming you start your Lisp program (e.g. sbcl, ecl) in the directory where data.lisp is, then this should produce a tree of values (here the * is the prompt in the REPL, what follows is the value being read):

* (with-open-file (in "data.lisp")
    (read in))

(:VIROLOGY
 ((:CARD :FRONT "To what sort of cells does Epstain-Barr virus attach?" :BACK
   "B-cells")
  (:CARD :FRONT
   "For how long does the virus of herpes simplex persist in tissues?" :BACK
   "lifetime")
  (:CARD :FRONT "What T-cell receptors are recognised by HIV?" :BACK CD4)))

Typically you would wrap that in a function:

(defun cards (&optional (file "data.lisp"))
  (with-open-file (in file) (read in)))

This assumes that if you have multiple categories of cards, you write them all inside a single list, as follows:

(:virology (...) :biology (...) :physics (...))

If instead you write multiple lists, as follows:

(:virology (...))
(:biology (...))
(:physics (...))

Then the cards function above needs to loop:

(defun cards (&optional (file "data.lisp"))
  (with-open-file (in file)
    (loop 
      for item = (read in nil in)
      until (eq item in)
      collect item)))

There is a trick above: you want to collect values until there are none left, but you don't want an error to be signaled when the file ends. That's why read takes nil as its second argument (no error on end of file), and the third argument is the value it should return when it reaches the end of the file. Here that value is the stream object itself: it serves as a unique value that cannot possibly be produced by read (unlike, say, nil). This is not really necessary in your case, where you could use nil instead, but in general it is a good practice to follow.
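To see why the stream is a safer end-of-file marker than nil, here is a small sketch that reads from a string instead of a file. The two function names are made up for this illustration only:

```lisp
;; Hypothetical helpers, only for illustration.

(defun read-all-with-nil-eof (text)
  "Collect forms from TEXT, using NIL as the end-of-file marker."
  (with-input-from-string (in text)
    (loop for item = (read in nil nil)
          while item            ; also stops when the data itself is NIL
          collect item)))

(defun read-all-with-stream-eof (text)
  "Collect forms from TEXT, using the stream itself as the end-of-file marker."
  (with-input-from-string (in text)
    (loop for item = (read in nil in)
          until (eq item in)    ; only true at the real end of input
          collect item)))

(read-all-with-nil-eof "1 nil 2")     ; => (1)
(read-all-with-stream-eof "1 nil 2")  ; => (1 NIL 2)
```

The first version silently drops everything after the first nil (or empty list) in the data, which is the kind of bug the stream marker avoids.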

Accessing cards

Whichever format you choose, you'll have to adapt how you query your values. Let's assume you use the loop above, so your data is a single list of entries, as collected by the loop:

((:virology (...))
 (:biology (...))
 (:physics (...)))

This is shaped like an association list, where each element is a cons-cell such that the car is a key and the cdr a value. If you call (assoc :virology (cards)), you'll have:

(:VIROLOGY
 ((:CARD :FRONT "To what sort of cells does Epstain-Barr virus attach?" :BACK
   "B-cells")
  (:CARD :FRONT
   "For how long does the virus of herpes simplex persist in tissues?" :BACK
   "lifetime")
  (:CARD :FRONT "What T-cell receptors are recognised by HIV?" :BACK CD4)))

The cdr of that is a list with a single value, namely a list of cards.

You could simplify your data format so that you use this format:

(:virology (:card ...) (:card ...) (:card ...)) 

Instead of:

(:virology ((:card ...) (:card ...) (:card ...)))

That removes one level of nesting, and instead of a list containing one list of cards, you can directly access the list of cards as the cdr of your entries. So let's assume you edit data.lisp to remove one level of nesting, then:

(defun find-cards (cards key)
  (cdr (assoc key cards)))

* (find-cards (cards) :virology)
((:CARD :FRONT "To what sort of cells does Epstain-Barr virus attach?" :BACK
  "B-cells")
 (:CARD :FRONT
  "For how long does the virus of herpes simplex persist in tissues?" :BACK
  "lifetime")
 (:CARD :FRONT "What T-cell receptors are recognised by HIV?" :BACK CD4))

So far, so good.

To recap, we have an association list mapping keys to lists of cards.
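For the use case in the question (building a queue from the topics the user lists), a minimal sketch could look like the following. It assumes the flattened format where each entry is (key card card ...). The names *sample-decks* and cards-for-topics are hypothetical, and the sample cards are placeholders:

```lisp
;; Placeholder data in the flattened format: (key card card ...)
(defparameter *sample-decks*
  '((:virology (:card :front "Q1" :back "A1"))
    (:biology  (:card :front "Q2" :back "A2")
               (:card :front "Q3" :back "A3"))
    (:physics  (:card :front "Q4" :back "A4"))))

(defun cards-for-topics (decks topics)
  "Return the cards of every topic in TOPICS, in the order the topics are given.
Topics that are missing from DECKS contribute no cards."
  (loop for topic in topics
        append (cdr (assoc topic decks))))

(cards-for-topics *sample-decks* '(:physics :virology))
;; => ((:CARD :FRONT "Q4" :BACK "A4") (:CARD :FRONT "Q1" :BACK "A1"))
```

The result is a plain list, so you can walk it with dolist or pop cards off the front as you quiz the user.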

If you ever want to use a hash table instead, you can populate one yourself, or use a library like alexandria. To load the library, you should probably first set up Quicklisp:

* (ql:quickload :alexandria)
To load "alexandria":
  Load 1 ASDF system:
    alexandria
; Loading "alexandria"

(:ALEXANDRIA)

Then, you can call:

* (alexandria:alist-hash-table (cards))
#<HASH-TABLE :TEST EQL :COUNT 1 {1015268CE3}>

If you reach this step, you can have a look at how hash tables work, for example in chapter 11, "Collections", of Peter Seibel's Practical Common Lisp.
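If you would rather populate the table by hand, a sketch without any library could be the following. decks-to-hash-table and *deck-table* are hypothetical names, the data is a placeholder in the flattened (key card card ...) format, and the table uses an EQ test because the keys are keywords:

```lisp
(defun decks-to-hash-table (decks)
  "Build a hash table mapping each topic keyword in DECKS to its list of cards."
  (let ((table (make-hash-table :test #'eq)))
    (loop for (topic . cards) in decks
          do (setf (gethash topic table) cards))
    table))

(defparameter *deck-table*
  (decks-to-hash-table
   '((:virology (:card :front "Q1" :back "A1"))
     (:biology  (:card :front "Q2" :back "A2")))))

(gethash :virology *deck-table*)  ; => ((:CARD :FRONT "Q1" :BACK "A1")), T
(gethash :physics *deck-table*)   ; => NIL, NIL
```

The second return value of gethash tells you whether the key was present at all, which distinguishes a missing topic from a topic with no cards.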

Making flashcard structures

In any case, whether you access the cards with find-cards or with GETHASH, you'll have a list of cards in a certain format. If you want to convert them into instances of structures, you first need to define a way to convert a card from the list format to a structure.

Each card is stored in a list that starts with :card and whose rest is a property list of values. A property list is a succession of keys and values in a flat way:

(:a 0 :b 1 :c 2)

Fortunately, you can use DESTRUCTURING-BIND to match against a known format (for more complex formats, there are pattern-matching libraries):

(defun parse-card (list)
  ;; This is the expected format, the list starts with `:card`, so
  ;; I add an assertion here.
  (assert (eq :card (first list)))  
  ;; The rest of the list is a property list, let's bind front and back
  ;; to the values associated with keys :front and :back
  (destructuring-bind (&key front back) (rest list)
    ;; this is a function generated by "defstruct"
    (make-flashcard :front front :back back)))

For example:

* (parse-card '(:card :front 0 :back 1))
#S(FLASHCARD :FRONT 0 :BACK 1)
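Since you plan to add metadata later, note that a lambda list with only &key rejects keys it does not know about. A possible variant, assuming you want to ignore unknown keys for now, adds &allow-other-keys. parse-card-lenient is a hypothetical name, and the struct is repeated here so the snippet is self-contained:

```lisp
(defstruct flashcard
  front
  back)

(defun parse-card-lenient (list)
  "Like PARSE-CARD, but ignores properties other than :FRONT and :BACK."
  (assert (eq :card (first list)))
  ;; &allow-other-keys lets extra properties (e.g. :tags) pass through unused.
  (destructuring-bind (&key front back &allow-other-keys) (rest list)
    (make-flashcard :front front :back back)))

(parse-card-lenient '(:card :front 0 :back 1 :tags (:easy)))
;; => #S(FLASHCARD :FRONT 0 :BACK 1)
```

Whether to be strict or lenient is a design choice: the strict version catches typos in key names early, the lenient one lets old code read newer data files.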

Once this works, you can use mapcar to convert a list of cards:

* (mapcar #'parse-card (find-cards (cards) :virology))
(#S(FLASHCARD
    :FRONT "To what sort of cells does Epstain-Barr virus attach?"
    :BACK "B-cells")
 #S(FLASHCARD
    :FRONT "For how long does the virus of herpes simplex persist in tissues?"
    :BACK "lifetime")
 #S(FLASHCARD :FRONT "What T-cell receptors are recognised by HIV?" :BACK CD4))

Note that structures are easy to define, but they are not easy to change in a live system: if you want to add a new slot, you typically need to restart your Lisp (structures are meant to be compiled efficiently, as in statically typed languages, whereas classes are more dynamic).

Warning

When the data you read contains a symbol, like CD4, that symbol is interned in whatever package is current (the value of *package*) when read is called. This can pollute your packages and/or cause difficulties. You may prefer to use strings instead.
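One way to keep such symbols out of your own package, as a sketch, is to bind *package* around the call to read. The package name flashcard-data and the function read-card-data are made up for this example. Binding *read-eval* to nil is an extra precaution I would add for data files (it disables the #. reader macro), not something the above requires:

```lisp
;; A package that uses no other package; it only holds symbols read from data.
(defpackage :flashcard-data
  (:use))

(defun read-card-data (stream)
  "Read one form from STREAM, interning plain symbols in FLASHCARD-DATA."
  (let ((*package* (find-package :flashcard-data))
        (*read-eval* nil))   ; refuse #.(...) forms in data
    (read stream)))

(with-input-from-string (in "(:card :back CD4)")
  (symbol-package (third (read-card-data in))))
;; => the FLASHCARD-DATA package
```

Keywords such as :card are unaffected. Be aware that with an empty :use list, nil and t in the file are also read as ordinary symbols of that package, so this only suits files that don't rely on them.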

Conclusion

This is not a complete solution but you should have different tools now to progress, depending on where you want to go.

Upvotes: 8

Ehvince

Reputation: 18375

To read Lisp forms from a file, see uiop:read-file-form[s]. The singular one reads a single form (like one giant alist); the plural one reads all the forms in the file and returns them as a list.

If you want to save the data from a program to a file, you can simply write it out, for example with format <file> "~S" (~S prints data so that it can be read back, e.g. strings keep their quotes, whereas ~A does not), but with a couple of precautions:

(let ((*print-pretty* nil)   ;; don't insert line breaks and indentation
      (*print-length* nil))  ;; don't abbreviate long lists with "..."
  … write to file …)

UIOP also has read-file-line[s] and read-file-string.
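Putting the two together, a round trip could look like this sketch. save-decks and load-decks are hypothetical names. I assume ASDF is available in your implementation, since UIOP ships with it:

```lisp
(require :asdf)  ; UIOP is bundled with ASDF

(defun save-decks (decks file)
  "Write each element of DECKS to FILE as one readable form per line."
  (with-open-file (out file :direction :output :if-exists :supersede)
    (let ((*print-pretty* nil)
          (*print-length* nil)   ; don't truncate long lists
          (*print-level* nil))   ; don't truncate deeply nested lists
      (dolist (deck decks)
        (format out "~S~%" deck)))))

(defun load-decks (file)
  "Read every top-level form in FILE and return them as a list."
  (uiop:read-file-forms file))
```

With data made of keywords, strings and lists, (load-decks file) after (save-decks decks file) should give back a list that is EQUAL to decks.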

Upvotes: 3
