Sebastian_學生

Reputation: 345

Parsing XML with Emacs Lisp and finding a nested attribute

I have some XML which looks like this:

<grammar>
      <l>
    <f form="paradāra"><s stem=""/><m meaning="anothers wife; adultery"/></f>
    <f form="abhimarśeṣu"><s stem="" meaning=""/><m meaning=""/></f>
    <f form="pravṛttān"><s stem="" meaning=""/><m meaning=""/></f>
    <f form="mahipatis"><s stem="" meaning=""/><m meaning=""/></f>
      </l>
      <l>
    <f form="udvejana"><s stem="udvejana" meaning="agitation, fear"/><m meaning=""/></f>
    <f form="karais"><na><ins/><pl/><mas/></na><s stem="kara#1" meaning="action"/><m meaning="by action"/></f>
    <f form="daṇḍais"><na><ins/><pl/><mas/></na><na><ins/><pl/><neu/></na><s stem="daṇḍa" meaning="punishment"/><m meaning="by punishment"/></f>
    <f form="cihnayitvā"><s stem="" meaning="having marked"/><m meaning=""/></f>
    <f form="pravāsayet"><v><cj><ca/></cj><sys><prs><md><op/></md><para/></prs></sys><np><sg/><trd/></np></v><s stem="pravas"/><m meaning="to put on, dress"/></f>
      </l>
    </grammar>

Now I convert this into S-expressions by running (xml-parse-region). It returns something like this:

((grammar nil "
" (l nil "
" (f ((form . "paradāra")) (s ((stem . ""))) (m ((meaning . "anothers wife; adultery")))) "

" (f ((form . "abhimarśeṣu")) (s ((stem . "") (meaning . ""))) (m ((meaning . "")))) "
" (f ((form . "pravṛttān")) (s ((stem . "") (meaning . ""))) (m ((meaning . "")))) "
" (f ((form . "mahipatis")) (s ((stem . "") (meaning . ""))) (m ((meaning . "")))) "
") "
" (l nil "
" (f ((form . "udvejana")) (s ((stem . "udvejana") (meaning . "agitation, fear"))) (m ((meaning . "")))) "

" (f ((form . "karais")) (na nil (ins nil) (pl nil) (mas nil)) (s ((stem . "kara#1") (meaning . "action"))) (m ((meaning . "by action")))) "

" (f ((form . "daṇḍais")) (na nil (ins nil) (pl nil) (mas nil)) (na nil (ins nil) (pl nil) (neu nil)) (s ((stem . "daṇḍa") (meaning . "punishment"))) (m ((meaning . "by punishment")))) "

" (f ((form . "cihnayitvā")) (s ((stem . "") (meaning . "having marked"))) (m ((meaning . "")))) "

" (f ((form . "pravāsayet")) (v nil (cj nil (ca nil)) (sys nil (prs nil (md nil (op nil)) (para nil))) (np nil (sg nil) (trd nil))) (s ((stem . "pravas"))) (m ((meaning . "to put on, dress")))) "

") "
"))

What I want to do now is extract all the subnodes that start with (s ... ) and collect them in a separate buffer, for example:

(s ((stem . "udvejana") (meaning . "agitation, fear")))

What would the code look like? Should I walk the tree recursively? Yesterday I got as far as walking the first (l ... ) node, but I lost the code in a blackout. I hope somebody has some suggestions!

Upvotes: 2

Views: 388

Answers (1)

abo-abo

Reputation: 20352

You just need basic recursion:

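(defun rec-filter (predicate seq &optional acc)
  "Return every cons cell in SEQ, at any depth, that satisfies PREDICATE.
The elements of ACC, if given, are appended to the end of the result."
  (cond ((null seq)
         acc)
        ((consp seq)
         ;; Collect matches from both halves of the cell, then test the
         ;; cell itself.  Note that PREDICATE also sees every list tail.
         (append (rec-filter predicate (car seq) nil)
                 (rec-filter predicate (cdr seq) nil)
                 (if (funcall predicate seq)
                     (cons seq acc)
                   acc)))
        (t
         ;; Atoms (strings, symbols) contribute nothing.
         acc)))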

(rec-filter
 (lambda (x) (eq (car x) 's))
 tree)
;; =>
;; ((s ((stem . "")))
;;  (s ((stem . "")
;;      (meaning . "")))
;;  (s ((stem . "")
;;      (meaning . "")))
;;  (s ((stem . "")
;;      (meaning . "")))
;;  (s ((stem . "udvejana")
;;      (meaning . "agitation, fear")))
;;  (s ((stem . "kara#1")
;;      (meaning . "action")))
;;  (s ((stem . "daṇḍa")
;;      (meaning . "punishment")))
;;  (s ((stem . "")
;;      (meaning . "having marked")))
;;  (s ((stem . "pravas"))))
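
To get from there into a separate buffer, here is a rough sketch rather than a definitive implementation. It assumes the XML is in the current buffer. The names `my-collect-nodes` and `my-s-nodes-to-buffer` and the `*s-nodes*` buffer name are made up for illustration. Unlike `rec-filter`, it walks the tree with the `xml.el` accessors `xml-node-name` and `xml-node-children`, so only actual element nodes are tested, never list tails or attribute lists.

```elisp
(require 'xml)

;; Hypothetical helper: collect NODE and all of its descendants whose
;; tag is NAME, in document order.
(defun my-collect-nodes (name node)
  "Return NODE and its descendants named NAME, in document order."
  (when (consp node)                    ; text children are strings: skip them
    (append (and (eq (xml-node-name node) name) (list node))
            (apply #'append
                   (mapcar (lambda (child) (my-collect-nodes name child))
                           (xml-node-children node))))))

;; Hypothetical command: parse the current buffer and print one <s> node
;; per line into another buffer.
(defun my-s-nodes-to-buffer (&optional buffer-name)
  "Parse the current buffer as XML and print every <s> node to BUFFER-NAME.
BUFFER-NAME defaults to \"*s-nodes*\".  Return the output buffer."
  (interactive)
  (let* ((roots (xml-parse-region (point-min) (point-max)))
         (nodes (apply #'append
                       (mapcar (lambda (root) (my-collect-nodes 's root))
                               roots)))
         (out (get-buffer-create (or buffer-name "*s-nodes*"))))
    (with-current-buffer out
      (erase-buffer)
      (dolist (node nodes)
        (prin1 node (current-buffer))   ; print in `read'-able form
        (insert "\n")))
    out))
```

Run `M-x my-s-nodes-to-buffer` in the buffer holding the XML, then switch to `*s-nodes*`. On Emacs 25 or later, `dom-by-tag` from `dom.el` should be able to replace the hand-written walker, since it works on the same node layout, but I have not checked it against this exact data.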

Upvotes: 2
