Reputation: 4771
I am trying to learn me some Haskell and I wanted to parse some XML files with the following structure:
<properties>
  <property name="a">
    <value>1</value>
  </property>
  <property name="b">
    <value>2</value>
  </property>
</properties>
Following the example from the wiki I can search for all properties by
runX (readDocument [withValidate no] "my.xml"
      >>> deep (isElem >>> hasName "properties"))
but how can I extract only the value element of the property with name="b"?
Upvotes: 2
Views: 565
Reputation: 4771
TagSoup indeed did the thing for me. Based on the tutorial I found
module Main where

import Text.HTML.TagSoup

searchXML :: IO ()
searchXML = do
    rsp <- readFile "test.xml"
    let tags  = parseTags rsp
    let links = extr "value" [] $
                extr "property" [("name","b")] tags
    let value = fromTagText $ links !! 0
    putStr value
  where
    -- tags strictly between the opening tag a (with attributes b) and its closing tag
    extr a b c = drop 1 $ takeWhile (~/= TagClose a) $
                 dropWhile (~/= TagOpen a b) c

main = searchXML
which prints just the value 2. But I am pretty sure the code can be simplified a lot.
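One possible simplification, as a rough sketch rather than the canonical TagSoup way: use sections to jump to the matching property and return a Maybe instead of indexing with !!. The names lookupValue and sampleXml are made up for this example, and the XML is inlined so the snippet runs without test.xml.

```haskell
module Main where

import Data.Maybe (listToMaybe)
import Text.HTML.TagSoup

-- Hypothetical sample input, mirroring the structure from the question.
sampleXml :: String
sampleXml = unlines
  [ "<properties>"
  , "<property name=\"a\"><value>1</value></property>"
  , "<property name=\"b\"><value>2</value></property>"
  , "</properties>"
  ]

-- Text of the first <value> inside the first <property> whose name
-- attribute equals the given name, or Nothing if there is no such text.
lookupValue :: String -> String -> Maybe String
lookupValue name xml = listToMaybe
  [ fromTagText t
  | sec <- sections (~== TagOpen "property" [("name", name)]) (parseTags xml)
    -- keep only the tags up to the end of this property
  , let inside = takeWhile (~/= TagClose "property") sec
    -- skip to <value>; the tag right after it should be its text
  , (_ : t : _) <- [dropWhile (~/= TagOpen "value" []) inside]
  , isTagText t
  ]

main :: IO ()
main = print (lookupValue "b" sampleXml)
```

A missing property or an empty value gives Nothing here, where the !! 0 version would crash.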
Upvotes: 1
Reputation: 568
To be honest, I find HXT quite a complex library to use.
My understanding so far is that you transform one document into another using a chain of arrows.
If you want to learn arrows, you may find that my solution is cheating, but for me it did the job:
I just use XPath (cabal install hxt-xpath) and produce an output document.
import Text.XML.HXT.Core
import Text.XML.HXT.XPath.Arrows

main :: IO ()
main = do
    runX $ readDocument [] "my.xml"
        >>>
        root [] [ selem "values" [getXPathTrees "/properties/property[@name=\"b\"]/value"]]
        >>>
        writeDocument [withIndent yes] "out.xml"
    return ()
yielding
<?xml version="1.0" encoding="UTF-8"?>
<values>
<value>2</value>
</values>
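For comparison, if you would rather stay with plain arrows and skip XPath, something along these lines should work. Treat it as a sketch: valuesFor and sampleXml are names invented for this example, and it reads from an inline string with readString instead of my.xml so that it is self-contained.

```haskell
module Main where

import Text.XML.HXT.Core

-- Hypothetical sample input, mirroring the structure from the question.
sampleXml :: String
sampleXml = concat
  [ "<properties>"
  , "<property name=\"a\"><value>1</value></property>"
  , "<property name=\"b\"><value>2</value></property>"
  , "</properties>"
  ]

-- Text content of every <value> child of a <property> element
-- whose name attribute equals the given name.
valuesFor :: String -> String -> IO [String]
valuesFor name xml = runX $
  readString [withValidate no] xml
    -- find property elements with the wanted name attribute
    >>> deep (isElem >>> hasName "property" >>> hasAttrValue "name" (== name))
    -- step down to their value children
    >>> getChildren >>> isElem >>> hasName "value"
    -- and pull out the text nodes
    >>> getChildren >>> getText

main :: IO ()
main = valuesFor "b" sampleXml >>= mapM_ putStrLn
```

To read from a file, swap readString [withValidate no] xml for readDocument [withValidate no] "my.xml"; the rest of the chain stays the same.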
Upvotes: 3