I'd be glad to get some help with XML manipulation in R.
I'm trying to run an XPath query on my XML/TEI file. Here is its structure:
<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
<text>
<body>
<div>
<p>
<seg>
<name ref="Actr1235">Jen B.</name>frate M. <name ref="Actr1234">Léard B.</name> rhoncus orci quis luctus ultrices <note place="margin-left">1713 &amp; 1714</note>, a été
vehicula cursus nunc, at sagittis lorem aliquet sed <name ref="Actr1236"> Jaes L.</name>
aeman graeca <name type="place">Digo</name> iaculis volutpat risu <name ref="Cole14">la
Charias</name>. M. <name ref="Actr1236">Laure</name> bibendum augue erat, fermentum semper. M. <name ref="Actr1235">B.</name> bibendum augue erat, fermentum semper
</seg>
</p>
</div>
</body>
</text>
</TEI>
I'd like to extract all the ref attribute values beginning with "Actr" from the <name> tags.
I tried this XPath in an XML editor, //tei:name/@ref[starts-with(., 'Actr')], and it works.
Now I'm trying to do the same in R, to put the query results in a data frame, using the XML package to parse the document:
library(XML)
data1715<-xmlParse("My_document.xml")
name_query<-xpathSApply(data1715, "data(//tei:name/@ref[starts-with(., 'Actr')])", xmlValue)
It returns the following error:
XPath error : Undefined namespace prefix
xmlXPathCompOpEval: parameter error
XPath error : Invalid expression
Error in xpathApply.XMLInternalDocument(doc, path, fun, ..., namespaces = namespaces, :
  error evaluating xpath expression data(//tei:name/@ref[starts-with(., 'Actr')])
How do you define the namespace in this case?
The XML package doesn't handle default namespaces very well. You need to explicitly assign a prefix to the namespace before you can use it in XPath expressions. How about something like:
xpathSApply(data1715,
"//tei:name/@ref[starts-with(.,'Actr')]",
unname,
namespaces=c(tei=getDefaultNamespace(data1715)[[1]]$uri))
Note that I also removed data() and replaced xmlValue with unname. I'm not sure what you were trying to do with data(), but here we are returning attributes, and xmlValue doesn't appear to work with attributes.
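Since the goal was a data frame, here is a minimal, self-contained sketch of one way to get there. It is a variation on the call above, not the same call: it selects the <name> elements themselves and reads the ref attribute with xmlGetAttr, which avoids applying xmlValue to attribute nodes. The inline sample, the variable names (tei_xml, ns, nodes, actors) and the hard-coded namespace URI are assumptions for illustration; the URI is copied from the xmlns declaration in the question. Note also that data() is an XPath 2.0 function, while libxml2 (which the XML package uses) implements XPath 1.0, so it is left out here.

```r
library(XML)

# Hypothetical inline stand-in for My_document.xml, trimmed to a few <name> tags
tei_xml <- '<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body><div><p><seg>
    <name ref="Actr1235">Jen B.</name>
    <name ref="Actr1234">Leard B.</name>
    <name type="place">Digo</name>
    <name ref="Cole14">la Charias</name>
  </seg></p></div></body></text>
</TEI>'

doc <- xmlParse(tei_xml, asText = TRUE)

# Bind the prefix "tei" to the document's default namespace URI
ns <- c(tei = "http://www.tei-c.org/ns/1.0")

# Select the <name> elements whose ref attribute starts with "Actr"
nodes <- getNodeSet(doc, "//tei:name[starts-with(@ref, 'Actr')]",
                    namespaces = ns)

# Read the attribute and the text content from each matched element
actors <- data.frame(
  ref  = sapply(nodes, xmlGetAttr, "ref"),
  name = sapply(nodes, xmlValue),
  stringsAsFactors = FALSE
)

print(actors)
```

Selecting elements rather than attributes keeps each ref value paired with the name text it came from, which tends to be convenient once the results live in a data frame.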