How to extract specific elements from XML node

Question

I have something like this:

Except they are a lot longer, and I have about 300 such sets. How can I extract only the XValue and YValue elements from all of them? I thought I could use xpathSApply('//ValuesPeaks[XValue]',xmlValue), but it's not working. I then thought I could call toString.XMLNode() and then use regexpr() and substr() to get what I want, but that seems inefficient. I think I'm missing something. Please share your expertise. Thanks.

library(XML)
p <- list.files()[[1]]  # first file in the working directory
x <- xmlParse(p)
# Peak nodes under each ValuesPeaks element
getNodeSet(x, '//Data/RESULT/*/*/*/ValuesPeaks/Peak')
f <- xpathSApply(x, '//Data/RESULT/*/*/*/ValuesPeaks/Peak')
t <- toString.XMLNode(f)

Rich Scriven · Accepted Answer

There are a few ways to extract those attributes. It all depends on what you want the result to look like. Here are a couple of examples.

The first uses xmlAttrs() and subsets the results.

xpathApply(doc, "//ValuesPeaks//*", function(x) xmlAttrs(x)[c("XValue", "YValue")])
# [[1]]
#     XValue     YValue 
#      "149" "100.0000" 
#
# [[2]]
#    XValue    YValue 
#   "173.2" "96.2713"

The second is likely more efficient. It uses an XPath expression to select just the two relevant attributes.

xpathSApply(doc, "//ValuesPeaks//@*[name()='XValue' or name()='YValue']")
#    XValue     YValue     XValue     YValue 
#     "149" "100.0000"    "173.2"  "96.2713"
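
As a complement to the XPath approach above, each attribute can also be selected on its own with the @ prefix, which gives two parallel vectors that are easy to convert to numbers. This is only a sketch: the real file layout is not shown, so the two-peak sample and its Width attribute are made up for illustration. It also suggests why a predicate such as [XValue] matches nothing here: without @, XPath looks for a child element named XValue, not an attribute.

```r
library(XML)

# Hypothetical sample: the layout and the Width attribute are assumptions.
txt <- '<ValuesPeaks>
  <Peak XValue="149" YValue="100.0000" Width="0.5"/>
  <Peak XValue="173.2" YValue="96.2713" Width="0.7"/>
</ValuesPeaks>'
doc <- xmlParse(txt, asText = TRUE)

# [XValue] tests for a child element named XValue, so nothing matches.
no_match <- xpathSApply(doc, "//ValuesPeaks[XValue]", xmlValue)

# @XValue selects the attribute itself; as.numeric() drops the names.
xs <- as.numeric(xpathSApply(doc, "//ValuesPeaks/Peak/@XValue"))
ys <- as.numeric(xpathSApply(doc, "//ValuesPeaks/Peak/@YValue"))
```

Keeping the two attributes in separate vectors avoids having to untangle the interleaved XValue/YValue result afterwards, though it assumes every Peak carries both attributes.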

You could even do

sapply(unname(xmlToList(doc)), "[", c("XValue", "YValue"))
#        [,1]       [,2]     
# XValue "149"      "173.2"  
# YValue "100.0000" "96.2713"
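
If the goal is a table with one row per peak, the values can be collected into a data frame. This is a minimal sketch that assumes Peak elements carrying XValue and YValue attributes, as in the examples above; the sample text and the names get_num and peaks_df are illustrative only. xmlGetAttr() is given a default so that a peak lacking an attribute yields NA instead of breaking the column.

```r
library(XML)

# Hypothetical sample; the second Peak deliberately has no YValue.
txt <- '<ValuesPeaks>
  <Peak XValue="149" YValue="100.0000"/>
  <Peak XValue="173.2"/>
</ValuesPeaks>'
doc <- xmlParse(txt, asText = TRUE)

peaks <- getNodeSet(doc, "//ValuesPeaks/Peak")

# Read one attribute from every node, using NA where it is absent.
get_num <- function(nodes, name) {
  vals <- sapply(nodes, xmlGetAttr, name, default = NA_character_)
  as.numeric(vals)
}

peaks_df <- data.frame(XValue = get_num(peaks, "XValue"),
                       YValue = get_num(peaks, "YValue"))
```

With a few hundred sets, a numeric data frame like this is usually easier to filter or plot than the character matrix returned by sapply().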

Data:

txt <- '
  
  
'
library(XML)
doc <- xmlParse(txt)
