Reputation: 675
Does anyone know how to convert the following XML into an R data frame?
<?xml version="1.0"?>
<soap:Envelope>
<soap:Body>
<getCampaignsResponse>
<getCampaignsResult>
<campaign>
<categoryBids>
<categoryBid>
<campaignCategoryUID>1234</campaignCategoryUID>
<campaignID>1211</campaignID>
<categoryID>1254</categoryID>
<selected>true</selected>
<bidInformation>
<biddingStrategy>Cpc</biddingStrategy>
<cpcBid>
<cpc>0.5</cpc>
</cpcBid>
<cpaBid xsi:nil="true"/>
</bidInformation>
</categoryBid>
<categoryBid>
<campaignCategoryUID>5487</campaignCategoryUID>
<campaignID>3244</campaignID>
<categoryID>1234</categoryID>
<selected>true</selected>
<bidInformation>
<biddingStrategy>Cpc</biddingStrategy>
<cpcBid>
<cpc>0.2</cpc>
</cpcBid>
<cpaBid xsi:nil="true"/>
</bidInformation>
</categoryBid>
</categoryBids>
</campaign>
</getCampaignsResult>
</getCampaignsResponse>
</soap:Body>
</soap:Envelope>
The class of the XML Object is:
> str(data)
Classes 'XMLInternalDocument', 'XMLAbstractDocument' <externalptr>
The data frame should have the following columns:
campaignCategoryUID
campaignID
categoryID
biddingStrategy
cpc
With xmlToDataFrame or xmlToList I couldn't achieve useful results. Any help is really appreciated!
Upvotes: 0
Views: 618
Reputation: 78852
You have to extract the nodes by hand with something like xpathSApply, and you will probably need to change the way you parse the response, since the sample uses the soap: and xsi: prefixes without any namespace definitions:
library(XML)
xml <- '<?xml version="1.0"?>
<soap:Envelope>
<soap:Body>
<getCampaignsResponse>
<getCampaignsResult>
<campaign>
<categoryBids>
<categoryBid>
<campaignCategoryUID>1234</campaignCategoryUID>
<campaignID>1211</campaignID>
<categoryID>1254</categoryID>
<selected>true</selected>
<bidInformation>
<biddingStrategy>Cpc</biddingStrategy>
<cpcBid>
<cpc>0.5</cpc>
</cpcBid>
<cpaBid xsi:nil="true"/>
</bidInformation>
</categoryBid>
<categoryBid>
<campaignCategoryUID>5487</campaignCategoryUID>
<campaignID>3244</campaignID>
<categoryID>1234</categoryID>
<selected>true</selected>
<bidInformation>
<biddingStrategy>Cpc</biddingStrategy>
<cpcBid>
<cpc>0.2</cpc>
</cpcBid>
<cpaBid xsi:nil="true"/>
</bidInformation>
</categoryBid>
</categoryBids>
</campaign>
</getCampaignsResult>
</getCampaignsResponse>
</soap:Body>
</soap:Envelope>'
doc <- xmlRoot(xmlTreeParse(xml, useInternalNodes = TRUE))
data <- data.frame(campaignCategoryUID=xpathSApply(doc, "//campaignCategoryUID", xmlValue),
campaignID=xpathSApply(doc, "//campaignID", xmlValue),
categoryID=xpathSApply(doc, "//categoryID", xmlValue),
biddingStrategy=xpathSApply(doc, "//biddingStrategy", xmlValue),
cpc=xpathSApply(doc, "//cpc", xmlValue))
data
## campaignCategoryUID campaignID categoryID biddingStrategy cpc
## 1 1234 1211 1254 Cpc 0.5
## 2 5487 3244 1234 Cpc 0.2
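If the real response does declare its namespaces, which a SOAP service normally would, a plain path such as "//campaignID" may stop matching, because the elements then live in a namespace. A minimal sketch of how that could be handled, assuming a made-up namespace URI (http://example.com/api) and a hypothetical prefix name (api) that you would replace with whatever the service actually declares:

```r
library(XML)

# Hypothetical response: the namespace URI on getCampaignsResponse is invented
# for illustration and will differ for a real service.
xml_ns <- '<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <getCampaignsResponse xmlns="http://example.com/api">
      <campaignID>1211</campaignID>
      <campaignID>3244</campaignID>
    </getCampaignsResponse>
  </soap:Body>
</soap:Envelope>'

doc_ns <- xmlParse(xml_ns, asText = TRUE)

# An unprefixed path only matches elements that are in no namespace,
# so nothing is found here.
plain <- xpathSApply(doc_ns, "//campaignID", xmlValue)

# Bind a prefix of our own choosing to the namespace URI and use it in the path.
ns <- c(api = "http://example.com/api")
ids <- xpathSApply(doc_ns, "//api:campaignID", xmlValue, namespaces = ns)

# Alternative that ignores namespaces altogether.
ids_local <- xpathSApply(doc_ns, "//*[local-name()='campaignID']", xmlValue)
```

The prefix in the XPath expression does not have to match the one used in the document; only the URI has to agree.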
You can also do the extraction functionally:
nodes <- c("campaignCategoryUID", "campaignID", "categoryID", "biddingStrategy", "cpc")
data <- rbind.data.frame(sapply(nodes, function(x) xpathSApply(doc, sprintf("//%s", x), xmlValue)))
provided you don't need to deal with edge cases (i.e. provided every node appears exactly once in each categoryBid, so that all the extracted vectors have the same length).
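If some bids might lack one of the elements, one option is to loop over the categoryBid nodes and look up each field relative to that node, filling in NA where nothing is found. This is only a sketch: the sample XML is made up (the second bid has no cpc element), and first_value is a hypothetical helper, not part of the XML package.

```r
library(XML)

# Made-up sample: the second categoryBid has no <cpc> element.
xml_gap <- '<categoryBids>
  <categoryBid>
    <campaignCategoryUID>1234</campaignCategoryUID>
    <campaignID>1211</campaignID>
    <categoryID>1254</categoryID>
    <bidInformation>
      <biddingStrategy>Cpc</biddingStrategy>
      <cpcBid><cpc>0.5</cpc></cpcBid>
    </bidInformation>
  </categoryBid>
  <categoryBid>
    <campaignCategoryUID>5487</campaignCategoryUID>
    <campaignID>3244</campaignID>
    <categoryID>1234</categoryID>
    <bidInformation>
      <biddingStrategy>Cpa</biddingStrategy>
    </bidInformation>
  </categoryBid>
</categoryBids>'

doc_gap <- xmlParse(xml_gap, asText = TRUE)
fields <- c("campaignCategoryUID", "campaignID", "categoryID",
            "biddingStrategy", "cpc")

# Hypothetical helper: text of the first descendant called `name`,
# or NA when the node has no such descendant.
first_value <- function(node, name) {
  hits <- xpathSApply(node, sprintf(".//%s", name), xmlValue)
  if (length(hits) == 0) NA_character_ else hits[[1]]
}

# Build one single-row data frame per bid, then stack them.
bids <- getNodeSet(doc_gap, "//categoryBid")
rows <- lapply(bids, function(bid) {
  vals <- vapply(fields, function(f) first_value(bid, f), character(1))
  as.data.frame(as.list(vals), stringsAsFactors = FALSE)
})
bid_data <- do.call(rbind, rows)

# xmlValue returns text, so convert numeric columns explicitly.
bid_data$cpc <- as.numeric(bid_data$cpc)
```

Because every row is built from a single categoryBid, a missing element cannot shift values into the wrong row, which is the risk when the columns are extracted independently.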
Upvotes: 1