Nancy

Reputation: 4109

Counting XML nodes in R

I'm trying to parse this XML table, but I'm having trouble counting the number of "var" nodes. My code so far is below. I would like to replace the hard-coded 16597 with a value computed from the data, so that I can reuse this code for other, similar tables. I need to do this in R, not in XPath.

require(RCurl)
require(XML)
url = "http://api.census.gov/data/2000/sf3/variables.xml"
doc = xmlParse(url)
root = xmlRoot(doc)
xml.data = xmlToList(doc)

id = NULL
label = NULL
concept = NULL
for(i in 1:16597){
  id[i] = xml.data[[1]][[(i+2)]][["id"]]
  label[i] = xml.data[[1]][[(i+2)]][["label"]]
  concept[i] = xml.data[[1]][[(i+2)]][["concept"]]
}

scraped.data = data.frame(id, label, concept)

Based on another question, I tried the following, but it returned 0.

doc <- xmlTreeParse(url)
xpathApply(xmlRoot(doc),path="count(//vars)",xmlValue)

Where is my misunderstanding?

Upvotes: 0

Views: 2118

Answers (1)

Chris S.

Reputation: 2225

You can avoid the loop and just "rbind" your list.

library(plyr)  # ldply() comes from the plyr package
y <- ldply(xml.data[[1]], "rbind")
dim(y)
[1] 16599     6
head(y)
  .id        id                                                                                                                                  label
1 var       for                                                                                                           Census API FIPS 'for' clause
2 var        in                                                                                                            Census API FIPS 'in' clause
3 var PCT022034               Total:  Not living in an MSA/PMSA in 2000:  Different house in 1995:  In United States in 1995:  In an MSA/PMSA in 1995:
4 var PCT022035 Total:  Not living in an MSA/PMSA in 2000:  Different house in 1995:  In United States in 1995:  In an MSA/PMSA in 1995:  Central city
5 var PCT022032                                                                   Total:  Not living in an MSA/PMSA in 2000:  Different house in 1995:
6 var PCT022033                                        Total:  Not living in an MSA/PMSA in 2000:  Different house in 1995:  In United States in 1995:
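
If all you need is the count itself, one option is to count the list elements named "var" rather than hard-coding 16597. The sketch below is only an illustration: xml_text is a made-up miniature of the Census file (two API clauses plus three variables), and n_var, keep and get_attr are names I picked for the example. It assumes, as in the question, that the first element of the parsed list holds the var entries. Note also that count(//vars) counts elements named "vars", not "var".

```r
library(XML)

# Hypothetical stand-in for the Census document, so the example runs offline.
xml_text <- paste0(
  "<census-api><vars>",
  "<var id='for' label='FIPS for clause' concept='clause'/>",
  "<var id='in' label='FIPS in clause' concept='clause'/>",
  "<var id='P001001' label='Total' concept='P1'/>",
  "<var id='P001002' label='Urban' concept='P1'/>",
  "<var id='P001003' label='Rural' concept='P1'/>",
  "</vars></census-api>"
)

doc <- xmlParse(xml_text, asText = TRUE)
xml.data <- xmlToList(doc)

# Same container the question indexes into with xml.data[[1]].
vars <- xml.data[[1]]

# Count the "var" entries by name instead of hard-coding the number.
n_var <- sum(names(vars) == "var")

# Keep only the "var" entries and pull one attribute at a time.
keep <- vars[names(vars) == "var"]
get_attr <- function(field) {
  vapply(keep, function(v) v[[field]], character(1), USE.NAMES = FALSE)
}

scraped.data <- data.frame(
  id = get_attr("id"),
  label = get_attr("label"),
  concept = get_attr("concept"),
  stringsAsFactors = FALSE
)
```

Here n_var includes the two leading clause entries ("for" and "in"), which the loop in the question skips with i+2, so subtract 2 or drop those rows if you only want the variables. vapply() with character(1) will stop with an error if an entry lacks one of the attributes, which may or may not be what you want for the real file.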

Upvotes: 1
