Reputation: 413
I am having a hard time extracting values from an XML file. I would like to take the average of each "length" (the value between the <length> tags) for each hour of the day. In this XML file, all of the data comes from the same day: 2013-11-28.
An example is shown below:
<root>
<item>
<time>2013-11-28T00:00:00-05:00</time>
<day>2013-11-28</day>
<length>150</length>
</item>
<item>
<time>2010-11-28T00:15:00-05:00</time>
<day>2010-11-28</day>
<length>200</length>
</item>
<item>
<time>2010-11-28T00:30:00-05:00</time>
<day>2010-11-28</day>
<length>127.83</length>
</item>
</root>
I would like the output to look something like this:
hour average_length
12:00-12:59 some_average
1:00-1:59 some_average
2:00-2:59 some_average
Thank you!
Upvotes: 0
Views: 767
Reputation: 13680
Using the xml2 package, and assuming text holds XML like your example, we use xml_obj <- read_xml(text) to create an XML object.
We can navigate this object using the various functions in the library, which are described in the package documentation. In this particular case we want to find all the elements of type time and length and then bind them in a data.frame.
library(xml2)
library(magrittr)  # provides the %>% pipe

# Find all elements of type time
times <- xml_find_all(xml_obj, '//time') %>% xml_text()
# Find all elements of type length and convert the text to numbers
lengths <- xml_find_all(xml_obj, '//length') %>% xml_text() %>% as.numeric()
# Combine the two to create the final data.frame
final <- data.frame(time = times, length = lengths)
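To get from there to the per-hour averages the question asks for, one possible approach is sketched below. It is only an illustration under a few assumptions: the sample XML is a shortened variant of the question's data (the dates are made consistent and the third item is moved to 01:30 so that two hours appear), the hour is taken from fixed character positions of the ISO 8601 timestamp, and base R's aggregate() is used for the grouping, which is a choice made here and not part of the original answer.

```r
library(xml2)

# Illustrative sample in the shape of the question's data (values adjusted, see above)
text <- '<root>
<item><time>2013-11-28T00:00:00-05:00</time><day>2013-11-28</day><length>150</length></item>
<item><time>2013-11-28T00:15:00-05:00</time><day>2013-11-28</day><length>200</length></item>
<item><time>2013-11-28T01:30:00-05:00</time><day>2013-11-28</day><length>127.83</length></item>
</root>'

xml_obj <- read_xml(text)
times   <- xml_text(xml_find_all(xml_obj, "//time"))
lengths <- as.numeric(xml_text(xml_find_all(xml_obj, "//length")))

# Characters 12-13 of a timestamp like 2013-11-28T00:15:00-05:00 hold the local hour
final <- data.frame(hour = substr(times, 12, 13), length = lengths)

# Mean length for each hour of the day
hourly <- aggregate(length ~ hour, data = final, FUN = mean)
names(hourly)[2] <- "average_length"
print(hourly)
```

Because the hour is read directly from the string, it stays in the file's own UTC offset (-05:00 here). If the timestamps could carry different offsets, parsing them into a proper date-time first would be the safer route.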
Hope this helps
Upvotes: 1