Reputation:
I have this data frame. I want to extract the id value of the provider whose name is goldman. Note that some rows have no goldman provider, so the result for those rows should be NA.
df <-
data.frame(
id = c(1, 2, 3),
xml = c(
as.character(
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<response>
<provider name=\"bank_of_ammerica\">
<success>true</success>
<id>12</id>
</provider>
<provider name=\"goldman\">
<success>true</success>
<id>13</id>
</provider>
</response>",
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<response>
<provider name=\"bank_of_ammerica\">
<success>true</success>
<id>12</id>
</provider>
<provider name=\"goldman\">
<success>true</success>
<id>16</id>
</provider>
</response>",
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>
<response>
<provider name=\"bank_of_ammerica\">
<success>true</success>
<id>12</id>
</provider>
</response>"
)
)
)
So the result should be:
result <-
data.frame(
id = c(1:3),
id_val = c(13, 16, NA_integer_)
)
Upvotes: 0
Views: 55
Reputation: 79208
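Remove the as.character call; it messes with everything. as.character only converts its first argument, so only the first XML string is kept and it gets recycled across all three rows.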
Anyway. You could do:
library(tidyverse)
library(rvest)
library(xml2)
# For each XML string, take the text of <id> under the goldman provider (NA if absent)
df %>%
  mutate(id_Val = map_chr(xml, ~ as_xml_document(.x) %>%
    html_node("provider[name=goldman] id") %>%
    html_text())) %>%
  select(-xml)
  id id_Val
1  1     13
2  2     16
3  3   <NA>
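Note that id_Val above is a character column. If you want numbers, as in your expected result, here is a minimal sketch that uses xml2 alone with an XPath query. The names goldman_id and xml_strings are made up for illustration, and the XML is shortened from your example, so treat it as one possible approach rather than the only way:

```r
library(xml2)

# Shortened sample data (assumption): one string with a goldman provider, one without.
xml_strings <- c(
  "<response><provider name=\"bank_of_ammerica\"><id>12</id></provider><provider name=\"goldman\"><id>13</id></provider></response>",
  "<response><provider name=\"bank_of_ammerica\"><id>12</id></provider></response>"
)

# Hypothetical helper: returns the goldman id as a number, or NA when there is no such provider.
goldman_id <- function(x) {
  doc <- read_xml(x)
  # xml_find_first() returns a "missing" node when nothing matches,
  # and xml_text() turns that into NA_character_.
  node <- xml_find_first(doc, "//provider[@name='goldman']/id")
  as.numeric(xml_text(node))
}

ids <- vapply(xml_strings, goldman_id, numeric(1), USE.NAMES = FALSE)
ids
```

On your data frame this would be something like df$id_val <- vapply(df$xml, goldman_id, numeric(1), USE.NAMES = FALSE), once the as.character call has been removed so that xml really holds three strings.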
Upvotes: 1