Reputation: 1233
I'm trying to get sold dates from eBay by web scraping in R with rvest.
The url is url
literally
The full xpath to the first item sold date is: //*[@id="srp-river-results"]/ul/li[1]/div/div[2]/div[2]/div/span/span[1]
If I select that path and then call html_text() on it, I get nothing: character(0).
When I drop the trailing spans from the path and select the node with class POSITIVE instead, I get the date, but with a bunch of extra characters mixed in.
R code:
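library(rvest)  # also provides read_html() and the %>% pipe

readHTML <- url %>%
  read_html()

# Narrow to the first result, then take the node with class POSITIVE
SoldDate <- readHTML %>%
  html_nodes(xpath = '//*[@id="srp-river-results"]/ul/li[1]/div/div[2]/div[2]/div') %>%
  html_nodes("[class='POSITIVE']") %>%
  html_text(trim = TRUE)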
Result:
"SoYlPd N Feb 316,Z RM9USI2021"
I should get:
"Feb 16, 2021"
Upvotes: 0
Views: 54
Reputation: 1233
There are two great answers with more detail on this issue here: Rvest Split Data by Class Name where the class names change
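For anyone landing here first, a minimal sketch of the general idea, under some loud assumptions. The garbled result suggests the date is split across several child spans, with decoy spans interleaved between the real ones. The toy fragment below imitates that layout; the class names "vis" and "hid" are invented for illustration and are not eBay's real class names, which you would have to read off the live page. The approach is to keep only the spans with the visible class and paste their text together.

```r
library(rvest)

# Toy fragment imitating the assumed layout. The class names "vis" and "hid"
# are hypothetical placeholders, not the classes eBay actually uses.
fragment <- read_html('<div class="POSITIVE"><span class="vis">Feb </span><span class="hid">3</span><span class="vis">16, </span><span class="hid">RM9USI</span><span class="vis">2021</span></div>')

# Assumption: you have inspected the page and know which class is visible.
visible_class <- "vis"

sold_date <- fragment %>%
  html_nodes(paste0(".POSITIVE span.", visible_class)) %>%  # visible spans only
  html_text() %>%                                           # keep inner spacing
  paste(collapse = "") %>%                                  # rejoin the pieces
  trimws()

print(sold_date)
```

On the real page the selector and the visible class would both need adjusting, and since the question title in the link says the class names change, the visible class probably has to be worked out on each run rather than hard-coded.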
Upvotes: 0