Reputation: 1
I am trying to scrape this page with rvest: https://www.kimovil.com/en/compare-smartphones/f_min_dm+unveileddate.3,i_b+slug.samsung
This is the code I am using:
library(rvest)
site <- 'https://www.kimovil.com/en/compare-smartphones/f_min_dm+unveileddate.3,i_b+slug.samsung'
website <- read_html(site)
device_label_html <- html_nodes(website,'div.device-name')
device_label <- html_text(device_label_html)
head(device_label,n=60)
When I run this code, I get only 40 results (phones), although the page should list 51. Can someone help me with this? Thank you.
Upvotes: -1
Views: 61
Reputation: 2414
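The site splits its results across pages, so the first URL returns only the first 40 phones; the rest are on a second page, reached by appending ",page.2" to the URL. There may be a more elegant way to do this, and I would definitely look for one if there were more than 2 pages, but this works: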
library(rvest)
# Page 1: the original URL
site <- 'https://www.kimovil.com/en/compare-smartphones/f_min_dm+unveileddate.3,i_b+slug.samsung'
website <- read_html(site)
device_label_html <- html_nodes(website,'div.device-name')
device_label <- html_text(device_label_html)
# Page 2: same URL with ",page.2" appended
site2 <- 'https://www.kimovil.com/en/compare-smartphones/f_min_dm+unveileddate.3,i_b+slug.samsung,page.2'
website2 <- read_html(site2)
device_label_html2 <- html_nodes(website2,'div.device-name')
device_label2 <- html_text(device_label_html2)
# Combine the names from both pages
head(c(device_label, device_label2),n=60)
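If there were more pages, the same idea could be wrapped in a loop. This is only a rough sketch: page_url and scrape_device_names are names made up for illustration, and it assumes the ",page.N" suffix keeps working beyond page 2, which I have not checked. I also don't know how the site responds to a page number past the end, so the loop stops on a request error, an empty page, or a page that adds no new names.

```r
# Hypothetical helper: build the URL for page n, assuming the site keeps
# using the ",page.N" suffix seen for page 2 (page 1 has no suffix).
page_url <- function(base, n) {
  if (n == 1) base else paste0(base, ",page.", n)
}

# Hypothetical helper: collect device names page by page.
# rvest functions are called with the rvest:: prefix, so the package
# must be installed but does not need to be attached.
scrape_device_names <- function(base, max_pages = 10) {
  labels <- character(0)
  for (n in seq_len(max_pages)) {
    # Stop if the request fails (for example, the page does not exist)
    page <- tryCatch(rvest::read_html(page_url(base, n)),
                     error = function(e) NULL)
    if (is.null(page)) break
    found <- rvest::html_text(rvest::html_nodes(page, "div.device-name"))
    # Stop on an empty page, or if the site just repeats names already seen
    if (length(found) == 0 || all(found %in% labels)) break
    labels <- c(labels, found)
  }
  labels
}

# Usage (makes network requests):
# site <- 'https://www.kimovil.com/en/compare-smartphones/f_min_dm+unveileddate.3,i_b+slug.samsung'
# device_label <- scrape_device_names(site)
# length(device_label)
```

The max_pages cap is there so that a site which never returns an empty page cannot keep the loop running indefinitely.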
Upvotes: 1