Mohamed Yusuf

Reputation: 428

Can't extract href link from html_node in rvest

When I use the rvest package with an XPath to get the embedded links (the football team names) from the site, I get an empty result. Could someone help with this?

The code is as follows:

library(rvest)
 
url <- read_html('https://www.transfermarkt.com/premier-league/startseite/wettbewerb/GB1') 
    
xpath <- as.character('/html/body/div[2]/div[11]/div[1]/div[2]/div[2]/div')

url %>%
  html_node(xpath=xpath) %>% 
  html_attr('href')

Upvotes: 0

Views: 699

Answers (1)

Ronak Shah

Reputation: 388982

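The XPath in the question ends at a div, and a div has no href attribute of its own, so html_attr('href') has nothing to return. The links sit on the a elements inside the table cells, so select those instead. You can get all the links using: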

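library(rvest)

url <- 'https://www.transfermarkt.com/premier-league/startseite/wettbewerb/GB1'

url %>%
  read_html() %>%
  # select the <a> elements inside the team-name cells
  html_nodes('td.hauptlink a') %>%
  html_attr('href') %>%
  # drop placeholder anchors
  .[. != '#'] %>%
  # the hrefs are relative, so prepend the site root
  paste0('https://www.transfermarkt.com', .) %>%
  unique() %>%
  head(20)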
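
If you would rather keep using XPath, or want the team names next to the links, something along these lines may help. It is only a sketch run against a small inline HTML string rather than the live page: the markup, the team names and the paths in it are made up for illustration, and the real cells may carry additional classes, in which case contains(@class, "hauptlink") would be the safer test.

```r
library(rvest)

# Hypothetical stand-in for the page: a small table whose cells link to teams.
html <- '<table>
  <tr><td class="hauptlink"><a href="/team-a/startseite/verein/1">Team A</a></td></tr>
  <tr><td class="hauptlink"><a href="#">placeholder</a></td></tr>
  <tr><td class="hauptlink"><a href="/team-b/startseite/verein/2">Team B</a></td></tr>
</table>'

page <- read_html(html)

# The XPath targets the <a> elements themselves, not an enclosing <div>.
links <- html_nodes(page, xpath = '//td[@class="hauptlink"]/a')

# Keep the link text (team name) alongside the href.
teams <- data.frame(
  name = html_text(links, trim = TRUE),
  href = html_attr(links, 'href'),
  stringsAsFactors = FALSE
)

# Drop placeholder anchors and turn the relative hrefs into full URLs.
teams <- teams[teams$href != '#', ]
teams$url <- paste0('https://www.transfermarkt.com', teams$href)

print(teams)
```

The point is the same as above: html_attr('href') only returns something when the selected nodes are the a elements that actually carry the attribute.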

Upvotes: 1
