emehex

Reputation: 10548

Underlying hyperlink href address from css selector

This bit of code:

library(tidyverse)
library(rvest)

url <- "http://www.imdb.com/title/tt4116284/"

director <- read_html(url) %>% 
    html_nodes(".summary_text+ .credit_summary_item .itemprop") %>% 
    html_text()

That returns the plain-text value "Chris McKay" (the director of the new LEGO Batman Movie). The underlying hyperlink, however, points to: http://www.imdb.com/name/nm0003021?ref_=tt_ov_dr

That address is what I want. How can I adjust my CSS selector to grab the underlying href instead of the link text?

Upvotes: 1

Views: 77

Answers (1)

GGamba

Reputation: 13680

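Select the parent a tag instead of the inner .itemprop span, then take its href attribute with html_attr() rather than its text with html_text():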

director <- read_html(url) %>% 
    html_nodes(".summary_text+ .credit_summary_item span a") %>% 
    html_attr('href')
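
The href stored in the page may be relative (for example starting with /name/...), in which case it has to be resolved against the page URL to get the full address from the question. Below is a minimal offline sketch of that step. The HTML string is a hypothetical stand-in for the relevant part of the page (the real IMDb markup may differ), and xml2::url_absolute() is used on the assumption that the href is relative; it leaves already-absolute URLs unchanged.

```r
library(rvest)
library(xml2)

url <- "http://www.imdb.com/title/tt4116284/"

# Hypothetical markup for illustration only; not copied from the live page.
html <- '<div class="summary_text">Plot summary.</div>
<div class="credit_summary_item">
  <span><a href="/name/nm0003021?ref_=tt_ov_dr"><span class="itemprop">Chris McKay</span></a></span>
</div>'

page <- read_html(html)

# Same selector as above: the a tag inside the credit item's span.
links <- html_nodes(page, ".summary_text+ .credit_summary_item span a")

# The raw attribute value, exactly as written in the markup.
href <- html_attr(links, "href")

# Resolve the (possibly relative) href against the page URL.
director_url <- url_absolute(href, url)

# The link text is still available from the same nodes.
director_name <- html_text(links)
```

Keeping the a nodes in a variable makes it possible to pull both the name and the address from a single selection.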

Upvotes: 2
