Retrieving table data from html doc in R

Question

I am trying to retrieve data on the most powerful brands from http://www.forbes.com/powerful-brands/list/#tab:rank. When my first attempts with getURL and htmlParse failed, I realised that the table data is loaded from some other link. To make things easier, I downloaded the HTML page and tried to retrieve the data from the local copy.
I initially tried using:

library(XML)
library(RCurl)
library(ggplot2)
forbes <- readHTMLTable("forbes.html", header = TRUE, as.data.frame = TRUE)
forbes

Now when I display forbes I get a list. I had thought I would get a data frame instead.

I looked through the list and found the data for the top 10 brands in forbes$the_list, but not the data for the remaining companies, i.e. those beyond the top 10.

How can I retrieve all the tabular data from the Forbes page, and how can I convert it to a data frame for further manipulation?

Please let me know if you need any further info.
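
On the list-versus-data-frame point: when readHTMLTable is given a whole document, it returns a list with one element per table it finds, so a single table has to be picked out of that list. Below is a minimal sketch of that step. The inline HTML string and the brand names are made-up stand-ins for the saved forbes.html, and only the table id the_list is taken from the question. It does not address the rows beyond the top 10, which may simply be absent from the saved page.

```r
library(XML)

# Hypothetical stand-in for the saved page: two small tables,
# one of them carrying the id "the_list" seen in the question.
html <- '<html><body>
<table id="the_list">
<tr><th>Rank</th><th>Brand</th></tr>
<tr><td>1</td><td>Brand A</td></tr>
<tr><td>2</td><td>Brand B</td></tr>
</table>
<table id="other">
<tr><th>x</th></tr>
<tr><td>1</td></tr>
</table>
</body></html>'

# Parse the text, then read every <table> in the document.
doc <- htmlParse(html, asText = TRUE)
tables <- readHTMLTable(doc, header = TRUE, stringsAsFactors = FALSE)

# The result is a list; inspect its names to see which tables were found.
print(class(tables))
print(names(tables))

# Pull out the single table of interest; this element is a data frame.
brands <- tables[["the_list"]]
str(brands)
```

With the real file, the same idea would be forbes[["the_list"]] (or forbes$the_list) on the object already created above.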
