user8195347

Reputation: 65

Why can't we plot this simple text against values in R?

I gleaned some information from an HTML table online using the XML package:

library("XML")
library("RCurl")
library("rlist")
theurl = getURL("http://www.victoria2wiki.com/Countries_table", .opts = list(ssl.verifypeer = FALSE))
tables <- readHTMLTable(theurl, as.data.frame = TRUE)

tables now holds a list containing information from the table on the page. Then we convert this list to a dataframe by using:

df <- do.call(rbind.data.frame, tables)

names(df) shows

[1] " Country\n"    " Tier\n"       " Population\n" " Literacy\n" 

df[,3] shows all of the population numbers. We tried to plot it using:

plot(df[,3])

but the graph is incorrect: it shows the population numbers on the X-axis and does not make sense.

How do we plot country names against their populations, given our simple R data frame? What we want is a simple line plot with populations on the Y-axis and country names on the X-axis.

Upvotes: 1

Views: 32

Answers (1)

Marco Sandri

Reputation: 24252

Here is a possible solution:

library("XML")
library("RCurl")
library("rlist")
theurl = getURL("http://www.victoria2wiki.com/Countries_table", .opts = list(ssl.verifypeer = FALSE))
tables <- readHTMLTable(theurl, as.data.frame = TRUE)

# tables is a list with two elements
# The data frame is stored in the second element of this list
df <- tables[[2]]
colnames(df) <- c("Country", "Tier", "Population", "Literacy")

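# Population is a factor whose labels contain commas, so strip the
# commas and convert the result to a numeric vector before plotting
population <- as.numeric(gsub(",", "", df$Population))

# Widen the left margin so the country names fit
par(mar = c(3, 7, 1, 1))
barplot(population, names.arg = df$Country,
        horiz = TRUE, las = 1, cex.names = 0.6)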
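
The conversion step matters because calling as.numeric() directly on a factor returns its internal level codes rather than the numbers it displays. A minimal sketch of the difference, using a made-up data frame (the name `toy` and its values are hypothetical, not taken from the wiki table):

```r
# Hypothetical toy data standing in for the scraped table
toy <- data.frame(
  Country    = c("A", "B", "C"),
  Population = c("1,200", "350", "47,000"),
  stringsAsFactors = TRUE  # mimic the factor column described above
)

# Misleading: as.numeric() on a factor yields the integer level codes
codes <- as.numeric(toy$Population)

# Intended: take the labels as text, drop the commas, then convert
pop <- as.numeric(gsub(",", "", as.character(toy$Population)))
```

Here `pop` holds the actual populations, while `codes` only holds small integers that index the factor levels.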

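
If a line plot is preferred, as the question asks, one possible approach in base graphics is to plot against positions and label the X-axis by hand. This is only a sketch on made-up values (`country`, `pop` and `idx` are hypothetical names); with the real table you would presumably pass df$Country and the converted population vector instead:

```r
# Hypothetical toy data; not taken from the wiki table
country <- c("A", "B", "C")
pop     <- c(1200, 350, 47000)

# One X position per country
idx <- seq_along(pop)

# Draw points joined by lines and suppress the default numeric X-axis
plot(idx, pop, type = "b", xaxt = "n", xlab = "", ylab = "Population")

# Label each X position with its country name, rotated to limit overlap
axis(side = 1, at = idx, labels = country, las = 2)
```

With many countries the X-axis labels are likely to crowd each other, which may be why a horizontal bar plot reads better for this table.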

Upvotes: 2
