Reputation: 1205
I'm trying to read a csv from the following link: http://databank.worldbank.org/data/download/GDP.csv
I have two problems:
I considered reading the table with read.fwf() to solve problems 1 and 2. However, I don't think that is a proper solution, because values within a column can vary in length (e.g. the Country column contains both "United States" and "Italy").
Upvotes: 2
Views: 2001
Reputation: 206187
Clearly this "CSV" file has been formatted to look pretty, not to actually be useful. The problem isn't that it uses different separators; it's that it has missing columns. How about cleaning it up with something like:
# skip the first 5 lines of the file and keep only columns 1, 2, 4 and 5
dd <- read.csv("http://databank.worldbank.org/data/download/GDP.csv", skip = 5, header = FALSE)[, c(1, 2, 4, 5)]
names(dd) <- c("CountryID", "Ranking", "Economy", "GDP")
dd <- dd[dd[, 1] != "", ]  # get rid of rows without IDs
head(dd)
#   CountryID Ranking        Economy        GDP
# 1       USA       1  United States 16,800,000
# 2       CHN       2          China  9,240,270
# 3       JPN       3          Japan  4,901,530
# 4       DEU       4        Germany  3,634,823
# 5       FRA       5         France  2,734,949
# 6       GBR       6 United Kingdom  2,522,261
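R doesn't parse numbers that contain commas as thousands separators, so the GDP column is not read as numeric. You'll probably also want to strip the commas and convert it: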
dd$GDP <- as.numeric(gsub(",","",dd$GDP))
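If you want to try the steps above without downloading anything, here is a minimal offline sketch. The `sample_lines` vector is a made-up stand-in for the file's layout (a few header lines, an empty third column, trailing rows without an ID); only the two data values are taken from the output shown above. Because this sample has three header lines, it uses `skip = 3` rather than `skip = 5`.

```r
# Hypothetical miniature of the file layout (an assumption, not the real file).
sample_lines <- c(
  "Gross domestic product,,,,",
  ",,,,",
  ",Ranking,,Economy,(millions of US dollars)",
  "USA,1,,United States,\"16,800,000\"",
  "CHN,2,,China,\"9,240,270\"",
  ",,,,",
  ",,,Footnote text,"
)

# Read the text, skipping the three header lines of this sample.
raw <- read.csv(text = paste(sample_lines, collapse = "\n"),
                skip = 3, header = FALSE, stringsAsFactors = FALSE)

# Keep the populated columns and give them names.
dd <- raw[, c(1, 2, 4, 5)]
names(dd) <- c("CountryID", "Ranking", "Economy", "GDP")

# Drop rows without a country ID (blank spacer and footnote rows).
dd <- dd[dd$CountryID != "", ]

# Remove the thousands separators and convert to numeric.
dd$GDP <- as.numeric(gsub(",", "", dd$GDP, fixed = TRUE))

str(dd)
```

After the conversion, `dd$GDP` can be used in arithmetic or sorting like any other numeric column.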
Upvotes: 3