Reputation: 803
I would like to read a text table from a URL. The table has 3 columns; the second is a character column whose values can contain several words, enclosed in quotation marks. The data are not publicly accessible, so I cannot give the link here, but here is an example of what the data look like when you open the HTTP link:
col1 "column second"
col3
1 "a city name" 2323
20 second 4343
30 "third row" 43434
'col1', '"column second"' and 'col3' are the column names, and this is how the header looks at the real URL. I tried several read functions, such as read_delim(), readline(), read.table and fread, but none of them read the data correctly. When I download the data, or copy and paste it into a file, it works without any problem, but it fails when I read directly from the URL. The problem is with the "" in the second column. For example, if I set sep=" ", the first row of the data has 5 columns, the second row 3 columns and the third row 4 columns.
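The column counts described above can be reproduced on the sample data with base R's count.fields(). This is only a sketch: the temporary file and the name sample_lines are stand-ins for the private URL, not part of the real setup.

```r
# Sample data from the example above, one element per line (stand-in for the URL)
sample_lines <- c('col1 "column second"',
                  'col3',
                  '1 "a city name" 2323',
                  '20 second 4343',
                  '30 "third row" 43434')
tmp <- tempfile(fileext = ".txt")
writeLines(sample_lines, tmp)

# Quotes ignored: every space splits a field, so the rows have unequal lengths
n_unquoted <- count.fields(tmp, sep = " ", quote = "")

# Double quotes honoured: each quoted phrase counts as a single field
n_quoted <- count.fields(tmp, sep = " ", quote = "\"")

print(n_unquoted)
print(n_quoted)
```

If the quoted counts are consistent on the file but not on the URL, the difference probably lies in what the URL actually returns rather than in the quoting itself.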
I appreciate your kind help.
Upvotes: 0
Views: 462
Reputation: 803
The answer by Grothendieck is perfect. I just found another solution, https://www.r-bloggers.com/getting-data-from-an-online-source/, for those who might be interested in reading tables from a URL.
library(RCurl)
# The url link provided in the comment by Grothendieck
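url <- 'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/archived_data/archived_daily_case_updates/02-12-2020_1020.csv'
# Download the page contents as a single string (SSL verification disabled, as in the linked post)
myfile <- getURL(url, ssl.verifyhost=FALSE, ssl.verifypeer=FALSE)
# Parse the downloaded text as CSV
mydat <- read.csv(textConnection(myfile), header=TRUE)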
head(mydat)
The problem with the URL in my question was that the data were not served in a raw format; they were more like a file on OneDrive or Google Drive. That could be a separate question, but you are welcome to share answers or links here on how to read that type of data.
Upvotes: 0
Reputation: 269860
Use scan to read the data into a character vector s, and reshape all but the first 3 elements into a matrix and then a data frame DF, using those 3 elements as the column names. Finally, convert the types of each column of DF. We have used scan to read from Lines, shown in the Note at the end, but it can also read from a file or connection using the file= argument of scan. No packages are used.
s <- scan(text = Lines, what = "", quiet = TRUE)
DF <- setNames(as.data.frame(matrix(tail(s, -3),, 3, byrow = TRUE)), s[1:3])
DF[] <- lapply(DF, type.convert)
giving:
> DF
  col1 column second  col3
1    1   a city name  2323
2   20        second  4343
3   30     third row 43434
Input in reproducible form:
Lines <- 'col1 "column second"
col3
1 "a city name" 2323
20 second 4343
30 "third row" 43434'
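The file= route mentioned above can be sketched as follows. The helper name read_quoted_table and the temporary file are hypothetical, used here only so the example runs without the private URL; according to the scan documentation, file= also accepts a complete URL or a connection, but that is untested against the asker's link.

```r
# Hypothetical helper (not part of the original answer): the same scan() steps,
# wrapped so the source can be a file path, a URL or a connection.
read_quoted_table <- function(src, ncol = 3) {
  # Whitespace-separated tokens; quoted phrases are kept as one token
  s <- scan(file = src, what = "", quiet = TRUE)
  # First `ncol` tokens are the header, the rest fill the rows
  body <- matrix(tail(s, -ncol), ncol = ncol, byrow = TRUE)
  DF <- setNames(as.data.frame(body, stringsAsFactors = FALSE), head(s, ncol))
  # Convert each column to its natural type, keeping text as character
  DF[] <- lapply(DF, type.convert, as.is = TRUE)
  DF
}

# Stand-in for the real source: a temporary file holding the sample data
tmp <- tempfile(fileext = ".txt")
writeLines(c('col1 "column second"',
             'col3',
             '1 "a city name" 2323',
             '20 second 4343',
             '30 "third row" 43434'), tmp)

DF <- read_quoted_table(tmp)
str(DF)
```

Passing as.is = TRUE keeps the text column as character instead of a factor and avoids the warning that newer R versions give when type.convert is called without it.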
Upvotes: 1