I have a data frame that I want to export to CSV and then re-import as a data frame. On import, one column is corrupted: the trailing period is removed from the strings and they are interpreted as numeric.
Here is a minimal example:
df <- data.frame(integers = c(1:8, NA, 10L),
                 doubles = as.numeric(paste0(c(1:7, NA, 9, 10), ".1")),
                 strings = paste0(c(1:10), ".")
)
df
str(df) # here the last column is "chr"
write.table(df,
            file = "df.csv",
            sep = "\t",
            na = "NA",
            row.names = FALSE,
            col.names = TRUE,
            fileEncoding = "UTF-8"
)
df <- read.table(file = "df.csv",
                 header = TRUE,
                 sep = "\t",
                 na.strings = "NA",
                 quote = "\"",
                 fileEncoding = "UTF-8"
)
df
str(df) # here the last column is "num"
Upvotes: 3
With read.table, we can specify colClasses, using the atomic modes listed in ?vector:
The atomic modes are "logical", "integer", "numeric" (synonym "double"), "complex", "character" and "raw".
The issue is that, according to the colClasses entry in ?read.table, read.table uses type.convert to guess the type of each column automatically when colClasses is not specified:
Unless colClasses is specified, all columns are read as character columns and then converted using type.convert to logical, integer, numeric, complex or (depending on as.is) factor as appropriate.
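To see this conversion in isolation, here is a small sketch that calls type.convert directly on strings like the ones in the question. The variable names are only for illustration, and as.is = TRUE is an assumption made here to keep non-numeric results as character rather than factor.

```r
# Strings shaped like the "strings" column in the question
x <- paste0(1:3, ".")                     # "1." "2." "3."

# Every element parses as a number, so type.convert returns a numeric vector
converted <- type.convert(x, as.is = TRUE)
str(converted)

# If at least one element is not a valid number, the vector stays character
mixed <- type.convert(c("1.", "a"), as.is = TRUE)
str(mixed)
```

Because every value in the column happens to look like a valid number, the automatic guess turns the whole column numeric and the trailing period is lost.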
The relevant code in read.table would be
...
do[1L] <- FALSE
for (i in (1L:cols)[do]) {
    data[[i]] <- if (is.na(colClasses[i]))
        type.convert(data[[i]], as.is = as.is[i], dec = dec,
                     numerals = numerals, na.strings = character(0L))
    else if (colClasses[i] == "factor")
        as.factor(data[[i]])
    else if (colClasses[i] == "Date")
        as.Date(data[[i]])
    else if (colClasses[i] == "POSIXct")
        as.POSIXct(data[[i]])
    else methods::as(data[[i]], colClasses[i])
}
...
df <- read.table(file = "df.csv",
                 header = TRUE,
                 sep = "\t",
                 na.strings = "NA",
                 quote = "\"",
                 fileEncoding = "UTF-8",
                 colClasses = c("integer", "numeric", "character")
)
Checking the structure:
str(df)
'data.frame': 10 obs. of 3 variables:
$ integers: int 1 2 3 4 5 6 7 8 NA 10
$ doubles : num 1.1 2.1 3.1 4.1 5.1 6.1 7.1 NA 9.1 10.1
$ strings : chr "1." "2." "3." "4." ...
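As a possible variation, ?read.table also documents that colClasses may be a named vector, in which case columns that are not named are still guessed automatically. The following is a minimal sketch under that assumption; it uses a temporary file and a reduced two-column data frame (tmp, df_small and df_back are names introduced here, not part of the question) so it can be run on its own.

```r
# Reduced version of the data frame from the question
df_small <- data.frame(integers = c(1:8, NA, 10L),
                       strings  = paste0(1:10, "."),
                       stringsAsFactors = FALSE)

# Round-trip through a temporary tab-separated file
tmp <- tempfile(fileext = ".csv")
write.table(df_small, file = tmp, sep = "\t", row.names = FALSE)

# Only "strings" is pinned to character; "integers" is still auto-detected
df_back <- read.table(tmp,
                      header = TRUE,
                      sep = "\t",
                      colClasses = c(strings = "character"))
str(df_back)
unlink(tmp)
```

This can be convenient for wide files where only one or two columns are misread, since the remaining column types do not have to be spelled out.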
Upvotes: 3