Imlerith

Reputation: 489

fread and read.table giving dissimilar outputs according to identical() in R

I have a text file containing space-delimited characters, for example:

a b c d e

I read this .txt file in two different ways and then compare the results as follows:

library(data.table)  # provides fread
a <- fread("C:/Users/user/Desktop/New Text Document.txt",header=FALSE,data.table=FALSE)
b <- read.table("C:/Users/user/Desktop/New Text Document.txt")

> identical(a,b)
[1] FALSE

Why does identical() return FALSE for these two readings when, as far as I can tell, a and b contain the same data, have the same class, and the same str? Is it a problem with fread or a problem with the identical function?

Upvotes: 0

Views: 125

Answers (1)

Ben Bolker

Reputation: 226182

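tl;dr: use as.is=TRUE to prevent read.table from converting strings to factors, or stringsAsFactors=TRUE to enable the same behaviour in fread. Use str() to figure this out. (Converting strings to factors was the read.table default before R 4.0.0; from 4.0.0 onwards, strings are left as character unless you ask otherwise.)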

writeLines("a b c d e",con="tmpabc.txt")
str(A <- read.table("tmpabc.txt",header=FALSE))
## 'data.frame':   1 obs. of  5 variables:
##  $ V1: Factor w/ 1 level "a": 1
##  $ V2: Factor w/ 1 level "b": 1
##  $ V3: Factor w/ 1 level "c": 1
##  $ V4: Factor w/ 1 level "d": 1
##  $ V5: Factor w/ 1 level "e": 1
str(B <- data.table::fread("tmpabc.txt",header=FALSE,data.table=FALSE))
## 'data.frame':   1 obs. of  5 variables:
##  $ V1: chr "a"
##  $ V2: chr "b"
##  $ V3: chr "c"
##  $ V4: chr "d"
##  $ V5: chr "e"

C <- read.table("tmpabc.txt",header=FALSE,as.is=TRUE)
identical(B,C)   ## TRUE
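
If it helps, here is a minimal base-R sketch of how the mismatch can be diagnosed without reading str output by eye. It is only an illustration, not part of the original session: it uses read.table for both readings and sets stringsAsFactors explicitly, so the result should not depend on the R version. The names tmp, as_factor and as_char are made up for this example.

```r
# Write the same one-line file to a temporary location (the path is arbitrary)
tmp <- tempfile(fileext = ".txt")
writeLines("a b c d e", con = tmp)

# Force each behaviour explicitly instead of relying on version-specific defaults
as_factor <- read.table(tmp, header = FALSE, stringsAsFactors = TRUE)
as_char   <- read.table(tmp, header = FALSE, stringsAsFactors = FALSE)

# Column classes show the mismatch at a glance
sapply(as_factor, class)
sapply(as_char, class)

# identical() is strict about column types, so this is FALSE
identical(as_factor, as_char)

# all.equal() describes what differs rather than returning a bare FALSE
all.equal(as_factor, as_char)

# Convert the factor columns to character, keeping the data frame structure,
# then compare again
as_factor[] <- lapply(as_factor, as.character)
identical(as_factor, as_char)
```

The point is that identical() compares column types as well as the printed values, so a factor column and a character column that print the same are still different objects.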

Upvotes: 4
