Guy Dubrovski

Reputation: 1560

R: "incorrect number of dimensions" error after replacing read.csv with fread

I was loading my CSV file with plain read.csv:

baseData <- read.csv(datafile)

but since I want to load a larger dataset, I have moved to the data.table package:

baseData <- fread(input = paste("zcat < ", datafile, sep=""))

Everything seems to work fine, and the data loads much faster, but when I hit the following lines:

d <- baseData[baseData$some_prop==0,]
d <- d[!is.na(d[,"col"]) & (d[,"col"] == 0 | d[,"col"] == 1),]

I get an "incorrect number of dimensions" error.

When using read.csv everything works fine. Any idea what could be going wrong?

Upvotes: 0

Views: 469

Answers (1)

Tensibai

Reputation: 15784

In a data.table, the j part of the subsetting (the second argument inside [ ]) is evaluated as an expression and its value is returned. Column names should therefore not be quoted, or you'll get back exactly the quoted string.

Example:

> d <- data.table(A=1:6, B=5:10)
> d[,A]
[1] 1 2 3 4 5 6
> d[,B]
[1]  5  6  7  8  9 10
> d[,"B"]
[1] "B"

So for your particular case, removing the quotes around the column names should fix the error.
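As a minimal sketch of that fix, the filter from the question could be written as below. The sample data is hypothetical (only the column names some_prop and col come from the question), and inside [ ] the columns are referred to by bare name instead of through d[,"col"].

```r
library(data.table)

# Hypothetical stand-in for baseData; only the column names come from the question.
baseData <- data.table(some_prop = c(0, 0, 1, 0, 0),
                       col       = c(0, 1, 0, NA, 2))

# Column names are unquoted and evaluated inside the table itself.
d <- baseData[some_prop == 0]
d <- d[!is.na(col) & (col == 0 | col == 1)]

print(d)
```

The second filter could also be written as d[col %in% c(0, 1)], since %in% never returns NA.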

If your code is quite long and uses data.frame methods, you can call setDF(d) to make it work as-is until you refactor it.
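A short sketch of the setDF route, again on hypothetical data: after the conversion the object is a plain data.frame, so the original quoted-column code from the question runs unchanged.

```r
library(data.table)

# Hypothetical data; in the question this would be the result of fread().
d <- data.table(some_prop = c(0, 0, 1), col = c(0, NA, 1))

setDF(d)  # converts d to a data.frame by reference, without copying

# data.frame-style subsetting with quoted column names, as in the question
d <- d[d$some_prop == 0, ]
d <- d[!is.na(d[, "col"]) & (d[, "col"] == 0 | d[, "col"] == 1), ]

print(class(d))
```

If the whole script expects a data.frame, passing data.table = FALSE to fread should give the same result at load time.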

For completeness: the error arises because your logical condition has length 1 ("col" == whatever returns a single TRUE or FALSE), which does not match the number of rows of your data.table object.
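The length-1 condition can be reproduced in plain R without any table. This assumes, as described above, that d[,"col"] evaluated to the string "col"; newer data.table releases may treat a quoted name in j differently.

```r
# "col" is just a string here, not a column, so every comparison is scalar.
cond <- !is.na("col") & ("col" == 0 | "col" == 1)

print(cond)
print(length(cond))  # 1, regardless of how many rows the table has
```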

Upvotes: 1
