Reputation: 1560
I was loading my csv file with plain:
baseData <- read.csv(datafile)
but since I want to load larger datasets, I have moved to the data.table package:
baseData <- fread(input = paste("zcat < ", datafile, sep=""))
Everything seems to work fine, and the data loads much faster, but when I hit the following lines:
d <- baseData[baseData$some_prop==0,]
d <- d[!is.na(d[,"col"]) & (d[,"col"] == 0 | d[,"col"] == 1),]
I get an "incorrect number of dimensions" error. When using read.csv, everything works fine.
Any idea what is going wrong?
Upvotes: 0
Views: 469
Reputation: 15784
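In a data.table, the j part of the subsetting is evaluated as an expression that returns a new value, so column names should not be quoted: a quoted name is just a string constant, and you get exactly that string back. (Since data.table 1.9.8, a quoted name in j is treated as a column selection and returns a one-column data.table instead.)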
Example:
> d <- data.table(A = 1:6, B = 5:10)
> d[, A]
[1] 1 2 3 4 5 6
> d[, B]
[1]  5  6  7  8  9 10
> d[, "B"]
[1] "B"
So for your particular case, removing the quotes around the column names should fix the error.
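As a minimal sketch of that fix, the two filtering lines from the question could be written with bare column names. The table below is a hypothetical stand-in for baseData; only the column names some_prop and col come from the question, and the values are made up for illustration.

```r
library(data.table)

# Hypothetical stand-in for baseData (values invented for illustration).
baseData <- data.table(
  some_prop = c(0, 0, 0, 1, 0),
  col       = c(0, 1, NA, 1, 2)
)

# Inside [ ], columns are referenced by bare name, without quotes or d$.
d <- baseData[some_prop == 0]
d <- d[!is.na(col) & (col == 0 | col == 1)]

# Same filter in one step; %in% treats NA as "no match",
# so the explicit is.na() check is not needed here.
d2 <- baseData[some_prop == 0 & col %in% c(0, 1)]
```

Both d and d2 keep only the rows where some_prop is 0 and col is 0 or 1.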
If your code is quite long and relies on data.frame methods, you can call setDF(d) to convert the object to a data.frame so that the code works as-is until you refactor it.
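A short sketch of that workaround, again on made-up data: setDF() converts the object to a data.frame by reference, after which the quoted-column code from the question behaves as it did with read.csv.

```r
library(data.table)

# Hypothetical data; only the column names come from the question.
d <- data.table(some_prop = c(0, 0, 1), col = c(0, NA, 1))

# Converts d to a plain data.frame in place (no copy, no reassignment needed).
setDF(d)

# The original data.frame-style code now runs unchanged.
d <- d[d$some_prop == 0, ]
d <- d[!is.na(d[, "col"]) & (d[, "col"] == 0 | d[, "col"] == 1), ]
```

Note that this gives up data.table's features for that object; setDT(d) converts it back when the code has been refactored.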
For completeness, the error arises because your logical statement is of length 1 ("col" == whatever returns just one value, TRUE or FALSE), which does not match the number of rows of your data.table object.
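The length mismatch can be seen in plain R, assuming (as described above) that d[, "col"] evaluates to the string "col" rather than to the column:

```r
# Comparing a string constant to a number coerces the number to "0"
# and yields a single logical value, not one value per row.
cond <- "col" == 0

length(cond)  # a single element
cond          # FALSE, because "col" is not equal to "0"
```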
Upvotes: 1