heinheo

Reputation: 565

efficiently parse binary input in R

In R, I have two rows of a data frame, with each digit stored in a separate column.

Currently I am using

unname(which(df[1,]-df[2,]==0))->hte

to find the positions where the two rows agree, i.e. where row 1 and row 2 are both 1 or both 0. That takes quite a bit of time for 70k columns.

Upvotes: 2

Views: 79

Answers (1)

akrun

Reputation: 887251

You could convert it to a matrix by taking the transpose, and then compare the two resulting columns. It seems to be fast:

 system.time({ m1 <- t(df1)
              which(m1[,1]==m1[,2])})
 #  user  system elapsed 
 #  0.31    0.00    0.31 

Or unlist each row into a plain vector before comparing:

 system.time(which(unlist(df1[1,])==unlist(df1[2,])))
 #   user  system elapsed 
 #  1.175   0.002   1.177 
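
Both variants avoid doing arithmetic on one-row slices of a very wide data frame. As a quick sanity check, here is a minimal sketch on a toy data frame showing that they return the same positions as the expression in the question. The name toy and its values are made up for illustration.

```r
# Toy stand-in for the real data: 2 rows, 6 columns of 0/1 values (hypothetical)
toy <- as.data.frame(rbind(c(1, 0, 1, 1, 0, 0),
                           c(1, 1, 0, 1, 0, 1)))

# Approach from the question: subtract the rows and look for zeros
orig <- unname(which(toy[1, ] - toy[2, ] == 0))

# Transpose to a matrix, then compare the two resulting columns
m <- t(toy)
via_matrix <- unname(which(m[, 1] == m[, 2]))

# Unlist each row into a plain vector, then compare
via_unlist <- unname(which(unlist(toy[1, ]) == unlist(toy[2, ])))

# All three should give the same column positions
stopifnot(identical(orig, via_matrix), identical(orig, via_unlist))
orig
# [1] 1 4 5
```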

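Data

library(stringi)
# Write two random strings of 70,000 binary digits, one per line
write.table(stri_rand_strings(2, 70000, '[0-1]'), file='binary1.txt', 
           row.names=FALSE, quote=FALSE, col.names=FALSE)
# Use awk to put a space after every character so that read.table
# places each digit in its own column
df1 <- read.table(pipe("awk '{gsub(/./,\"& \", $1);print $1}' binary1.txt"))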
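
If the digits start out as two text lines, as in the file above, it may be possible to skip the data frame altogether. The following is only a sketch of that alternative, not part of the timings above. It assumes the file holds exactly two lines of equal length containing only 0 and 1, and it uses a small temporary file with made-up contents in place of binary1.txt.

```r
# Write two short 0/1 strings to a temporary file (made-up example data)
tmp <- tempfile(fileext = ".txt")
writeLines(c("101100", "110101"), tmp)

# Read each line as one string, split it into single characters,
# and convert the characters to integers
lines <- readLines(tmp)
bits <- lapply(strsplit(lines, ""), as.integer)

# Positions where the two rows hold the same digit
same <- which(bits[[1]] == bits[[2]])
same
# [1] 1 4 5
```

This keeps each row as a plain integer vector, so the comparison is an ordinary vectorised operation.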

Upvotes: 2
