Randomly selecting between two columns of data in a table in R

Question

I have a table containing scores for subjects who each took two versions of a test. I would like to write code that randomly selects, for each subject, which version of the test to include in the final table and which to discard. Here is some example data:

ID      test1    test2
38762   21       36
37874   17       20
37813   15       17
37738   23       31
37470   25       36
37308   31       32
37039   25       16
36045   16        9

I need this to be as close to random as possible, so any help would be greatly appreciated.

Thanks in advance

EDIT: Desired output:

row.names   ID  test1
    67  38762   21
    218 36045   16


row.names   ID  test2
    108 37874   20
    114 37813   17
    117 37738   31
    140 37470   36
    152 37308   32
    175 37039   16

Michael Kaiser · Accepted Answer

You could do something like this: start by putting your three columns into a data frame, if they aren't already. Then subset that data frame according to a randomly generated vector of 0s and 1s.

 # data.frame() rather than cbind(), which would return a matrix
 df <- data.frame(ID, test1, test2)
 # make a vector of 0s and 1s with length = number of rows of df
 ran <- sample(c(0, 1), nrow(df), replace = TRUE)

 # rows with ran == 0 keep test1; rows with ran == 1 keep test2
 group1 <- subset(df, subset = ran == 0, select = c(ID, test1))
 group2 <- subset(df, subset = ran == 1, select = c(ID, test2))
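If it helps, here is a self-contained sketch of the same idea using the example data from the question. The names `pick` and `selected` and the seed value are my own assumptions, not part of the original data. The `set.seed()` call is only there to make the draw repeatable; drop it if you want a fresh draw on every run. The last step is an optional variant that keeps everything in one table and records which version was chosen for each ID.

```r
# Example data copied from the question
df <- data.frame(
  ID    = c(38762, 37874, 37813, 37738, 37470, 37308, 37039, 36045),
  test1 = c(21, 17, 15, 23, 25, 31, 25, 16),
  test2 = c(36, 20, 17, 31, 36, 32, 16, 9)
)

# Assumption: a fixed seed, used only so the random draw is reproducible
set.seed(42)

# Draw one version label per row; each version has probability 1/2
pick <- sample(c("test1", "test2"), nrow(df), replace = TRUE)

# Two separate tables, matching the layout of the desired output
group1 <- df[pick == "test1", c("ID", "test1")]
group2 <- df[pick == "test2", c("ID", "test2")]

# Optional variant: a single table with the chosen score and its source
selected <- data.frame(
  ID      = df$ID,
  version = pick,
  score   = ifelse(pick == "test1", df$test1, df$test2)
)
```

Note that the group sizes will generally be unequal, since each row is drawn independently. Every ID ends up in exactly one of `group1` or `group2`.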
