Sininho

Reputation: 296

Sort list on numeric values stored as factor

I have 4 data frames with data from different experiments, where each row represents a trial. The participant's ID (SID) is stored as a factor. Each of the data frames looks like this:

Experiment 1:

SID     trial   measure
 5402       1  0.6403791
 5402       2 -1.8515095
 5402       3 -4.8158912
25403       1         NA
25403       2 -3.9424822
25403       3 -2.2100059

I want to make a new data frame with the IDs of the participants in each of the experiments, for example:

   Exp1        Exp2        Exp3        Exp4
    5402       22081       22160       25434
   25403       22069       22179       25439
   25485       22115       22141       25408
   25457       22120       22185       25445
   28041       22448       22239       25473
   29514       22492       22291       25489

I want each column to be ordered as numbers, that is, 2 comes before 10.

I used unique() to extract the participant IDs (SID) in each data frame, but I am having problems ordering the columns.

I tried using:

data.frame(order(unique(df1$SID)),
           order(unique(df2$SID)),
           order(unique(df3$SID)),
           order(unique(df4$SID)))

and I get (without the column names):

38  60  16  32  15
2   9   41  14  41
3   33  5   30  62
4   51  11  18  33

I'm sorry if I am missing something very basic, I am still very new to R.

Thank you for any help!

Edit: I tried the solutions in the comments, and now I have:

x<-cbind(sort(as.numeric(unique(df1$SID)),decreasing = F), 
         sort(as.numeric(unique(df2$SID)),decreasing = F), 
         sort(as.numeric(unique(df3$SID)),decreasing = F), 
         sort(as.numeric(unique(df4$SID)),decreasing = F) )

Still does not work... I get:

    V1  V2  V3  V4
    8   6   5   2   
2   9   35  11  3
3   10  37  17  184
4   13  38  91  185
5   15  39  103 186

The subject id's are 3 to 5 digit numbers...

Upvotes: 1

Views: 625

Answers (2)

C8H10N4O2

Reputation: 19005

If your data looks like this:

df <- read.table(text="
  SID     trial   measure
 5402       1  0.6403791
 5402       2 -1.8515095
 5402       3 -4.8158912
25403       1         NA
25403       2 -3.9424822
25403       3 -2.2100059",
header=TRUE, colClasses = c("factor","integer","numeric"))

I would do something like this:

df <- df[order(as.numeric(as.character(df$SID)), df$trial), ] # sort df on SID (numeric) & trial

split(df$SID, df$trial) # breaks the vector SID into a list of vectors of SID for each trial

If you were worried about unique values you could do:

lapply(split(df$SID, df$trial), unique) # breaks SID into list of unique SIDs for each trial

That will give you a list of participant IDs for each trial, sorted by numeric value but maintaining their factor property.

If you really wanted a data frame, and the number of participants in each experiment were equal, you could use data.frame() on the list, as in: data.frame(split(df$SID, df$trial))
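The as.numeric(as.character(...)) conversion matters because calling as.numeric() directly on a factor returns the internal level codes rather than the IDs, which would explain the small integers in the question's edit. Below is a minimal sketch of building the requested table when the experiments have different numbers of participants. The data frames df1 and df2, the helper sorted_ids, and the column names Exp1 and Exp2 are hypothetical stand-ins, not taken from the question's real data.

```r
# Hypothetical stand-ins for two of the experiment data frames
df1 <- data.frame(SID = factor(c("5402", "5402", "25403", "987")))
df2 <- data.frame(SID = factor(c("22081", "22069", "22069")))

# as.numeric() on a factor returns the internal level codes, not the labels
codes <- as.numeric(unique(df1$SID))

# Convert via character first to recover the actual IDs, then sort numerically
sorted_ids <- function(df) sort(as.numeric(as.character(unique(df$SID))))

ids <- list(Exp1 = sorted_ids(df1), Exp2 = sorted_ids(df2))

# Pad shorter columns with NA so the list can become a data frame
n <- max(lengths(ids))
result <- data.frame(lapply(ids, function(v) c(v, rep(NA, n - length(v)))))
```

Padding with NA is only one option; keeping the result as a list avoids the padding entirely if a rectangular table is not required.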

Upvotes: 2

tushaR

Reputation: 3116

Suppose x and y represent the Exp1 SID and Exp2 SID. You can create an ordered list of unique values as shown below:

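x <- factor(x = c(2, 5, 4, 3, 6, 1, 4, 5, 6, 3, 2, 3))
y <- factor(x = c(2, 3, 4, 2, 4, 1, 4, 5, 5, 3, 2, 3))
# sort() on a factor orders by its levels; these levels came from numeric input, so they are in numeric order
list(exp1 = sort(x = unique(x), decreasing = FALSE), exp2 = sort(x = unique(y), decreasing = FALSE))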
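One caveat: sort() on a factor follows the order of its levels. If the factor was built from character strings (as often happens when IDs are read from a file), the levels are alphabetical, so "25403" sorts before "5402". A minimal sketch of one way around this, assuming you want to keep the factor type, is to rebuild the levels in numeric order first. The vector x below holds made-up IDs for illustration.

```r
# Hypothetical IDs stored as a factor built from character strings
x <- factor(c("5402", "25403", "987", "5402"))

# sort() on a factor follows level order, which here is alphabetical
by_level <- as.character(sort(unique(x)))

# Rebuild the factor with its levels in numeric order, then sort again
lv <- levels(x)
x_num <- factor(x, levels = lv[order(as.numeric(lv))])
by_value <- as.character(sort(unique(x_num)))
```

If the factor type is not needed, converting with as.numeric(as.character(x)) and sorting the result is simpler.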

Upvotes: 0
