Error404

Reputation: 7131

Custom sorting of a dataframe in R

I have a binomial dataset that looks like this:

df <- data.frame(replicate(4,sample(1:200,1000,rep=TRUE)))
addme <- data.frame(replicate(1,sample(0:1,1000,rep=TRUE)))
df <- cbind(df,addme)
df <- df[order(df$replicate.1..sample.0.1..1000..rep...TRUE..), ]

The data is currently sorted so that all rows in the 0 group come first, followed by all rows in the 1 group. Is there a way to sort it in an alternating 0-1-0-1-0... fashion instead? That is, a row from the 0 group, then a row from the 1 group, then a row from the 0 group again, and so on.

All I can think of are complex functions. I hope there's a simpler way.

Thank you,

Upvotes: 1

Views: 434

Answers (2)

thelatemail

Reputation: 93938

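Here's an attempt. It interleaves the row indices of the two groups and puts any surplus rows (here, the extra 1's) at the end. The warning in the output is harmless: it comes from rbind recycling the shorter index vector, and unique() then drops the recycled duplicates.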

First make some example data:

set.seed(2)
df <- data.frame(replicate(4,sample(1:200,10,rep=TRUE)),
                              addme=sample(0:1,10,rep=TRUE))

Then order:

with(df, df[unique(as.vector(rbind(which(addme==0),which(addme==1)))),])

#    X1  X2  X3  X4 addme
#2  141  48  78  33     0
#1   37 111 133   3     1
#3  115 153 168 163     0
#5  189  82  70 103     1
#4   34  37  31 174     0
#6  189 171  98 126     1
#8  167  46  72  57     0
#7   26 196  30 169     1
#9   94  89 193 134     1
#10 110  15  27  31     1
#Warning message:
#In rbind(which(addme == 0), which(addme == 1)) :
#  number of columns of result is not a multiple of vector length (arg 1)
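
If the recycling warning is a nuisance, the same interleaving can probably be had in base R by ranking each row within its own group and sorting on that rank. This is only a sketch of an alternative, not the method used above; the toy data frame and its id column are made up for illustration, and only addme matches the column name used in this answer.

```r
# Hypothetical toy data: `id` records the original row order,
# `addme` is the 0/1 group (same column name as in the answer above).
df <- data.frame(id = 1:7, addme = c(1, 1, 0, 1, 0, 1, 1))

# Position of each row within its own group:
# 1st zero, 2nd zero, ..., 1st one, 2nd one, ...
pos_in_group <- ave(seq_along(df$addme), df$addme, FUN = seq_along)

# Sort by within-group position first, then by group, so rows alternate
# 0-1-0-1 and the surplus rows of the larger group end up last.
interleaved <- df[order(pos_in_group, df$addme), ]

print(interleaved$addme)
# [1] 0 1 0 1 1 1 1
```

Because nothing is recycled, this version raises no warning, and it should behave the same way whichever group is the larger one.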

Upvotes: 3

Gregor Thomas

Reputation: 146239

Here's another way using dplyr, which makes it suitable for within-group ordering. It's also probably pretty quick. If there are unequal numbers of 0's and 1's, the surplus rows of the larger group are left at the end.

library(dplyr)
df %>% 
    arrange(addme) %>%
    mutate(n0 = sum(addme == 0),
           orderme = seq_along(addme) - (n0 * addme) + (0.5 * addme)) %>%
    arrange(orderme) %>%
    select(-n0, -orderme)
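
For the within-group case, one possible adaptation is to add group_by() and sort with .by_group = TRUE. This is a sketch under assumptions: the grouping column grp, the id column, and the toy data are hypothetical, and row_number() is swapped in for seq_along(addme) (both give the row position within the current group).

```r
library(dplyr)

# Hypothetical toy data: `grp` is an assumed grouping column,
# `addme` is the 0/1 indicator used in the question.
df <- data.frame(
  id    = 1:10,
  grp   = rep(c("a", "b"), each = 5),
  addme = c(0, 0, 1, 1, 1,  1, 0, 0, 0, 1)
)

result <- df %>%
  group_by(grp) %>%
  arrange(addme, .by_group = TRUE) %>%      # zeros first within each group
  mutate(n0 = sum(addme == 0),              # number of zeros in this group
         # zeros keep positions 1, 2, ...; ones are shifted to 1.5, 2.5, ...
         orderme = row_number() - (n0 * addme) + (0.5 * addme)) %>%
  arrange(orderme, .by_group = TRUE) %>%    # interleave within each group
  ungroup() %>%
  select(-n0, -orderme)

print(result$addme)
# [1] 0 1 0 1 1 0 1 0 1 0
```

Group a has one extra 1 and group b one extra 0, so each group alternates as far as it can and then ends with its own surplus row.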

Upvotes: 3
