MYaseen208

Reputation: 23938

Reordering levels of a factor in R data.frame

This is a simplified example of the problem I'm facing. My factor has the levels B-1, B-2, B-9, B-10, B-11, and I want to arrange them in that order. Here I could easily reorder the levels by hand, but my real data has a more complex structure, so I would like to do it programmatically. How can I arrange these factor levels in their logical order?

set.seed(12345)
f <- rep(c("B-1", "B-2", "B-9", "B-10", "B-11"), each=3)
Y <- runif(n=15, min=100, max=1000)
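df <- data.frame(f, Y, stringsAsFactors = TRUE)  # needed in R >= 4.0, where strings are no longer converted to factors by default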


levels(df$f)
[1] "B-1"  "B-10" "B-11" "B-2"  "B-9"

library(gtools)
mixedsort(df$f)

[1] B-1  B-1  B-1  B-10 B-10 B-10 B-11 B-11 B-11 B-2  B-2  B-2  B-9  B-9  B-9 

Levels: B-1 B-10 B-11 B-2 B-9

df2 <- df[mixedorder(df$f), ]


df3 <- within(df, 
         Position <- factor(f, 
                          levels=names(sort(table(f), 
                                            decreasing=TRUE))))

levels(df3$Position)
[1] "B-1"  "B-10" "B-11" "B-2"  "B-9" 

Edited

This question was closed immediately after I posted it, but I now have a solution. Thanks to @akrun for the help.

Upvotes: 2

Views: 2864

Answers (2)

alexwhitworth

Reputation: 4907

An alternative, though IMO worse, solution is to use the native stats::relevel function. However, it only lets you set a new reference level (see the last line of the source code of stats:::relevel.factor), so you need to call it repeatedly, once per level.

rev_levels <- gtools::mixedsort(levels(df$f))

# Each call moves one level to the front, so the level releveled last ends up first
for (i in seq_along(rev_levels)) {
  df$f <- relevel(df$f, ref = rev_levels[i])
}

levels(df$f)
[1] "B-1"  "B-2"  "B-9"  "B-10" "B-11"

I am mainly posting this solution to show what is, in my mind, a flaw in a base-R function. At a minimum, the function is poorly named: it doesn't truly relevel, it simply re-reference-levels.
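To illustrate the point about relevel, here is a small sketch on a toy factor (the variable names x and x2 are made up for this example). It shows that a single call only moves the chosen level to the front and leaves the others in their existing order:

```r
# Toy factor, unrelated to the question's data
x <- factor(c("a", "b", "c", "d"))
levels(x)
# [1] "a" "b" "c" "d"

# relevel() moves only the reference level to the first position
x2 <- relevel(x, ref = "c")
levels(x2)
# [1] "c" "a" "b" "d"
```

This is why a full reordering needs one call per level, with the level that should come first releveled last.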

Upvotes: 1

akrun

Reputation: 887831

We can specify the levels as the mixedsorted levels of the 'f' column.

 df$f <- factor(df$f, levels=mixedsort(levels(df$f), decreasing=TRUE))
 levels(df$f)
 #[1] "B-1"  "B-2"  "B-9"  "B-10" "B-11"

Or as suggested by @Ben Bolker, a variation would be

 df <- transform(df,f=factor(f,levels=mixedsort(levels(f), 
          decreasing=TRUE)))

I guess the "-" is interpreted as a minus sign, as @Gregor suggested in the comments, which would explain why decreasing=TRUE is needed here to get B-1 first.
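If relying on how mixedsort treats the hyphen feels fragile, one option is to parse the numeric suffix explicitly in base R. This is only a minimal sketch, assuming every level ends in a hyphen followed by an integer, as in the example data. The helper name order_by_suffix is hypothetical and not part of any package:

```r
# Hypothetical helper: reorder factor levels by the integer after the last "-".
# Assumes every level has the form "<prefix>-<integer>".
order_by_suffix <- function(x) {
  x <- as.factor(x)
  lev <- levels(x)
  num <- as.integer(sub("^.*-", "", lev))  # e.g. "B-10" -> 10
  factor(x, levels = lev[order(num)])
}

f <- rep(c("B-1", "B-2", "B-9", "B-10", "B-11"), each = 3)
f_sorted <- order_by_suffix(f)
levels(f_sorted)
# [1] "B-1"  "B-2"  "B-9"  "B-10" "B-11"
```

Levels with different prefixes or non-integer suffixes would need a different key, so for more complex labels mixedsort is probably still the more convenient choice.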

Upvotes: 7
