Dom

Reputation: 1053

Ordering categorical variables in a dataframe

How do you change the order in which factors are displayed in a dataframe?

Example data using a sample of Australian state names:

location <- c("new_south_wales", "victoria", "queensland")

Say I want to have victoria appear last!

#this doesn't work
factor(location, levels = c("new_south_wales", "queensland", "victoria"))

#neither does this
ordered(location, levels = c("new_south_wales", "queensland", "victoria"))

I also tried forcats::fct_relevel, but while I can change the levels, it still has no impact on the order in which the factors are displayed.

Upvotes: 2

Views: 14089

Answers (1)

De Novo

Reputation: 7610

If you want the actual factor to be ordered alphanumerically, you can sort it that way.

location <- c("new_south_wales", "victoria", "queensland")
factor(sort(location))
# [1] new_south_wales queensland      victoria       
# Levels: new_south_wales queensland victoria

You can, of course, do this before or after you create it.

states <- factor(location)
states
# [1] new_south_wales victoria        queensland     
# Levels: new_south_wales queensland victoria

sort(states)
# [1] new_south_wales queensland      victoria       
# Levels: new_south_wales queensland victoria

ordered_states <- sort(states)
ordered_states
# [1] new_south_wales queensland      victoria       
# Levels: new_south_wales queensland victoria

You can also order them in some other order:

states <- factor(location[c(3, 2, 1)])
states
# [1] queensland      victoria        new_south_wales
# Levels: new_south_wales queensland victoria

# Or after the fact:
states <- factor(states[c(3, 1, 2)])
states
# [1] new_south_wales queensland      victoria       
# Levels: new_south_wales queensland victoria
# Notice that this reorders the reordered states, because that's how
# states was last assigned.

Levels are sorted alphanumerically by default, but this has no effect on the actual order of the values in the factor (as you demonstrated).
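To make that distinction concrete, here is a minimal sketch. It assumes a deliberately non-alphabetical level order (queensland first, victoria last) and a made-up data frame `df` with an `id` column, neither of which comes from the question. It suggests that setting levels only changes what sorting will do; the rows move only when you actually sort or index.

```r
location <- c("new_south_wales", "victoria", "queensland")

# Explicit levels: changes the level order, not the order of the values.
# (This particular level order is an assumption for illustration.)
states <- factor(location, levels = c("queensland", "new_south_wales", "victoria"))
levels(states)          # queensland, new_south_wales, victoria
as.character(states)    # new_south_wales, victoria, queensland (unchanged)

# Hypothetical data frame; `id` just records the original row position.
df <- data.frame(id = 1:3, location = states)

# order() on a factor follows the level order, so this is what moves the rows.
df_sorted <- df[order(df$location), ]
as.character(df_sorted$location)   # queensland, new_south_wales, victoria
df_sorted$id                       # 3, 1, 2
```

If the display order in the data frame is what matters, the reordering step (`order()` or `sort()`) is the part doing the work; the levels only define which order that step produces.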

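As you also demonstrated, an ordered factor is not necessarily displayed in order. Being ordered just means that the values are ordinal: the levels have a defined ranking that comparisons can use, but the values themselves stay in whatever order you supplied them.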
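A short sketch of what "ordinal" buys you, reusing the levels from the question (the variable name `ranked` is just for illustration). The values keep their original positions, but comparisons and `max()` respect the ranking of the levels.

```r
location <- c("new_south_wales", "victoria", "queensland")

# Ordered factor with the level ranking from the question
ranked <- ordered(location, levels = c("new_south_wales", "queensland", "victoria"))

as.character(ranked)        # new_south_wales, victoria, queensland (unchanged)

# Comparisons use the level ranking, not alphabetical order of the strings
ranked[2] > ranked[3]       # victoria > queensland under these levels: TRUE
as.character(max(ranked))   # victoria

# Sorting is still a separate, explicit step
as.character(sort(ranked))  # new_south_wales, queensland, victoria
```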

Upvotes: 3
