Maël

Reputation: 52209

Applying a long list of labels in a factor in R

Usually, when I want to label a factor, I use the apply_labels function from expss and add all the labels manually, like this:

apply_labels(df,
                var1=c("label1"=1,"label2"=2,"label3"=3),
                var2=c("label4"=1,...),
                ...)

But in my current case, I have an unlabelled factor, df$PAVEUN, that has 417 possible values. In another table (df2), I have all the unique values with their corresponding labels (df2$ENGLISH). Here is an overview of that data frame:

> head(df2)
  CODE                                            ENGLISH
1    1                                           Managers
2   11 Chief executives, senior officials and legislators
3  111                   Legislators and senior officials
4 1111                                        Legislators
5 1112                        Senior government officials
6 1113            Traditional chiefs and heads of village

How can I label df$PAVEUN with df2$ENGLISH without having to do it manually?

Upvotes: 1

Views: 476

Answers (2)

Gregory Demin

Reputation: 4846

For labelled variables the code below should do the trick:

apply_labels(df,
                PAVEUN=setNames(df2$CODE, df2$ENGLISH),
                ...)

Generally speaking, labelled variables and factors are different things. Code for factors will look like this:

df$PAVEUN_factor = factor(df$PAVEUN, levels = df2$CODE, labels = df2$ENGLISH)
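
To illustrate the factor approach, here is a minimal sketch on toy data. The three-row df2 and the values of df$PAVEUN are made up for the example (the codes and labels are taken from the head(df2) output in the question), and the column name PAVEUN_factor is only illustrative.

```r
# Toy lookup table with the same structure as df2 in the question
df2 <- data.frame(
  CODE    = c(1, 11, 111),
  ENGLISH = c("Managers",
              "Chief executives, senior officials and legislators",
              "Legislators and senior officials"),
  stringsAsFactors = FALSE
)

# Toy data: codes in arbitrary order, plus one code (999) that is not in df2
df <- data.frame(PAVEUN = c(111, 1, 11, 1, 999))

# levels = the codes, labels = the matching text; the two are paired by
# position, so row i of df2 supplies both the level and its label
df$PAVEUN_factor <- factor(df$PAVEUN, levels = df2$CODE, labels = df2$ENGLISH)

# Codes that do not appear in df2$CODE become NA instead of raising an error
print(df)
```

Because the code and its label come from the same row of df2, the row order of df2 does not matter here. It is probably worth checking for unexpected NA values afterwards, since they indicate codes missing from the lookup table.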

Upvotes: 1

Ben Bolker

Reputation: 226557

I think levels(df$PAVEUN) <- df2$ENGLISH will do what you want. However, it's up to you to make sure that the order of the levels corresponds correctly. If the values of df2$CODE match the values in df$PAVEUN, you might want to use merge() (from base R) or one of the *_join() functions from the tidyverse to be more careful.
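
One possible way to avoid depending on the order of the levels is to look each level up in df2$CODE with match() before assigning. This is only a sketch on made-up toy data: df2 is deliberately given in a different order from the factor levels, and idx is a helper name introduced for the example.

```r
# Toy lookup table, deliberately NOT in the same order as the factor levels
df2 <- data.frame(
  CODE    = c(111, 1, 11),
  ENGLISH = c("Legislators and senior officials",
              "Managers",
              "Chief executives, senior officials and legislators"),
  stringsAsFactors = FALSE
)

# PAVEUN as an unlabelled factor; its levels are "1", "11", "111"
df <- data.frame(PAVEUN = factor(c(111, 1, 11, 1)))

# For each existing level, find the row of df2 holding the same code
idx <- match(levels(df$PAVEUN), as.character(df2$CODE))

# Stop early if some level has no entry in the lookup table
stopifnot(!anyNA(idx))

# Replace the levels with their labels, aligned by code rather than by position
levels(df$PAVEUN) <- df2$ENGLISH[idx]

print(df)
```

A merge() or join on the code column would reach the same result by adding the label as a new column, which may be preferable if the original codes should be kept alongside the labels.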

Upvotes: 1
