slackline

Reputation: 2417

Define Factor variable with unobserved levels

I'm working with a dataset that has the following structure...

grades <- c("7A", "8B", "6C", "6B+")

...however, a number of possible levels are currently unobserved in my dataset. I do not want the factors to be defined automatically (so I am using read.csv(..., stringsAsFactors=FALSE) when reading in my data). Instead, I would like to explicitly define the levels and their labels and convert the imported strings to an ordered factor, so that every grade is represented, with a count of zero where none is observed.

real.grades  <- ordered(x = character(), 
                        levels = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17),                       
                        labels = c("6A", "6A+", "6B", "6B+", "6C", "6C+", "7A", "7A+", "7B", "7B+", "7C", "7C+", "8A", "8A+", "8B", "8B+", "8C"))

...but I am struggling to work out how to do this.

Suggestions and pointers gratefully received, thanks in advance.

Upvotes: 2

Views: 538

Answers (1)

Tyler Rinker

Reputation: 109894

I think this is what you're after:

grades <- c("7A", "8B", "6C", "6B+")

real.grades  <- factor(grades, levels = c("6A", "6A+", "6B", "6B+", "6C", 
    "6C+", "7A", "7A+", "7B", "7B+", "7C", "7C+", "8A", "8A+", "8B", 
    "8B+", "8C"))   

Yielding:

> real.grades 
[1] 7A  8B  6C  6B+
Levels: 6A 6A+ 6B 6B+ 6C 6C+ 7A 7A+ 7B 7B+ 7C 7C+ 8A 8A+ 8B 8B+ 8C
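The question also asks for an ordered factor and for zero counts on unobserved grades. A minimal sketch of how that could look, assuming the level vector is stored in a helper variable (the names grade.levels, ordered.grades and grade.counts are illustrative, not from the original post):

```r
# All possible grades, listed from easiest to hardest
grade.levels <- c("6A", "6A+", "6B", "6B+", "6C", "6C+", "7A", "7A+", "7B",
                  "7B+", "7C", "7C+", "8A", "8A+", "8B", "8B+", "8C")

grades <- c("7A", "8B", "6C", "6B+")

# ordered = TRUE makes comparisons such as < and > meaningful
ordered.grades <- factor(grades, levels = grade.levels, ordered = TRUE)

# table() keeps every level, so unobserved grades appear with a count of 0
grade.counts <- table(ordered.grades)
print(grade.counts)

# Comparisons follow the level order: 7A is harder than 6C
ordered.grades[1] > ordered.grades[3]   # TRUE
```

Because the levels are given explicitly, the order does not depend on which grades happen to appear in the data.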

For the numeric representation (each grade's position in the level order), use:

as.numeric(real.grades)
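As a rough illustration of what those numbers mean, the sketch below rebuilds the factor and shows that the codes are positions in the level vector. It also shows that a string missing from the levels silently becomes NA, which may be worth checking after import. The variable grade.levels and the grade "9A" are hypothetical, used only for this example:

```r
grade.levels <- c("6A", "6A+", "6B", "6B+", "6C", "6C+", "7A", "7A+", "7B",
                  "7B+", "7C", "7C+", "8A", "8A+", "8B", "8B+", "8C")

real.grades <- factor(c("7A", "8B", "6C", "6B+"), levels = grade.levels)

# Each code is the index of the grade within grade.levels
codes <- as.numeric(real.grades)
codes                      # 7 15 5 4

# The codes map back to the original labels
grade.levels[codes]        # "7A" "8B" "6C" "6B+"

# A value that is not among the levels is converted to NA without a warning
with.typo <- factor(c("7A", "9A"), levels = grade.levels)
is.na(with.typo)           # FALSE TRUE
```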

Upvotes: 2
