Luis

Reputation: 1584

Reorganize all factor levels using R and Tidyverse (to prevent my final ds from being mixed)

Say I have a Likert scale whose values can be coded as 0, 1, 2, 3, 4 or as 0, 2, 4, 6, 8. Both codings represent the same graded scale, but my final ds mixes them: some columns use one coding and some use the other.

I would like to "reorganize" all levels to be 0, 1, 2, 3, 4. However, I am looking for an easy way to do that without having to use mutate with case_when and spell out 0 = 0, 2 = 1, 4 = 2, 6 = 3, and 8 = 4.

I would like to stay within the tidyverse. Thank you.

Dataset is below:

ds <- structure(list(young_1 = c(2, 0, 1, 2, 0, 1, 0, 0, 0, 0, 0, 0, 
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 3, 0, 0, 3, 
0, 0, 1, 0, 1, 0, 1, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
1, 0, 0, 1, 0, 0, 3, 0, 0, 2, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 
0, 0, 1, 0, 3, 2, 2, 0, 1, 0, 2, 0, NA, 0, 2, 3, 2, 1, 1, 0, 
0, 0, 2, 0, 0, NA, 0, 0, 1, 2, 0, 0, 1, 0, 0, 1, 0, 4, 0, 0, 
4, 1, 3, 2, 0, 2, 0, 0, 0, 0, 0, 0, 2, 0, 2, 0, 0, 0, 0, 3, 0, 
0, 0, NA, 0, 1, 0, 0, 0, 1, 0, 0, 2, 3, 0, 1, 0, 0, 0, 2, 0, 
0, 0, 0, 0, 1, 4, 0, 0, 0, 1), young_2 = c(2, 0, 0, 1, 1, 3, 
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 3, 0, 0, 0, 3, 
0, 3, 3, 0, 0, 3, 0, 0, 0, 0, 0, 0, 2, 0, 0, 3, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 2, 3, 2, 0, 1, 0, 3, 0, NA, 0, 
2, 2, 2, 0, 2, 0, 0, 0, 2, 0, 0, NA, 0, 0, 1, 1, 0, 0, 0, 0, 
3, 0, 0, 3, 0, 0, 4, 1, 2, 2, 0, 2, 0, 0, 0, 0, 0, 0, 2, 0, 1, 
0, 0, 0, 0, 2, 0, 0, 0, NA, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 
0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 1, 3, 0, 0, 4, 0), young_3 = c(0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 2, 0, 
0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 3, 3, 0, 0, 0, 0, 
2, 0, NA, 0, 0, 3, 1, 0, 0, 0, 0, 0, 0, 2, 0, NA, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 1, 2, 4, 0, 0, 0, 0, 1, 0, 0, 2, 0, 0, 0, 0, 0, 
0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, 0, 0, 0, 0, 1, 3, 0, 
0, 0, 0, 0, 0, 0, 2, 0, 2, 0, 2, 0, 0, 0, 0, 3, 2, 0, 2, 0), 
    young_4 = c(0, 0, 0, 2, 0, 4, 0, 0, 0, 0, 2, 0, 2, 0, 0, 
    0, 0, 0, 0, 0, 2, 2, 1, 0, 0, 0, 0, 0, 0, 3, 0, 0, 3, 0, 
    0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 1, 2, 2, 0, 0, 0, 
    1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 3, 0, 
    0, 0, 0, 2, 0, 0, 0, 3, 3, 2, 2, 0, 0, 3, 3, NA, 0, 0, 0, 
    3, 0, 0, 0, 0, 0, 2, 0, 0, NA, 0, 0, 0, 0, 2, 2, 1, 0, 2, 
    0, 2, 0, 0, 0, 4, 2, 3, 3, 0, 2, 1, 0, 0, 0, 0, 0, 0, 0, 
    2, 1, 0, 0, 2, 0, 2, 0, 0, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    3, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 2, 2, 0, 0, 2, 0), young_5 = c(0, 
    2, 0, 0, 0, 2, 0, 0, 2, 2, 0, 0, 6, 0, 0, 2, 0, 0, 2, 2, 
    0, 2, 4, 0, 2, 0, 4, 0, 4, 4, 0, 0, 6, 0, 0, 0, 0, 0, 0, 
    0, 2, 2, 4, 6, 2, 4, 2, 2, 0, 4, 0, 0, 0, 0, 4, 0, 0, 0, 
    0, 0, 8, 0, 0, 2, 0, 0, 0, 0, 0, 4, 4, 0, 0, 0, 0, 4, 0, 
    0, 2, 0, 2, 0, 2, 2, 0, 4, 0, NA, 0, 2, 0, 0, 0, 2, 0, 0, 
    0, 0, 4, 6, NA, 0, 0, 0, 0, 0, 6, 2, 0, 4, 2, 6, 0, 0, 0, 
    6, 0, 4, 4, 2, 4, 0, 0, 2, 4, 0, 2, 0, 2, 0, 2, 0, 0, 2, 
    4, 0, 0, 0, NA, 0, 0, 4, 2, 0, 0, 0, 0, 0, 4, 0, 2, 0, 2, 
    0, 2, 0, 0, 0, 2, 0, 4, 6, 2, 0, 4, 0), young_6 = c(4, 0, 
    0, 0, 0, 2, 0, 0, 4, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 
    0, 2, 0, 0, 0, 4, 0, 4, 6, 0, 0, 6, 0, 0, 0, 0, 0, 0, 4, 
    0, 0, 4, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 4, 0, 
    2, 6, 0, 2, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
    0, 4, 2, 4, 2, 4, 0, 4, 0, NA, 0, 2, 6, 4, 2, 0, 0, 0, 0, 
    2, 4, 0, NA, 0, 0, 0, 2, 0, 0, 0, 0, 4, 0, 0, 6, 0, 0, 6, 
    0, 4, 4, 0, 4, 0, 0, 0, 0, 0, 2, 4, 0, 2, 0, 0, 0, 2, 4, 
    0, 0, 0, NA, 4, 2, 0, 2, 0, 0, 0, 0, 2, 4, 0, 0, 0, 0, 2, 
    2, 0, 2, 0, 0, 0, 2, 8, 0, 0, 4, 0)), row.names = c(NA, -166L
), class = c("tbl_df", "tbl", "data.frame"))

Upvotes: 1

Views: 49

Answers (2)

SteveM

Reputation: 2301

Base R

  ds[, 5:6] <- ds[, 5:6] / 2

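Columns 5 and 6 are the ones coded 0, 2, 4, 6, 8, so dividing them by 2 puts them on the 0 to 4 coding. You could use any vector of column numbers in place of 5:6.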
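Since the question asks for a tidyverse solution, the same division can be sketched with mutate and across. This is only an illustration: `toy` and `toy_fixed` are made-up names, and the toy data stands in for the real ds.

```r
library(dplyr)

# Toy stand-in for ds: young_1 uses the 0-4 coding, young_5 the 0-8 coding
toy <- tibble(
  young_1 = c(0, 1, 2, 3, 4, NA),
  young_5 = c(0, 2, 4, 6, 8, NA)
)

# Halve only the column(s) known to use the 0, 2, 4, 6, 8 coding;
# NA stays NA and the other columns are left untouched
toy_fixed <- toy %>%
  mutate(across(c(young_5), ~ .x / 2))
```

On the dataset from the question, the selection would presumably be `young_5:young_6` instead of `c(young_5)`.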

Upvotes: 1

akrun

Reputation: 887183

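An easier option is to match each value against the sorted unique values of its column and subtract 1 (because R indexing starts from 1). This relies on every scale point occurring at least once in each column, which is the case for the data shown.

ds[] <- lapply(ds, function(x) match(x, sort(unique(x))) - 1)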

Or another option is to convert to factor and coerce to integer (the levels will be sorted) and then subtract 1.

ds[] <- lapply(ds, function(x) as.integer(factor(x)) - 1)
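To see what these two one-liners do, here is a small trace on a single made-up vector (the names `x`, `levels_seen`, `by_match` and `by_factor` are illustrative only):

```r
# A made-up column using the 0, 2, 4, 6, 8 coding, with one missing value
x <- c(0, 4, 8, 2, NA, 6, 0)

# sort() drops the NA, leaving the observed scale points in order
levels_seen <- sort(unique(x))

# Position of each value among the sorted levels, shifted to start at 0
by_match <- match(x, levels_seen) - 1

# factor() sorts its levels the same way; the integer codes are the positions
by_factor <- as.integer(factor(x)) - 1

print(by_match)   # 0 2 4 1 NA 3 0
print(by_factor)  # same result
```

Both approaches recode by rank, so missing values remain NA and the result does not depend on the step size of the original coding.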

Or using mutate with across

library(dplyr)
ds <- ds %>%
  mutate(across(everything(), ~ as.integer(factor(.)) - 1))
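One caveat with rank-based recoding: if a scale point never occurs in a column, the higher values shift down by one. The sketch below shows the effect and one possible workaround. The helper `rescale_if_doubled` is hypothetical, and it assumes that any value above 4 can only come from the 0, 2, 4, 6, 8 coding; a doubled column that never exceeds 4 would not be detected.

```r
library(dplyr)

# Hypothetical helper: halve a column only if it contains a value above 4,
# which (by assumption) can only happen under the 0, 2, 4, 6, 8 coding
rescale_if_doubled <- function(x) {
  if (any(x > 4, na.rm = TRUE)) x / 2 else x
}

toy <- tibble(
  a = c(0, 1, 2, 4, NA),  # 0-4 coding, but 3 is never observed
  b = c(0, 2, 4, 8, NA)   # 0-8 coding, but 6 is never observed
)

# Rank-based recoding collapses the gap: 4 in a and 8 in b both become 3
toy_ranked <- toy %>%
  mutate(across(everything(), ~ as.integer(factor(.x)) - 1))

# Dividing the doubled column keeps the gap: both columns end at 4
toy_fixed <- toy %>%
  mutate(across(everything(), rescale_if_doubled))
```

For the dataset in the question every column contains all five scale points, so the rank-based versions above give the intended result.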

Upvotes: 1
