Max
Max

Reputation: 121

Reshaping Data Wide To Long: New variables based on Column Names

I want to reshape a dataset from wide format to long format.

The dataset contains 300 something variables and each variable is named after the principle: ModelID_Emotion_ModelGender. Sample data below:

df <- structure(list(X71_Anger_Male = structure(c(3L, 1L, 2L), .Label = c("Anger", 
"Disgust", "Fear"), class = "factor"), X71_Disgus_Male = structure(c(2L, 
1L, 1L), .Label = c("Disgust", "Fear"), class = "factor")), class = "data.frame", row.names = c(NA, 
-3L))

which looks like

  X71_Anger_Male X71_Disgus_Male
1           Fear            Fear
2          Anger         Disgust
3        Disgust         Disgust

I want to transpose the data in a way that the information from the column names gets taken and put into new variables. For example, there should be a new variable ModelGender, a new variable modelID and a new variable emotion. So the dataset should look like this:

desired <- structure(list(Gender = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "Male", class = "factor"), 
    ModelNumber = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "X71", class = "factor"), 
    Emotion = structure(c(2L, 2L, 2L, 1L, 1L, 1L), .Label = c("Anger", 
    "Disgust"), class = "factor"), Response = structure(c(3L, 
    2L, 2L, 3L, 1L, 2L), .Label = c("Anger", "Disgust", "Fear"
    ), class = "factor")), class = "data.frame", row.names = c(NA, 
-6L))

which should look like

  Gender ModelNumber Emotion Response
1   Male         X71 Disgust     Fear
2   Male         X71 Disgust  Disgust
3   Male         X71 Disgust  Disgust
4   Male         X71   Anger     Fear
5   Male         X71   Anger    Anger
6   Male         X71   Anger  Disgust

When I use reshape or gather/spread or melt/cast, it does not give the desired results. Does anyone have an idea on how to do that?

Thank you for your time!

Upvotes: 0

Views: 597

Answers (2)

Ronak Shah
Ronak Shah

Reputation: 389047

In pivot_longer you can specify names_sep as "_" and split column names into 3 columns.

tidyr::pivot_longer(df, cols = everything(),
                        names_to = c('ModelNumber', 'Emotion', 'Gender'), 
                        values_to = 'Response',
                        names_sep = '_')

# A tibble: 6 x 4
#  ModelNumber Emotion Gender Response
#  <chr>       <chr>   <chr>  <fct>   
#1 X71         Anger   Male   Fear    
#2 X71         Disgus  Male   Fear    
#3 X71         Anger   Male   Anger   
#4 X71         Disgus  Male   Disgust 
#5 X71         Anger   Male   Disgust 
#6 X71         Disgus  Male   Disgust 

Upvotes: 1

Sotos
Sotos

Reputation: 51592

You can simply convert to long and split the column you want. A way via tidyverse methods can be,

library(dplyr)
library(tidyr)

df %>% 
 pivot_longer(everything()) %>% 
 separate(name, into = c('ModelNumber', 'Emotion', 'Gender'), sep = '_')

Upvotes: 1

Related Questions