Alex Ho
Alex Ho

Reputation: 438

Reshaping Data in a dataframe in R

Are there any way to re-shape data in the following format?

    Date       Student Test.1 Test.2 Test.3 
    2007/02/01   A      80      90     70  
    2007/02/01   B      90      60     90  
    2007/02/01   C      75      70     80  
    2007/02/01   D      50      80     70  

To the following format?

    Date       Student Result  Test 
    2007/02/01   A      80       1   
    2007/02/01   A      90       2   
    2007/02/01   A      70       3   
    2007/02/01   B      90       1   
    2007/02/01   B      60       2   
    2007/02/01   B      90       3   
    2007/02/01   C      75       1   
    2007/02/01   C      70       2   
    2007/02/01   C      80       3   
    2007/02/01   D      50       1   
    2007/02/01   D      80       2
    2007/02/01   D      70       3      

Upvotes: 1

Views: 98

Answers (3)

A5C1D2H2I1M1N2O1R2T1
A5C1D2H2I1M1N2O1R2T1

Reputation: 193687

To complete the roundup of common approaches, you can look at "dplyr" + "tidyr", which can be used together like this:

library(dplyr)
library(tidyr)

mydf %>% gather(Time, Score, starts_with("Test"))
#          Date Student   Time Score
# 1  2007/02/01       A Test.1    80
# 2  2007/02/01       B Test.1    90
# 3  2007/02/01       C Test.1    75
# 4  2007/02/01       D Test.1    50
# 5  2007/02/01       A Test.2    90
# 6  2007/02/01       B Test.2    60
# 7  2007/02/01       C Test.2    70
# 8  2007/02/01       D Test.2    80
# 9  2007/02/01       A Test.3    70
# 10 2007/02/01       B Test.3    90
# 11 2007/02/01       C Test.3    80
# 12 2007/02/01       D Test.3    70

To get the specific form you're looking for, you can go a couple of steps further with separate and select:

mydf %>% 
  gather(Time, Score, starts_with("Test")) %>% 
  separate(Time, c("Stub", "Test")) %>% 
  select(-Stub)
#          Date Student Test Score
# 1  2007/02/01       A    1    80
# 2  2007/02/01       B    1    90
# 3  2007/02/01       C    1    75
# 4  2007/02/01       D    1    50
# 5  2007/02/01       A    2    90
# 6  2007/02/01       B    2    60
# 7  2007/02/01       C    2    70
# 8  2007/02/01       D    2    80
# 9  2007/02/01       A    3    70
# 10 2007/02/01       B    3    90
# 11 2007/02/01       C    3    80
# 12 2007/02/01       D    3    70

Upvotes: 1

Barranka
Barranka

Reputation: 21067

The melt() function may be helpful:

library(reshape)
md <- melt(df, id=c('Date','Student')

The resulting 'melted' data frame will be something like this:

      Date Student variable value
2007/02/01       A    Test.1   80
2007/02/01       B    Test.1   90
2007/02/01       C    Test.1   75
2007/02/01       D    Test.1   50
2007/02/01       A    Test.1   90
...

You can then rename the columns and/or modify the values to fit your needs.

The melted data frame can then be used with the cast() function to create pivot-like data frames. Check the Quick-R tutorial: Reshaping data.

Upvotes: 1

Matthew Lundberg
Matthew Lundberg

Reputation: 42689

Something like this will do:

reshape(x, direction='long', varying=paste('Test', 1:3, sep='.'))
          Date Student time Test id
1.1 2007/02/01       A    1   80  1
2.1 2007/02/01       B    1   90  2
3.1 2007/02/01       C    1   75  3
4.1 2007/02/01       D    1   50  4
1.2 2007/02/01       A    2   90  1
2.2 2007/02/01       B    2   60  2
3.2 2007/02/01       C    2   70  3
4.2 2007/02/01       D    2   80  4
1.3 2007/02/01       A    3   70  1
2.3 2007/02/01       B    3   90  2
3.3 2007/02/01       C    3   80  3
4.3 2007/02/01       D    3   70  4

You can then rename the columns as necessary. Note that the time column here is what you label as Test in your desired output. This is how the columns in the wide format are differentiated in the long format.

Upvotes: 1

Related Questions