Reputation: 107
I have a dataset that I'm trying to reshape with the tidyverse.
From:
|Name |eval |test |type | score|
|:----|:------|:----|:---------|-----:|
|John |first |1 |pretest | 10|
|John |first |1 |posttest | 15|
|John |first |2 |pretest | 20|
|John |first |2 |posttest | 30|
|John |second |1 |pretest | 35|
|John |second |1 |posttest | 50|
|John |second |2 |pretest | 5|
|John |second |2 |posttest | 10|
|Jane |first |1 |pretest | 40|
|Jane |first |1 |posttest | 20|
|Jane |first |2 |pretest | 10|
|Jane |first |2 |posttest | 20|
To:
|Name |eval |new_name | pre_test| post_test|
|:----|:------|:-------------|--------:|---------:|
|John |first |John_first_1 | 10| 15|
|John |first |John_first_2 | 20| 30|
|John |second |John_second_1 | 35| 50|
|John |second |John_second_2 | 5| 10|
|Jane |first |Jane_first_1 | 40| 20|
|Jane |first |Jane_first_2 | 10| 20|
I tried group_by on Name, eval, and test, so that each group would essentially be pre_test vs. post_test for a given person.
I also tried unite on Name, eval, test, and type, but if I do a spread after that, each unique name ends up as several columns.
I also tried a unite on Name, eval, and test first, and then a spread using key = (new united name) and value = Value, but the output isn't what I wanted.
I know a loop function can be written to take every other value and put into a new column, but I'm trying to see if there's a tidyverse way to go about this.
Thanks!!
library(tidyverse)
Name <- c('John', 'John', 'John', 'John',
          'John', 'John', 'John', 'John',
          'Jane', 'Jane', 'Jane', 'Jane')
eval <- c('first', 'first', 'first', 'first',
          'second', 'second', 'second', 'second',
          'first', 'first', 'first', 'first')
test <- c('1', '1', '2', '2',
          '1', '1', '2', '2',
          '1', '1', '2', '2')
type <- c('pretest', 'posttest', 'pretest', 'posttest',
          'pretest', 'posttest', 'pretest', 'posttest',
          'pretest', 'posttest', 'pretest', 'posttest')
score <- c(10, 15, 20, 30, 35, 50, 5, 10, 40, 20, 10, 20)
df <- data.frame(Name, eval, test, type, score)
df %>%
  unite(temp, Name, eval, test) %>%
  spread(key = type, value = score)
Edit: here is the original table that akrun's code worked on, which has inconsistent spellings in the type column:
|Name |eval |test |type | score|
|:----|:------|:----|:---------|-----:|
|John |first |1 |pretest | 10|
|John |first |1 |posttest | 15|
|John |first |2 |pretest | 20|
|John |first |2 |postttest | 30|
|John |second |1 |pretest | 35|
|John |second |1 |posttest | 50|
|John |second |2 |pretest | 5|
|John |second |2 |postttest | 10|
|Jane |first |1 |pretest | 40|
|Jane |first |1 |posttest | 20|
|Jane |first |2 |pretest | 10|
|Jane |first |2 |postttest | 20|
Upvotes: 3
Views: 148
Reputation: 887251
We can collapse the repeated 't's in the 'type' column so that the values are consistent, then use unite with remove = FALSE to keep the original columns as well, and finally spread.
library(dplyr)
library(tidyr)
library(stringr)
df %>%
  mutate(type = str_replace(type, "t{2,}", "t")) %>%
  unite(new_name, Name, eval, test, remove = FALSE) %>%
  spread(type, score)
#       new_name Name   eval test postest pretest
#1  Jane_first_1 Jane  first    1      20      40
#2  Jane_first_2 Jane  first    2      20      10
#3  John_first_1 John  first    1      15      10
#4  John_first_2 John  first    2      30      20
#5 John_second_1 John second    1      50      35
#6 John_second_2 John second    2      10       5
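To see why the output column is named postest rather than posttest, here is a small sketch of what the pattern does. The labels vector is a made-up example for illustration, not part of the original data frame:

```r
library(stringr)

# Hypothetical labels, including the misspelling from the question's table
labels <- c("pretest", "posttest", "postttest")

# "t{2,}" matches a run of two or more consecutive t's; str_replace swaps
# the first such run in each string for a single "t"
cleaned <- str_replace(labels, "t{2,}", "t")
cleaned
# "pretest" "postest" "postest"
```

Both spellings collapse to the same value, so spread sees a single post-test key. The price is a slightly misspelled column name, which can be renamed afterwards if needed.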
In tidyr 1.0.0, pivot_wider was introduced as a more general replacement for spread (which is superseded: it still works, but new code should prefer pivot_wider). So, instead of the spread line at the end, use
... %>%
  pivot_wider(names_from = type, values_from = score)
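For completeness, a minimal end-to-end sketch with pivot_wider that also produces the pre_test and post_test column names from the question. It assumes the type labels are already spelled consistently, as in the question's setup code, so the str_replace step is left out. The name wide and the final select/rename step are my own additions:

```r
library(dplyr)
library(tidyr)

# Same data as in the question, built more compactly
df <- data.frame(
  Name  = rep(c("John", "Jane"), times = c(8, 4)),
  eval  = rep(c("first", "second", "first"), each = 4),
  test  = rep(c("1", "1", "2", "2"), times = 3),
  type  = rep(c("pretest", "posttest"), times = 6),
  score = c(10, 15, 20, 30, 35, 50, 5, 10, 40, 20, 10, 20)
)

wide <- df %>%
  # Build the combined key but keep the original columns
  unite(new_name, Name, eval, test, remove = FALSE) %>%
  # One column per distinct value of type, filled from score
  pivot_wider(names_from = type, values_from = score) %>%
  # Keep and rename only the columns shown in the desired output
  select(Name, eval, new_name, pre_test = pretest, post_test = posttest)

wide
```

The result has one row per Name/eval/test combination, six rows in total.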
Upvotes: 4
Reputation: 530
How about something like....
library(tidyverse)

data <- tibble(
  Name = c(rep("John", 8), rep("Jane", 4)),
  eval = c(rep("first", 4), rep("second", 4), rep("first", 4)),
  type = rep(c("pretest", "posttest"), 6),
  score = c(10, 15, 20, 30, 35, 50, 5, 10, 40, 20, 10, 20)
)
data %>%
  group_by(Name, eval, type) %>%
  # Number the rows within each Name/eval/type group to stand in for 'test'
  mutate(num = row_number(),
         new_name = str_c(Name, "_", eval, "_", num)) %>%
  ungroup() %>%
  dplyr::select(new_name, type, score) %>%
  spread(type, score)
Which yields:
# A tibble: 6 x 3
  new_name      posttest pretest
  <chr>            <dbl>   <dbl>
1 Jane_first_1        20      40
2 Jane_first_2        20      10
3 John_first_1        15      10
4 John_first_2        30      20
5 John_second_1       50      35
6 John_second_2       10       5
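One caveat: the index here comes from row position within each group, so it only matches the real test number if the rows are already in test order. A rough way to check that assumption against the question's data frame, which does have a test column (the name indexed is just for illustration):

```r
library(dplyr)

# The question's data, including the explicit test column
df <- data.frame(
  Name  = rep(c("John", "Jane"), times = c(8, 4)),
  eval  = rep(c("first", "second", "first"), each = 4),
  test  = rep(c("1", "1", "2", "2"), times = 3),
  type  = rep(c("pretest", "posttest"), times = 6),
  score = c(10, 15, 20, 30, 35, 50, 5, 10, 40, 20, 10, 20)
)

indexed <- df %>%
  group_by(Name, eval, type) %>%
  mutate(num = row_number()) %>%  # position within each Name/eval/type group
  ungroup()

# TRUE for this data only because the rows are already sorted by test
all(indexed$num == as.integer(indexed$test))
```

If a test column is available, using it directly is safer than relying on row order.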
Upvotes: 2