Reputation: 23
I have a dataset that measures pre and post weight conditions. I was trying to combine rows with similar ID's to one row and create two additional columns to show the results for pre and post for each ID.
Original table
ID Sex Race TIME SH PRE_WEIGHT POST_WEIGHT 1 FEMALE White POST Poor 168 152 1 FEMALE White PRE Good 168 152 2 MALE Hispanic POST Very Poor 157 143 2 MALE Hispanic PRE Poor 157 143 3 FEMALE Black POST Good 129 118 3 FEMALE Black PRE Satisfactory 129 118
Results should look like this:
ID Sex Race TIME SH PRE_WEIGHT POST_WEIGHT TIME_PRE TIME_POST SH_PRE SH_POST 1 FEMALE White POST Poor 168 152 POST PRE Poor Good 2 MALE Hispanic POST Very Poor 157 143 POST PRE Very Poor Poor 3 FEMALE Black POST Good 129 118 POST PRE Good Satisfactor
I tried to do it with reshape which is very slow and didn't work. I also wasn't able to reshape the table with tidyr.
Hope anyone can help me with this.
Upvotes: 0
Views: 1632
Reputation: 102625
A simple base R option might be using reshape
, e.g.,
reshape(
df,
direction = "wide",
idvar = c("ID","Sex","Race"),
timevar = "TIME"
)
which gives
ID Sex Race SH.POST PRE_WEIGHT.POST POST_WEIGHT.POST SH.PRE
1 1 FEMALE White Poor 168 152 Good
3 2 MALE Hispanic Very Poor 157 143 Poor
5 3 FEMALE Black Good 129 118 Satisfactory
PRE_WEIGHT.PRE POST_WEIGHT.PRE
1 168 152
3 157 143
5 129 118
Data
> dput(df)
structure(list(ID = c(1L, 1L, 2L, 2L, 3L, 3L), Sex = c("FEMALE",
"FEMALE", "MALE", "MALE", "FEMALE", "FEMALE"), Race = c("White",
"White", "Hispanic", "Hispanic", "Black", "Black"), TIME = c("POST",
"PRE", "POST", "PRE", "POST", "PRE"), SH = c("Poor", "Good",
"Very Poor", "Poor", "Good", "Satisfactory"), PRE_WEIGHT = c(168L,
168L, 157L, 157L, 129L, 129L), POST_WEIGHT = c(152L, 152L, 143L,
143L, 118L, 118L)), class = "data.frame", row.names = c(NA, -6L
))
Upvotes: 1
Reputation: 887851
An option with data.table
library(data.table)
dcast(setDT(df1), ID + Sex + Race + PRE_WEIGHT +
POST_WEIGHT ~ TIME, value.var = 'SH')
-output
# ID Sex Race PRE_WEIGHT POST_WEIGHT POST PRE
#1: 1 FEMALE White 168 152 Poor Good
2#: 2 MALE Hispanic 157 143 Very Poor Poor
3#: 3 FEMALE Black 129 118 Good Satisfactory
Upvotes: 1
Reputation: 3923
As @Duck mentioned you can do this
library(dplyr)
library(tidyr)
df3 %>%
group_by(ID) %>%
pivot_wider(names_from = TIME,
values_from = SH,
names_prefix = "SH_")
#> # A tibble: 3 x 7
#> # Groups: ID [3]
#> ID Sex Race PRE_WEIGHT POST_WEIGHT SH_POST SH_PRE
#> <dbl> <chr> <chr> <dbl> <dbl> <chr> <chr>
#> 1 1 FEMALE White 168 152 Poor Good
#> 2 2 MALE Hispanic 157 143 Very Poor Poor
#> 3 3 FEMALE Black 129 118 Good Satisfactory
Your data
df3 <- readr::read_table("
ID Sex Race TIME SH PRE_WEIGHT POST_WEIGHT
1 FEMALE White POST Poor 168 152
1 FEMALE White PRE Good 168 152
2 MALE Hispanic POST Very Poor 157 143
2 MALE Hispanic PRE Poor 157 143
3 FEMALE Black POST Good 129 118
3 FEMALE Black PRE Satisfactory 129 118")
Upvotes: 1