Reputation: 27
I have a table like this:
treatment individual phase dist_mean track
1 control 1 pre 13.33 569.99
2 control 1 post 10.95 624.65
3 control 2 pre 9.93 363.35
4 control 2 post 10.11 339.88
5 control 3 pre 12.00 676.42
6 control 3 post 12.80 939.15
In principle, two rows are always paired. I need to subtract dist_mean
of the post-phase from the pre-phase of a sample. The easy way would be to subtract row 2 from 1 and so on. But given the possibility that this order is disturbed at any point, the whole calculation would go wrong. That's why I would like to have the calculation under the conditions, that the treatment and the individual of both phases match. Info: the treatment changes. It's not always control
.
Upvotes: 2
Views: 57
Reputation: 101343
A data.table
option
setDT(df)[
order(treatment, individual, phase)
][
,
setNames(lapply(.SD, diff), paste0("diff_", names(.SD))),
by = .(treatment, individual),
.SDcols = c("dist_mean", "track")
]
gives
treatment individual diff_dist_mean diff_track
1: control 1 2.38 -54.66
2: control 2 -0.18 23.47
3: control 3 -0.80 -262.73
A base R option using reshape
transform(
reshape(
df,
direction = "wide",
idvar = c("treatment", "individual"),
timevar = "phase"
),
diff_dist_mean = dist_mean.pre - dist_mean.post,
diff_track = track.pre - track.post
)
gives
treatment individual dist_mean.pre track.pre dist_mean.post track.post
1 control 1 13.33 569.99 10.95 624.65
3 control 2 9.93 363.35 10.11 339.88
5 control 3 12.00 676.42 12.80 939.15
diff_dist_mean diff_track
1 2.38 -54.66
3 -0.18 23.47
5 -0.80 -262.73
Upvotes: 2
Reputation: 56159
Using data.table, reshape long-to-wide, then get difference in post/pre columns:
library(data.table)
setDT(df1)
dcast(df1, treatment + individual ~ phase, value.var = c("dist_mean", "track")
)[, .(treatment, individual,
diff_dist_mean = dist_mean_post - dist_mean_pre,
diff_track = track_post - track_pre)]
# treatment individual diff_dist_mean diff_track
# 1: control 1 -2.38 54.66
# 2: control 2 0.18 -23.47
# 3: control 3 0.80 262.73
Upvotes: 0
Reputation: 8880
df <- read.table(text = " treatment individual phase dist_mean track
1 control 1 pre 13.33 569.99
2 control 1 post 10.95 624.65
3 control 2 pre 9.93 363.35
4 control 2 post 10.11 339.88
5 control 3 pre 12.00 676.42
6 control 3 post 12.80 939.15", header = T)
library(tidyverse)
df %>%
pivot_wider(c(treatment, individual), names_from = phase, values_from = dist_mean) %>%
mutate(d = post - pre)
#> # A tibble: 3 x 5
#> treatment individual pre post d
#> <chr> <int> <dbl> <dbl> <dbl>
#> 1 control 1 13.3 11.0 -2.38
#> 2 control 2 9.93 10.1 0.180
#> 3 control 3 12 12.8 0.8
Created on 2021-03-09 by the reprex package (v1.0.0)
data.table
df <- read.table(text = " treatment individual phase dist_mean track
1 control 1 pre 13.33 569.99
2 control 1 post 10.95 624.65
3 control 2 pre 9.93 363.35
4 control 2 post 10.11 339.88
5 control 3 pre 12.00 676.42
6 control 3 post 12.80 939.15", header = T)
library(data.table)
setDT(df)
res <- dcast(data = df, formula = treatment + individual ~ phase, value.var = "dist_mean")[, d := post - pre]
head(res)
#> treatment individual post pre d
#> 1: control 1 10.95 13.33 -2.38
#> 2: control 2 10.11 9.93 0.18
#> 3: control 3 12.80 12.00 0.80
Created on 2021-03-09 by the reprex package (v1.0.0)
Upvotes: 0
Reputation: 76402
Use aggregate
:
aggregate(dist_mean ~ treatment + individual, df1, function(x) diff(rev(x)))
# treatment individual dist_mean
#1 control 1 2.38
#2 control 2 -0.18
#3 control 3 -0.80
Data
df1 <- read.table(text = "
treatment individual phase dist_mean track
1 control 1 pre 13.33 569.99
2 control 1 post 10.95 624.65
3 control 2 pre 9.93 363.35
4 control 2 post 10.11 339.88
5 control 3 pre 12.00 676.42
6 control 3 post 12.80 939.15
", header = TRUE)
Upvotes: 1