Reputation: 43
I would like to subtract a vector of means from the original values. I cannot figure out how to match the means to the values by their corresponding conditions. So far I have tried arranging the values in the matching order, but even that fails.
library(reshape)
library(plyr)
library(dplyr)
The dataframe:
n <- as.factor(rep(c(1:16), times=2))
s <- as.factor(rep(c("ja","nein"), each=8, times=2))
b <- as.factor(rep(c("red", "green","blue", "pink"),times=8))
zahl <- runif(32)
df <- data.frame(n, s, b, zahl)
The means as a column:
df.mean <- melt(data.frame(cast(df, b ~ s, mean)), id.vars = 1, measure.vars = 2:3)
My attempt, which gives the wrong result:
df.final <- df %>%
  mutate(r = 1:32,
         trial = rep(1:2, each = 16)) %>%
  # arrange(r, n, trial, s, b) %>% # this doesn't arrange "ja"/"nein" in the same order as the means
  mutate(mean.bs = rep(df.mean[, 3], times = 4),
         diff = zahl - mean.bs)
The result should look like this:
    n    s     b zahl trial mean.bs    diff
1   1   ja   red 0.49     1  0.8025 -0.3125
2   2   ja green 0.59     1  0.6200 -0.0300
3   3   ja  blue 0.97     1  0.3175  0.6525
4   4   ja  pink 0.04     1  0.5225 -0.4825
5   9 nein   red  0.x     1  0.4775     0.x
6  10 nein green  0.x     1  0.3975     0.x
7  11 nein  blue  0.x     1  0.5625     0.x
8  12 nein  pink  0.x     1  0.3925     0.x
9   5   ja   red  0.x     1  0.8025    -0.x # here the means repeat
10  6   ja green  0.x     1  0.6200    -0.x
...
Is there perhaps a more precise way to do it (with condition ...)?
Thank you!
Upvotes: 1
Views: 217
Reputation: 10411
OK, I'm not 100% sure this is what you want to achieve (setting a seed before using randomized data is a good idea), but try this, picking up after your df.mean <- ... line:
colnames(df.mean) <- c("b","s","mean.bs")
df$trial <- rep(1:2, each=16)
df2 <- merge(df, df.mean, by=c("b", "s"))
df2$diff <- df2$zahl - df2$mean.bs
df2 <- df2[order(df2$trial, df2$n),]
rownames(df2) <- NULL
head(df2)
      b  s n       zahl trial   mean.bs       diff
1   red ja 1 0.87370077     1 0.6972817  0.1764190
2 green ja 2 0.01389495     1 0.4272126 -0.4133177
3  blue ja 3 0.96772185     1 0.5276125  0.4401094
4  pink ja 4 0.80911187     1 0.3625441  0.4465678
5   red ja 5 0.47676424     1 0.6972817 -0.2205175
6 green ja 6 0.07390932     1 0.4272126 -0.3533033
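If the row reordering that merge introduces is unwanted, the group means can also be attached in place with base R's ave(). This is only a sketch of an alternative, not part of the code above: the seed value 42 is an arbitrary assumption, and the data frame from the question is rebuilt so the block runs on its own.

```r
# Rebuild the question's data with a fixed seed (42 is an arbitrary choice)
set.seed(42)
df <- data.frame(
  n    = factor(rep(1:16, times = 2)),
  s    = factor(rep(c("ja", "nein"), each = 8, times = 2)),
  b    = factor(rep(c("red", "green", "blue", "pink"), times = 8)),
  zahl = runif(32)
)
df$trial <- rep(1:2, each = 16)

# ave() returns the mean of zahl within each b/s combination,
# aligned with the original rows (its default FUN is mean)
df$mean.bs <- ave(df$zahl, df$b, df$s)
df$diff    <- df$zahl - df$mean.bs

head(df)
```

Because ave() keeps the original row order, no sorting step or rownames reset is needed afterwards.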
Upvotes: 1
Reputation: 887168
We can compute the difference within the mutate call itself:
library(dplyr)
df %>%
  group_by(b, s) %>%
  mutate(mean.bs = mean(zahl), diff = zahl - mean.bs)
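To make this reproducible and easy to verify, here is a minimal self-contained sketch. The seed (1) is arbitrary, and the trial column, the ungroup() call, and the check table are additions for illustration only; the .groups argument assumes dplyr 1.0.0 or later. Within each b/s group, the differences from the group mean should sum to (numerically) zero.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only for reproducibility
df <- data.frame(
  n    = factor(rep(1:16, times = 2)),
  s    = factor(rep(c("ja", "nein"), each = 8, times = 2)),
  b    = factor(rep(c("red", "green", "blue", "pink"), times = 8)),
  zahl = runif(32)
)

df.final <- df %>%
  mutate(trial = rep(1:2, each = 16)) %>%
  group_by(b, s) %>%                                      # one group per b/s combination
  mutate(mean.bs = mean(zahl), diff = zahl - mean.bs) %>%
  ungroup()                                               # drop grouping for later steps

# Sanity check: one row per b/s group, sum of diffs close to zero
check <- df.final %>%
  group_by(b, s) %>%
  summarise(sum.diff = sum(diff), .groups = "drop")
```

Since mutate() on a grouped data frame keeps the rows in their original order, no arrange() or merge step is needed to line up the means with the values.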
Upvotes: 1