Reputation: 2030
I have a large dataframe in R with this format:
"SubjID" "HR" "IBI" "Stimulus" "Status"
"S1" 75.98 790 1 1
"S1" 75.95 791 1 2
"S1" 65.7 918 1 3
"S1" 59.63 100 1 4
"S1" 59.44 101 1 5
"S1" 59.62 101 2 1
"S1" 63.85 943 2 2
"S1" 60.75 992 2 3
"S1" 59.62 101 2 4
"S1" 61.68 974 2 5
"S2" 65.21 921 1 1
"S2" 59.23 101 1 2
"S2" 61.23 979 1 3
"S2" 70.8 849 1 4
"S2" 74.21 809 1 4
I would like to plot the mean of the "HR" column for each value of the "Status" column.
I wrote the following R code where I create a subset of the data (by different values of "Status") and plot it:
numberOfSeconds <- 8
for (stimNumber in 1:40) {
  # keep only the rows for this stimulus, up to the chosen Status value
  stimulus2plot <- subset(resampledDataFile, Stimulus == stimNumber & Status <= numberOfSeconds, select = c(SubjID, HR, IBI, Stimulus, Status))
  plot(stimulus2plot$HR ~ stimulus2plot$Status, xlab = "", ylab = "")
  # lines() adds to the existing plot, so it takes no axis labels
  lines(stimulus2plot$HR ~ stimulus2plot$Status)
}
Thus obtaining a plot similar to the following:
I get one plot per "Stimulus". On the X axis of each plot is the "Status" column; on the Y axis there is one "HR" value for each "SubjID". Almost there...
However, what I ultimately want is a single Y data point for each X value, i.e. Y should be the mean of the HR column, similar to the following plot:
How can this be achieved? It would also be great to show the standard deviation as error bars at each data point.
Thanks in advance for your help.
Upvotes: 0
Views: 3627
Reputation: 762
You can do this completely within ggplot2 as follows, using the following fake data example as a guide:
library(ggplot2)
DF <- data.frame(stimulus = factor(rep(paste("Stimulus", seq(4)), each = 40)),
                 subject = factor(rep(seq(20), each = 8)),
                 time = rep(seq(8), 20),
                 resp = rnorm(160, 50, 10))
# spaghetti plots
ggplot(DF, aes(x = time, y = resp, group = subject)) +
  geom_line() +
  facet_wrap(~ stimulus, ncol = 1)
# plot of time averages by stimulus
# (ggplot2 >= 3.3.0 uses `fun`; older versions call this argument `fun.y`)
ggplot(DF, aes(x = time, y = resp)) +
  stat_summary(fun = mean, geom = "line", group = 1) +
  stat_summary(fun = mean, geom = "point", group = 1, shape = 1) +
  facet_wrap(~ stimulus, ncol = 1)
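The question also asks for standard deviation error bars. A minimal sketch of one way to add them inside the same stat_summary() approach, assuming ggplot2 3.3.0 or later (where the `fun.min` and `fun.max` arguments are available). The data is the same kind of fake data as above, and the set.seed() call is only there to make it reproducible:

```r
library(ggplot2)

set.seed(1)  # reproducible fake data, same layout as DF above
DF <- data.frame(stimulus = factor(rep(paste("Stimulus", seq(4)), each = 40)),
                 subject  = factor(rep(seq(20), each = 8)),
                 time     = rep(seq(8), 20),
                 resp     = rnorm(160, 50, 10))

p <- ggplot(DF, aes(x = time, y = resp)) +
  # bars span mean - sd to mean + sd at each time point
  stat_summary(fun = mean,
               fun.min = function(x) mean(x) - sd(x),
               fun.max = function(x) mean(x) + sd(x),
               geom = "errorbar", width = 0.2) +
  stat_summary(fun = mean, geom = "line") +
  stat_summary(fun = mean, geom = "point") +
  facet_wrap(~ stimulus, ncol = 1)
print(p)
```

Replacing sd(x) with sd(x) / sqrt(length(x)) in both helper functions would give standard error bars instead.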
Upvotes: 0
Reputation: 16036
To get it closest to what you want:
library(ggplot2)
library(plyr)
df.summary <- ddply(df, .(Stimulus, Status), summarise,
                    HR.mean = mean(HR),
                    HR.sd = sd(HR))
ggplot(df.summary, aes(Status, HR.mean)) +
  geom_path() +
  geom_point() +
  geom_errorbar(aes(ymin = HR.mean - HR.sd, ymax = HR.mean + HR.sd), width = 0.25) +
  facet_wrap(~ Stimulus)
Upvotes: 2
Reputation: 60964
The easiest approach is to precompute the values first and then plot them. I would use ddply for this kind of analysis:
library(plyr)
res = ddply(df, .(Status), summarise, mn = mean(HR))
and plot it using ggplot2:
ggplot(res, aes(x = Status, y = mn)) + geom_line() + geom_point()
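Grouping by Status alone averages over every stimulus at once. If one curve per stimulus is wanted, as in the question, the same precompute step can group by both columns. A minimal sketch in base R, using aggregate() in place of ddply; the `df` below is a small made-up stand-in that only borrows the question's column names, not the real data:

```r
# Made-up stand-in for the question's data frame (same column names).
df <- data.frame(
  SubjID   = rep(c("S1", "S2"), each = 4),
  Stimulus = rep(rep(1:2, each = 2), times = 2),
  Status   = rep(1:2, times = 4),
  HR       = c(76, 74, 60, 64, 66, 60, 62, 70)
)

# Mean HR for every Stimulus/Status combination.
res <- aggregate(HR ~ Stimulus + Status, data = df, FUN = mean)
res <- res[order(res$Stimulus, res$Status), ]
print(res)
```

The resulting `res` has one row per Stimulus/Status pair, so it can be plotted one stimulus at a time or faceted by Stimulus.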
Upvotes: 2
Reputation: 3965
The simplest way to do it would be tapply(). If your data.frame is data:
means <- with(data, tapply(HR, Status, mean))
plot(means, type="l")
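It is easy to calculate and plot error bars as well (here the standard error of the mean, sd/sqrt(n)):
serr <- with(data, tapply(HR, Status, function(x) sd(x) / sqrt(length(x))))
plot(means, type = "o", ylim = c(50, 80))
# one vertical segment per point; invisible() hides the list of NULLs that sapply() returns
invisible(sapply(seq_along(serr), function(i) lines(rep(i, 2), c(means[i] + serr[i], means[i] - serr[i]))))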
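The question asks for the standard deviation rather than the standard error, so here is a hedged variant of the same base R idea that plots mean plus or minus one SD. The `data` frame is a made-up stand-in with only the two columns needed, and arrows() with a 90 degree head is just one common way to draw capped bars:

```r
# Made-up stand-in for the question's data frame.
data <- data.frame(
  Status = rep(1:4, times = 3),
  HR     = c(76, 70, 64, 60, 72, 66, 62, 58, 74, 68, 66, 62)
)

means <- with(data, tapply(HR, Status, mean))
sds   <- with(data, tapply(HR, Status, sd))
x     <- as.numeric(names(means))  # actual Status values, not just 1..n

# Size the y axis so that every bar fits completely.
plot(x, means, type = "o", xlab = "Status", ylab = "Mean HR",
     ylim = range(means - sds, means + sds))
# angle = 90 with code = 3 draws a flat cap at both ends of each bar.
arrows(x, means - sds, x, means + sds, angle = 90, code = 3, length = 0.05)
```

Computing ylim from the data avoids bars being clipped when HR falls outside a hard-coded range.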
Upvotes: 2