Reputation: 409
Hello R and ggplot pros,
I am trying to create a graphic with R and ggplot2 that contains a dot plot, a box plot and a line plot. I attach an example of how the final result should look:
I created a data frame (data) with data of four time points (in long format) and an additional ID variable:
ID time value
1 T1 11
1 T1 22
1 T2 11
2 T1 22
2 T2 22
2 T2 44
...
I am able to create both the box plot and the dot plot and to distinguish the groups by colour. My ggplot2 script currently looks like this:
ggplot(data=data, aes(x=factor(time), y=value, fill=ID)) +
  geom_boxplot(fill="white") +
  geom_dotplot(binaxis="y", stackdir="center")
How can I connect the dots in T1 with the corresponding dots in T2, T3 and T4?
Thanks in advance for any advice!
All the best, Chris
Upvotes: 1
Views: 3370
Reputation:
You can try the following:
# one random horizontal offset per row, shared by the point and line layers
b <- runif(nrow(df), -0.1, 0.1)
# condition must be a factor so that as.numeric() gives the positions 1, 2
ggplot(df) +
  geom_boxplot(aes(x = as.numeric(condition), y = pain, group = condition)) +
  geom_point(aes(x = as.numeric(condition) + b, y = pain)) +
  geom_line(aes(x = as.numeric(condition) + b, y = pain, group = ID)) +
  scale_x_continuous(breaks = c(1,2), labels = c("No Treatment", "Treatment")) +
  xlab("condition")
Or without jitter:
ggplot(df, aes(condition, pain)) +
  geom_boxplot(width=0.3, size=1.5, fatten=1.5, colour="grey70") +
  geom_point(colour="red", size=2, alpha=0.5) +
  geom_line(aes(group=ID), colour="red", linetype="11") +
  theme_classic()
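Applied to the long-format data from the question, the same idea might look like the sketch below. The column names ID, time and value come from the question; the data frame itself is simulated here, and having exactly one value per ID and time point is an assumption.

```r
library(ggplot2)

set.seed(1)
# Simulated stand-in for the question's data (assumption):
# 10 subjects, 4 time points, one value per ID and time point.
data <- expand.grid(ID = factor(1:10), time = factor(paste0("T", 1:4)))
data$value <- rnorm(nrow(data), mean = 20, sd = 5)

p <- ggplot(data, aes(x = time, y = value)) +
  geom_boxplot(fill = "white", outlier.shape = NA) +
  # group = ID draws one line per subject across T1..T4
  geom_line(aes(group = ID, colour = ID), alpha = 0.6) +
  geom_point(aes(colour = ID))
print(p)
```

The key part is group = ID in geom_line(): without it, the line layer has no way to tell which dots belong to the same subject.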
Upvotes: 0
Reputation: 294
Let me create some sample data based on my understanding of the question:
n <- 7
dfA <- data.frame(group='A',tie=LETTERS[1:n], val=rnorm(n)*27.1)
dfB <- dfA
dfB$group <- 'B'
dfB$val <- dfA$val + rnorm(n)*3.14
This creates two data sets, dfA and dfB. It is assumed that they are linked via the tie column.
> head(dfA)
group tie val
1 A A -9.835
2 A B 35.575
3 A C 13.117
4 A D 18.802
5 A E -29.504
6 A F -56.461
> head(dfB)
group tie val
1 B A -12.62
2 B B 32.43
3 B C 7.83
4 B D 17.27
5 B E -28.56
6 B F -59.93
The trick now is to create two versions of the data, one long and one wide. (It doesn't matter whether you start with the long data, as shown here, or with the wide data; you just need to have both in the end.)
dflong <- rbind(dfA, dfB)
dfwide <- merge(dfA, dfB, by='tie')
That gives us:
> head(dflong, 10)
group tie val
1 A A -9.835
2 A B 35.575
3 A C 13.117
4 A D 18.802
5 A E -29.504
6 A F -56.461
7 A G 44.464
8 B A -12.625
9 B B 32.430
10 B C 7.830
> head(dfwide)
tie group.x val.x group.y val.y
1 A A -9.835 B -12.62
2 B A 35.575 B 32.43
3 C A 13.117 B 7.83
4 D A 18.802 B 17.27
5 E A -29.504 B -28.56
6 F A -56.461 B -59.93
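If the data arrive in wide form instead, the long version can be built by stacking the value columns. A minimal base R sketch, where the wide data frame and its column names val.A and val.B are made up for illustration:

```r
set.seed(42)
n <- 7
# Hypothetical wide starting point: one row per tie, one value column per group
wide <- data.frame(tie = LETTERS[1:n], val.A = rnorm(n) * 27.1)
wide$val.B <- wide$val.A + rnorm(n) * 3.14

# Stack the two value columns to obtain the long version
long <- rbind(
  data.frame(group = "A", tie = wide$tie, val = wide$val.A),
  data.frame(group = "B", tie = wide$tie, val = wide$val.B)
)
```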
Now ggplot the two different versions of the same data together:
ggplot(data=dflong, aes(x=group, y=val, color=group)) +
  theme_bw() +
  geom_boxplot() +
  geom_segment(data=dfwide, aes(x=group.x, xend=group.y, y=val.x, yend=val.y, color='A'), color='grey') +
  geom_text(aes(label=tie))
That yields the desired plot.
NB: the color='A' in the aes() of geom_segment() is a hack needed to sneak the wide second data set into the regime set by the long one. Without it, the layer would inherit color=group from ggplot(), and dfwide has no group column. It is then overridden by the color='grey' setting outside the aes().
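As a possible alternative to the color hack, the segment layer can be told not to inherit the plot-level aesthetics at all by passing inherit.aes = FALSE. The sketch below rebuilds the sample data from this answer (with a seed added for reproducibility) and is only one way of doing it:

```r
library(ggplot2)

set.seed(1)
n <- 7
dfA <- data.frame(group = "A", tie = LETTERS[1:n], val = rnorm(n) * 27.1)
dfB <- dfA
dfB$group <- "B"
dfB$val <- dfA$val + rnorm(n) * 3.14
dflong <- rbind(dfA, dfB)
dfwide <- merge(dfA, dfB, by = "tie")

p <- ggplot(dflong, aes(x = group, y = val, color = group)) +
  theme_bw() +
  geom_boxplot() +
  # inherit.aes = FALSE: this layer uses only the aesthetics given here,
  # so the plot-level color = group is never evaluated against dfwide
  geom_segment(data = dfwide,
               aes(x = group.x, xend = group.y, y = val.x, yend = val.y),
               color = "grey", inherit.aes = FALSE) +
  geom_text(aes(label = tie))
print(p)
```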
If there is no 1:1 relationship between dfA and dfB, that would need to be handled separately.
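One way to check for that situation before plotting is sketched below. The small data frames are made up for illustration. Note that merge() performs an inner join by default, so ties present in only one set are silently dropped and get no connecting segment.

```r
# Made-up example in which tie "A" exists only in dfA and tie "D" only in dfB
dfA <- data.frame(group = "A", tie = c("A", "B", "C"), val = c(1, 2, 3))
dfB <- data.frame(group = "B", tie = c("B", "C", "D"), val = c(2.5, 3.5, 4.5))

# Inner join (merge's default): only ties present in both sets are kept
dfwide <- merge(dfA, dfB, by = "tie")

# Ties that appear in only one set; these will have no connecting line
unpaired <- setdiff(union(dfA$tie, dfB$tie), dfwide$tie)

# Guard against duplicated ties, which would multiply rows in the merge
stopifnot(!anyDuplicated(dfA$tie), !anyDuplicated(dfB$tie))
```

Whether unpaired observations should be dropped, flagged or plotted without a line depends on the analysis.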
Upvotes: 1