Reputation: 809
I am trying to make a multi-line plot using ggplot. I also want to plot some points over the lines (they mark significance). The problem is that when I use geom_point, the points do not coincide with the lines they belong to. These are the data:
dat1:
1-4 2-5 3-6 4-7 5-8 6-9 7-10 id
mod1 -0.035930 0.121970 0.34689 0.034345 0.35312 0.52048 0.58536 mod1
mod2 -0.094121 0.297150 0.37262 0.512140 0.63918 0.42127 0.73890 mod2
mod3 0.810550 0.876070 0.57120 0.472640 0.67341 0.79332 0.80882 mod3
mod4 -0.121970 0.010009 0.49783 0.920100 0.76192 0.45662 0.45526 mod4
dat2:
1-4 2-5 3-6 4-7 5-8 6-9 7-10 id
mod1 NaN NaN NaN NaN NaN NaN NaN mod1
mod2 NaN NaN NaN NaN 0.63918 NaN NaN mod2
mod3 0.81055 0.87607 NaN NaN 0.67341 0.79332 0.80882 mod3
mod5 NaN NaN NaN 0.9201 0.76192 NaN NaN mod4
The plot will contain 4 lines, one per model. For the models that have non-NaN values in dat2, I want to mark those values with a point on the corresponding line.
This is my attempt:
#Start Plotting
library(ggplot2)
library(reshape2)
dat_r$id <- nam_model #names of models
dat_r1$id <- nam_model
df <- melt(dat_r,id='id')
df2 <-melt(dat_r1,id='id')
p <-ggplot(df, aes(x=variable,y=value, group=id)) +
geom_line(aes(color=id), lwd=1) + geom_point(aes(x=df2$variable,y=df2$value, group=df$id),size = 4)
Any suggestions? I appreciate any ideas.
Thanks in advance.
Upvotes: 2
Views: 430
Reputation: 67828
Here's one possibility:
Read the data, using check.names = FALSE because your variable names are not syntactically valid.
dat1 <- read.table(text = " 1-4 2-5 3-6 4-7 5-8 6-9 7-10 id
mod1 -0.035930 0.121970 0.34689 0.034345 0.35312 0.52048 0.58536 mod1
mod2 -0.094121 0.297150 0.37262 0.512140 0.63918 0.42127 0.73890 mod2
mod3 0.810550 0.876070 0.57120 0.472640 0.67341 0.79332 0.80882 mod3
mod4 -0.121970 0.010009 0.49783 0.920100 0.76192 0.45662 0.45526 mod4",
header = TRUE, check.names = FALSE)
dat2 <- read.table(text = " 1-4 2-5 3-6 4-7 5-8 6-9 7-10 id
mod1 NaN NaN NaN NaN NaN NaN NaN mod1
mod2 NaN NaN NaN NaN 0.63918 NaN NaN mod2
mod3 0.81055 0.87607 NaN NaN 0.67341 0.79332 0.80882 mod3
mod5 NaN NaN NaN 0.9201 0.76192 NaN NaN mod4",
header = TRUE, check.names = FALSE)
melt the data to long format:
library(reshape2)
dat1m <- melt(dat1, id.vars = "id")
dat2m <- melt(dat2, id.vars = "id")
Plotting with data set for the lines, and another for the points:
library(ggplot2)
ggplot(data = dat1m, aes(x = variable, y = value, colour = id, group = id)) +
geom_line() +
geom_point(data = dat2m, size = 4)
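A small note: in your aes call, avoid code like dataset$variable (e.g. df2$value). ggplot looks up aes variables in the data of each layer, and a column pulled in with $ bypasses that lookup, so it can end up misaligned with the rest of the layer's data.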
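As a possible refinement of the points layer above: geom_point drops rows whose y value is NaN and warns about the removed rows. A minimal sketch, assuming you would rather filter those rows out yourself before plotting. The data frame pts is a small illustrative subset of dat2 in long format, and pts and sig are names introduced only for this example.

```r
library(ggplot2)

# Illustrative long-format subset of dat2 (column names as produced by melt)
pts <- data.frame(
  id       = c("mod2", "mod2", "mod3", "mod3"),
  variable = factor(c("4-7", "5-8", "4-7", "5-8")),
  value    = c(NaN, 0.63918, NaN, 0.67341)
)

# is.na() is TRUE for NaN as well as NA, so this keeps only the real values
sig <- pts[!is.na(pts$value), ]

# Only the non-missing values remain, so no rows are dropped at draw time
p <- ggplot(sig, aes(x = variable, y = value, colour = id)) +
  geom_point(size = 4)
```

If you prefer to keep the NaN rows in the data, passing na.rm = TRUE to geom_point should silence the warning instead.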
Upvotes: 1
Reputation: 83275
First read the data:
dat1 <- read.table(header=TRUE, check.names=FALSE, text="1-4 2-5 3-6 4-7 5-8 6-9 7-10 id
-0.035930 0.121970 0.34689 0.034345 0.35312 0.52048 0.58536 mod1
-0.094121 0.297150 0.37262 0.512140 0.63918 0.42127 0.73890 mod2
0.810550 0.876070 0.57120 0.472640 0.67341 0.79332 0.80882 mod3
-0.121970 0.010009 0.49783 0.920100 0.76192 0.45662 0.45526 mod4")
dat2 <- read.table(header=TRUE, check.names=FALSE, text="1-4 2-5 3-6 4-7 5-8 6-9 7-10 id
NaN NaN NaN NaN NaN NaN NaN mod1
NaN NaN NaN NaN 0.63918 NaN NaN mod2
0.81055 0.87607 NaN NaN 0.67341 0.79332 0.80882 mod3
NaN NaN NaN 0.9201 0.76192 NaN NaN mod4")
Then transform the data into long format with the reshape2 package:
library(reshape2)
df1 <- melt(dat1, id="id")
df2 <- melt(dat2, id="id")
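Alternatively, you can use a combination of the dplyr and tidyr packages. Note that this names the key column var rather than variable, and the plots below use var: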
library(dplyr)
library(tidyr)
df1 <- dat1 %>% gather(var, value, 1:7)
df2 <- dat2 %>% gather(var, value, 1:7)
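In tidyr 1.0.0 and later, gather is superseded by pivot_longer. A minimal sketch of the equivalent reshape, using a small illustrative subset of dat1 (two rows, two measurement columns) rather than the full data:

```r
library(tidyr)

# Two-row, two-column illustrative subset of dat1
dat1 <- data.frame(`1-4` = c(-0.035930, -0.094121),
                   `2-5` = c(0.121970, 0.297150),
                   id = c("mod1", "mod2"),
                   check.names = FALSE)

# Every column except id becomes a (var, value) pair
df1 <- pivot_longer(dat1, cols = -id, names_to = "var", values_to = "value")
```

One difference worth knowing: pivot_longer returns the rows grouped by original row (all of mod1, then all of mod2), whereas gather stacks them column by column. If you later combine two frames by position, as with cbind, reshape both of them the same way.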
Binding the data together in one data frame (which is not strictly necessary):
dat <- cbind(df1,df2[,3])
names(dat) <- c("id","var","value1","value2")
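cbind matches rows purely by position, so it relies on df1 and df2 being in exactly the same order. A minimal sketch of an order-independent alternative using base R's merge, assuming both long-format frames have the columns id, var and value. The two small frames below are illustrative stand-ins, not the full data.

```r
# Illustrative long-format frames; df2 is deliberately in a different row order
df1 <- data.frame(id = c("mod2", "mod3"), var = c("5-8", "5-8"),
                  value = c(0.63918, 0.67341))
df2 <- data.frame(id = c("mod3", "mod2"), var = c("5-8", "5-8"),
                  value = c(0.67341, NaN))

# Join on the key columns; the clashing 'value' columns get the suffixes 1 and 2
dat <- merge(df1, df2, by = c("id", "var"), suffixes = c("1", "2"))

names(dat)  # "id" "var" "value1" "value2"
```

The result has the same column names as the cbind version, so the plotting code can stay unchanged.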
Finally create the plot:
ggplot(data=dat, aes(x=var, y=value1, color=id, group=id)) +
geom_line(lwd=1) +
geom_point(aes(y=value2), size=4) +
scale_x_discrete("\nModels") +
scale_y_continuous("Value", breaks=c(0,0.2,0.4,0.6,0.8)) +
theme_bw()
If you don't want to bind the data together in one data frame, you can pass the second data set to geom_point instead:
ggplot(data=df1, aes(x=var, y=value, color=id, group=id)) +
geom_line(lwd=1) +
geom_point(data=df2, size=4) +
scale_x_discrete("\nModels") +
scale_y_continuous("Value", breaks=c(0,0.2,0.4,0.6,0.8)) +
theme_bw()
Upvotes: 2