Reputation: 55
I'm trying to make a dotplot in R with the lattice and latticeExtra packages. However, the values on the vertical (y) axis are not represented properly: instead of using the actual values of the numeric variable, R plots the rank of each value. That is, the values are [375, 500, 625, 750, ..., 3000], but R plots their ranks [1, 2, 3, 4, ..., 23] and chooses the scale accordingly. Has anyone run into this problem? How can I get a proper vertical axis with ticks at (0, 500, 1000, 1500, ...)?
Here is the code so far:
df.dose <- read.table("data.csv", sep=",", header=TRUE)
library(lattice); library(latticeExtra)
useOuterStrips(dotplot(z ~ sample.size | as.factor(effect.size)*as.factor(true.dose),
groups=as.factor(type), data=df.dose, as.table=TRUE))
(Added from comment below): Also, can error bars be added to the graph? I thought of the following (to be added to the call), but it doesn't seem to work. Is it possible somehow?
up=z+se, lo=z-se, panel.groups=function(x,y,..., up, lo, subscripts){
up <- up[subscripts]
lo <- lo[subscripts]
panel.segments(lo, as.numeric(y), up, as.numeric(y), ...)
}
Here's my data: https://www.dropbox.com/s/egy25cj00rhum40/data.csv
Added: here's the relevant portion of the data, built with expand.grid and dput:
df.dose <- expand.grid(effect.size=c(-.5, -.625, -0.75),
sample.size=c(40L, 60L, 80L),
true.dose=c(375L, 500L, 750L, 1125L),
type=c("dose", "categ", "FP2", "FP1"))
df.dose$z <- c(875L, 875L, 750L, 750L, 750L, 625L, 625L, 625L, 625L, 875L,
875L, 750L, 1000L, 1000L, 1000L, 1125L, 1000L, 875L, 1000L, 1000L,
875L, 1000L, 1000L, 875L, 1125L, 1000L, 1000L, 1250L, 1125L,
1000L, 1250L, 1250L, 1125L, 1250L, 1000L, 1000L, 500L, 500L,
500L, 500L, 500L, 500L, 500L, 500L, 500L, 625L, 625L, 625L, 625L,
625L, 625L, 625L, 625L, 625L, 750L, 750L, 625L, 750L, 750L, 750L,
750L, 750L, 750L, 875L, 875L, 750L, 750L, 875L, 875L, 875L, 875L,
875L, 2500L, 1500L, 1125L, 2000L, 1000L, 1750L, 250L, 500L, 500L,
1250L, 750L, 625L, 875L, 500L, 500L, 875L, 500L, 375L, 1250L,
875L, 750L, 1000L, 625L, 625L, 875L, 500L, 500L, 1125L, 1000L,
875L, 1125L, 875L, 625L, 1125L, 1000L, 625L, 2500L, 2125L, 2375L,
2000L, 750L, 2625L, 250L, 625L, 250L, 875L, 875L, 500L, 625L,
500L, 625L, 1000L, 500L, 375L, 1000L, 875L, 625L, 875L, 500L,
500L, 875L, 500L, 500L, 1250L, 1125L, 875L, 1125L, 875L, 750L,
1250L, 1000L, 625L)
Upvotes: 4
Views: 6821
Reputation: 6410
You can also use xYplot from the Hmisc package to get a result similar to @Aaron's, although it might be a bit tricky to reproduce the jitter he got:
library(Hmisc)  # provides xYplot() and Cbind(); assumes df.dose already has an se column
a <- xYplot(Cbind(z, z-se, z+se) ~ sample.size | as.factor(effect.size) * as.factor(true.dose),
groups=as.factor(type), data=df.dose, as.table=TRUE, auto.key=list(space="top"))
useOuterStrips(a)
But is this really an informative plot? Does it show the effects in your data well and highlight your comparisons? Does it reveal any trends? To see all the factors you want to plot more clearly, I would first connect the points within each group with lines, which makes the individual effects across the different sample.size values easier to see.
key.variety <- list(space = "top",
text = list(levels(df.dose$type)),
points = list(pch = 0:3, col = "black"))
a <- xyplot(z ~ as.factor(sample.size) | as.factor(effect.size)*as.factor(true.dose),
df.dose, type = "o", as.table=TRUE, groups = type, key = key.variety,
lty = 1, pch = 0:3, col.line = "darkgrey", col.symbol = "black")
useOuterStrips(a)
But something is still hidden there, and the density of the data still produces too much noise. Let's drop effect.size from the conditioning and plot a regression line, although that is probably a sin with so few data points.
a <- xyplot(z ~ as.factor(sample.size) | as.factor(type)*as.factor(true.dose),
data=df.dose, as.table=TRUE,
panel = function(x, y) {
  panel.xyplot(x, y, jitter.x = TRUE, col = 1)  # points, jittered sideways
  panel.lmline(x, y, col = 1, lwd = 1.5)        # least-squares line per panel
})
useOuterStrips(a)
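One caveat about the regression line, shown as a minimal base-R sketch: because x is as.factor(sample.size), the fit is made on the factor codes 1, 2, 3 rather than on 40, 60, 80 (panel.lmline converts x with as.numeric before fitting). The numbers below are made up for illustration, and fit.codes and fit.values are hypothetical names. With equally spaced levels the drawn line looks the same, but the slope is per level step, not per unit of sample size.

```r
# Made-up values for one panel; not taken from data.csv
sample.size <- factor(c(40, 60, 80))
z <- c(875, 750, 625)

# What a fit on the factor codes (1, 2, 3) sees
fit.codes <- lm(z ~ as.numeric(sample.size))

# A fit on the actual sample sizes, recovered from the level labels
fit.values <- lm(z ~ as.numeric(as.character(sample.size)))

slope.codes  <- unname(coef(fit.codes)[2])   # change in z per factor level
slope.values <- unname(coef(fit.values)[2])  # change in z per unit of sample size

print(c(per.level = slope.codes, per.unit = slope.values))
```

If the slope itself matters, it is probably safer to fit on the numeric sample sizes and use the factor only for display.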
I know I might not have convinced you, but sometimes it's better to unload some factors from a plot to get a better look at the data. Showing the factors separately can sometimes be more accessible visually.
Upvotes: 3
Reputation: 37784
You need to make z a factor: dotplot(factor(z) ~ ...
Also, you probably want some jitter in the plot to prevent overlap; try adding jitter.x=TRUE or jitter.y=TRUE, or both.
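To see why the axis can end up showing ranks, it may help to look at what happens when a numeric vector becomes a factor. The sketch below uses a few made-up values in the spirit of the question (not the real data) and only base R: the factor stores integer codes 1, 2, 3, ..., which are the positions a categorical axis uses, while the original values survive only as level labels.

```r
# Illustrative values only; not taken from data.csv
z <- c(375, 500, 625, 750, 3000)

f <- factor(z)

# Integer codes: the equally spaced positions a categorical axis uses
codes <- as.integer(f)

# Level labels: the original values, kept only as text for the tick labels
labels <- levels(f)

print(codes)
print(labels)
```

So wrapping z in factor() gives the right tick labels, but the gap between 750 and 3000 is drawn exactly as wide as the gap between 375 and 500.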
Judging by your comment below and looking at the data again, I think you're plotting the dotplot the wrong way around. I think you want the lines to be for the sample sizes, not for the z's. If you really want z on the vertical axis, you then need to add horizontal=FALSE. You could also swap what is on the horizontal and vertical axes.
useOuterStrips(dotplot(z ~ factor(sample.size) |
as.factor(effect.size)*as.factor(true.dose),
groups=as.factor(type), data=df.dose,
as.table=TRUE, horizontal=FALSE, jitter.x=TRUE))
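As a quick check that horizontal=FALSE keeps z numeric, one can compare the y limits stored in the two trellis objects. This is a minimal sketch on a small made-up data frame (toy, p.default and p.vertical are illustrative names). It assumes that the y.limits component of a trellis object holds the factor levels for a categorical axis and a numeric range for a continuous one; treat that as an assumption about lattice internals rather than documented behaviour. The scales argument asks for ticks at 0, 500, 1000, ... as in the question.

```r
library(lattice)

# Small made-up data set; column names mirror the question
toy <- data.frame(
  sample.size = rep(c(40, 60, 80), times = 2),
  z           = c(375, 500, 3000, 625, 750, 1125)
)

# Both variables numeric: dotplot treats y as categorical by default
p.default <- dotplot(z ~ sample.size, data = toy)

# horizontal = FALSE: x is the categorical variable and z stays numeric
p.vertical <- dotplot(z ~ factor(sample.size), data = toy,
                      horizontal = FALSE,
                      scales = list(y = list(at = seq(0, 3000, by = 500))))

# Compare how the two objects store their y-axis limits
print(is.numeric(p.default$y.limits))
print(is.numeric(p.vertical$y.limits))
```

Calling print(p.vertical) draws the plot; the same scales argument can be added to the full call above.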
To add an error bar, it's a little more complicated because you have groups within the panels, so you need to use a panel.groups function; additionally, so that the lines don't overlap, you probably want to jitter them from side to side a little, which is best done in a custom panel function.
df.dose$se <- 200
df.dose$type <- factor(df.dose$type)
df.dose$sample.size <- factor(df.dose$sample.size)
panel.groups.mydotplot <- function(x, y, subscripts, up, lo,
col=NA, col.line=NA, ...) {
panel.points(x, y, ...)
panel.segments(x, lo[subscripts], x, up[subscripts], col=col.line, ...)
}
panel.mydotplot <- function(x, y, subscripts, groups, ..., jitter=0.1) {
jitter <- seq(-1,1,len=nlevels(groups))*jitter
xx <- as.numeric(x) + jitter[as.numeric(groups[subscripts])]
panel.dotplot(x, y, groups=groups, subscripts=subscripts, pch=NA, ...)
panel.superpose(xx, y, groups=groups, subscripts=subscripts,
panel.groups=panel.groups.mydotplot, ...)
}
pp <- dotplot(z ~ sample.size | as.factor(effect.size)*as.factor(true.dose),
groups=type, data=df.dose, as.table=TRUE, horizontal=FALSE,
up=df.dose$z + df.dose$se, lo=df.dose$z - df.dose$se,
panel=panel.mydotplot, auto.key=list(space="right"))
useOuterStrips(pp)
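The offset arithmetic inside panel.mydotplot can be checked on its own. A minimal sketch, assuming four groups as in the data and the default jitter=0.1; the vectors groups and x below are made up to stand in for one panel's points.

```r
# Four group levels, as in df.dose$type
groups <- factor(c("dose", "categ", "FP2", "FP1"),
                 levels = c("dose", "categ", "FP2", "FP1"))
jitter <- 0.1

# One evenly spaced offset per group, running from -jitter to +jitter
offsets <- seq(-1, 1, length.out = nlevels(groups)) * jitter

# Made-up panel: all four groups observed at the second sample-size level
x <- factor(rep("60", 4), levels = c("40", "60", "80"))

# Shift each point sideways by the offset that belongs to its group
xx <- as.numeric(x) + offsets[as.numeric(groups)]

print(offsets)
print(xx)
```

The four points end up spread between 1.9 and 2.1 on the x axis, so their error bars no longer sit on top of each other.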
Upvotes: 7
Reputation: 109954
I'm not sure I understand the problem, and you asked for a lattice solution, but I thought it might be helpful to see this done with ggplot2:
ggplot(data=df.dose, aes(x=sample.size, y=as.factor(z), colour=type)) +
geom_point() + facet_grid(true.dose~effect.size)
Or we can free the scales with:
ggplot(data=df.dose, aes(x=sample.size, y=as.factor(z), colour=type)) +
geom_point() + facet_grid(true.dose~effect.size, scales="free")
Upvotes: 3