Andres

Reputation: 55

dotplot in R with lattice: display of vertical axis and error bars

I'm trying to make a dotplot with the lattice and latticeExtra libraries in R. However, the values on the vertical y-axis are not represented properly. Instead of using the actual values of the numeric variable, R plots the rank of each value. That is, the values are [375, 500, 625, 750, ..., 3000], and R plots their ranks [1, 2, 3, 4, ..., 23] and chooses the scale accordingly. Has anyone experienced a problem like this? How can I get a proper representation with ticks like (0, 500, 1000, 1500, ...) on the vertical y-scale?

Here is the code so far:

df.dose <- read.table("data.csv", sep=",", header=TRUE)
library(lattice); library(latticeExtra)

useOuterStrips(dotplot(z ~ sample.size | as.factor(effect.size)*as.factor(true.dose),
               groups=as.factor(type), data=df.dose, as.table=TRUE))

(Added from comment below): Also, can error bars be added to the graph? I thought of the following (to be added to the call), but it doesn't seem to work. Is it possible somehow?

up=z+se, lo=z-se, panel.groups=function(x,y,..., up, lo, subscripts){ 
   up <- up[subscripts]
   lo <- lo[subscripts]
   panel.segments(lo, as.numeric(y), up, as.numeric(y), ...)
}

Here's my data: https://www.dropbox.com/s/egy25cj00rhum40/data.csv

Added: here's the relevant portion of the data using expand.grid and dput:

df.dose <- expand.grid(effect.size=c(-.5, -.625, -0.75),
                       sample.size=c(40L, 60L, 80L),
                       true.dose=c(375L, 500L, 750L, 1125L),
                       type=c("dose", "categ", "FP2", "FP1"))
df.dose$z <- c(875L, 875L, 750L, 750L, 750L, 625L, 625L, 625L, 625L, 875L, 
875L, 750L, 1000L, 1000L, 1000L, 1125L, 1000L, 875L, 1000L, 1000L, 
875L, 1000L, 1000L, 875L, 1125L, 1000L, 1000L, 1250L, 1125L, 
1000L, 1250L, 1250L, 1125L, 1250L, 1000L, 1000L, 500L, 500L, 
500L, 500L, 500L, 500L, 500L, 500L, 500L, 625L, 625L, 625L, 625L, 
625L, 625L, 625L, 625L, 625L, 750L, 750L, 625L, 750L, 750L, 750L, 
750L, 750L, 750L, 875L, 875L, 750L, 750L, 875L, 875L, 875L, 875L, 
875L, 2500L, 1500L, 1125L, 2000L, 1000L, 1750L, 250L, 500L, 500L, 
1250L, 750L, 625L, 875L, 500L, 500L, 875L, 500L, 375L, 1250L, 
875L, 750L, 1000L, 625L, 625L, 875L, 500L, 500L, 1125L, 1000L, 
875L, 1125L, 875L, 625L, 1125L, 1000L, 625L, 2500L, 2125L, 2375L, 
2000L, 750L, 2625L, 250L, 625L, 250L, 875L, 875L, 500L, 625L, 
500L, 625L, 1000L, 500L, 375L, 1000L, 875L, 625L, 875L, 500L, 
500L, 875L, 500L, 500L, 1250L, 1125L, 875L, 1125L, 875L, 750L, 
1250L, 1000L, 625L)

Upvotes: 4

Views: 6821

Answers (3)

Geek On Acid

Reputation: 6410

You can also use xYplot from the Hmisc package to achieve a solution similar to @Aaron's, although it might be a bit tricky to get the same jitter he got:

a <- xYplot(Cbind(z, z-se, z+se) ~ sample.size | as.factor(effect.size) * as.factor(true.dose),
            groups=as.factor(type), data=df.dose, as.table=TRUE, auto.key=list(space="top"))
useOuterStrips(a)


But is this really an informative plot? Does it show the effects in your data well and highlight your comparisons? Does it reveal any trends in the data? To see all the factors you want to plot more clearly, I would first connect the points within each group with lines, which makes the individual effects across the different values of sample.size easier to follow.

key.variety <- list(space = "top", 
                    text = list(levels(df.dose$type)),
                    points = list(pch = 0:3, col = "black"))
a <- xyplot(z ~ as.factor(sample.size) | as.factor(effect.size)*as.factor(true.dose),
            df.dose, type = "o", as.table=TRUE, groups = type, key = key.variety, 
            lty = 1, pch = 0:3, col.line = "darkgrey", col.symbol = "black")
useOuterStrips(a)


But something is still hidden there, and there is still too much noise because of the density of the data. Let's drop effect.size and plot a regression line instead, although that is probably a sin with so few data points.

a <- xyplot(z ~ as.factor(sample.size) | as.factor(type)*as.factor(true.dose), 
            data=df.dose, as.table=TRUE, 
            panel = function(x, y){
               # jittered points plus a least-squares line in each panel
               panel.xyplot(x, y, jitter.x = TRUE, col=1)
               panel.lmline(x, y, col=1, lwd=1.5)
            })
useOuterStrips(a)


I know I might not have convinced you, but sometimes it's better to relieve a plot of some of its factors to get a better look at the data. Sometimes the plot is easier to read if you show the factors separately.

Upvotes: 3

Aaron - mostly inactive

Reputation: 37784

You need to make z a factor: dotplot(factor(z) ~ ...

Also you probably want some jitter in the plot to prevent overlap; try adding jitter.x=TRUE or jitter.y=TRUE, or both.

Judging by your comment below and looking at the data again, I think you're plotting the dotplot the wrong way round. I think you want the lines to be for the sample sizes, not for the z's. If you really want z on the vertical axis, you then need to make sample.size a factor and add horizontal=FALSE. You could also swap what is on the horizontal and vertical axes.

useOuterStrips(dotplot(z ~ factor(sample.size) | 
                             as.factor(effect.size)*as.factor(true.dose),
                  groups=as.factor(type), data=df.dose,  
                  as.table=TRUE, horizontal=FALSE, jitter.x=TRUE))
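As far as I understand lattice's defaults, the ranks in the original plot appear because dotplot treats the y variable as categorical whenever the x variable is not a factor, so z gets coerced to a factor and its level codes are what land on the axis. The following base-R sketch illustrates only that coercion; the short z vector is a made-up stand-in for the real data, and z_codes and z_back are names chosen for illustration.

```r
# Toy values standing in for z (hypothetical, not the full data set)
z <- c(375, 500, 625, 750, 3000)

# Coercing to a factor replaces each value by the index of its level;
# these codes are what a categorical axis positions the points at
z_codes <- as.integer(factor(z))
print(z_codes)

# The original numbers can be recovered from the factor labels
z_back <- as.numeric(as.character(factor(z)))
print(z_back)
```

With sample.size as a factor and horizontal=FALSE, z stays numeric, so the vertical axis keeps its real scale.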

Adding error bars is a little more complicated because you have groups within the panels, so you need to use a panel.groups function. Additionally, so that the bars don't overlap, you probably want to jitter them from side to side a little, which is best done in a custom panel function.

df.dose$se <- 200
df.dose$type <- factor(df.dose$type)
df.dose$sample.size <- factor(df.dose$sample.size)

panel.groups.mydotplot <- function(x, y, subscripts, up, lo, 
                                   col=NA, col.line=NA, ...) {
  panel.points(x, y, ...)
  panel.segments(x, lo[subscripts], x, up[subscripts], col=col.line, ...)
}
panel.mydotplot <- function(x, y, subscripts, groups, ..., jitter=0.1) {
  jitter <- seq(-1, 1, length.out=nlevels(groups))*jitter
  xx <- as.numeric(x) + jitter[as.numeric(groups[subscripts])]
  panel.dotplot(x, y, groups=groups, subscripts=subscripts, pch=NA, ...)
  panel.superpose(xx, y, groups=groups, subscripts=subscripts,  
                  panel.groups=panel.groups.mydotplot, ...)
}
pp <- dotplot(z ~ sample.size | as.factor(effect.size)*as.factor(true.dose),
              groups=type, data=df.dose, as.table=TRUE, horizontal=FALSE,
              up=df.dose$z + df.dose$se, lo=df.dose$z - df.dose$se,
              panel=panel.mydotplot, auto.key=list(space="right"))
useOuterStrips(pp)
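To make the side-to-side offset in panel.mydotplot easier to follow, here is a small standalone sketch of the same arithmetic outside lattice. It assumes four groups and the default jitter of 0.1; group_index and x_pos are made-up inputs for illustration, not values taken from the data.

```r
# Assumed settings: four groups (as in type) and the default jitter width
n_groups <- 4
jitter <- 0.1

# One evenly spaced offset per group, from -jitter to +jitter
offsets <- seq(-1, 1, length.out = n_groups) * jitter

# Hypothetical points: group code and x position (factor level as a number)
group_index <- c(1, 2, 3, 4, 1)
x_pos <- c(1, 1, 1, 1, 2)

# Shift each point sideways by the offset belonging to its group
xx <- x_pos + offsets[group_index]
print(xx)
```

Each group therefore always sits at the same offset from the tick for its sample size, so the error bars of different groups line up next to each other rather than on top of one another.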


Upvotes: 7

Tyler Rinker

Reputation: 109954

I'm not sure I understand the problem, and you asked for a lattice solution, but I thought it might be helpful to see this done with ggplot2:

library(ggplot2)
ggplot(data=df.dose, aes(x=sample.size, y=as.factor(z), colour=type)) +
    geom_point() + facet_grid(true.dose~effect.size)


Or we can free the scales with:

ggplot(data=df.dose, aes(x=sample.size, y=as.factor(z), colour=type)) +
    geom_point() + facet_grid(true.dose~effect.size, scales="free")


Upvotes: 3
