Daniel Valencia C.

Reputation: 2279

Quantile calculations using R and GraphPad Prism

I'm new to R. Before using R, I used GraphPad Prism 7.0, so now I'm trying to compare the two as data processors. I found a difference in the quantile calculations. Does anyone know why they are different?

In R I have

par(pty = "s", cex.axis = 1, las = 1, cex.lab = 1)
a1 <- c(22.02, 23.83, 26.67, 25.38, 25.49, 23.50, 25.90, 24.89, 25)
a2 <- c(21.49, 22.67, 24.62, 24.18, 22.78, 22.56, 24.46, 23.79, 25)
a3 <- c(20.33, 21.67, 24.67, 22.45, 22.29, 21.95, 20.49, 21.81, 25)
boxplot(a1, a2, a3, names = c("a1", "a2", "a3"), ylab = "Valor", ylim = c(20, 28))

[Image: boxplot of a1, a2, and a3 produced by R]

And the quantiles for a3 are

quantile(a3)
   0%   25%   50%   75%  100% 
20.33 21.67 21.95 22.45 25.00

Plotting the same data in GraphPad Prism:

Graph family: Column; Box & whiskers; Plot: Tukey

I get

[Image: GraphPad Prism box-and-whiskers plot of the same data]

And the quantiles are

[Image: GraphPad Prism table of quantiles]

Why are they different (particularly for a3)?

Why does R recognize 4 outliers in a3 while GraphPad does not?

Any suggestions?

Upvotes: 1

Views: 887

Answers (2)

Roland

Reputation: 132706

Answering the question of how to use a different quantile calculation in a boxplot:

This is easy with ggplot2.

DF <- data.frame(a1, a2, a3)
DF <- stack(DF)

quants <- tapply(DF$values, list(DF$ind), quantile, type = 6)
quants <- as.data.frame(do.call(rbind, quants))
quants$g <- rownames(quants)

library(ggplot2)
ggplot(quants, aes(x = g, lower = `25%`, 
                   middle = `50%`, upper = `75%`,
                   ymin = `0%`, ymax = `100%`)) +
  geom_boxplot(stat = "identity")

[Image: resulting plot]

You can then customize this plot further as explained in many ggplot2 tutorials.

PS: However, I would use R's default boxplot stats since these try to reproduce Tukey's boxplot.
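To see why the default plot shows four outliers in a3, here is a minimal sketch of the rule that base R's boxplot() applies: the box comes from Tukey's hinges (fivenum()), and points more than 1.5 box widths beyond a hinge are drawn separately. The variable names are only for illustration. The last lines apply the same rule to type = 6 quartiles for comparison.

```r
a3 <- c(20.33, 21.67, 24.67, 22.45, 22.29, 21.95, 20.49, 21.81, 25)

# boxplot() draws the box from Tukey's hinges, as returned by fivenum().
five <- fivenum(a3)
hinge_spread <- five[4] - five[2]   # width of the box

# Points beyond 1.5 box widths from either hinge are drawn as outliers.
lower_fence <- five[2] - 1.5 * hinge_spread
upper_fence <- five[4] + 1.5 * hinge_spread
outliers <- sort(a3[a3 < lower_fence | a3 > upper_fence])

print(five)
print(outliers)

# boxplot.stats() is the helper that boxplot() calls; it should agree.
stopifnot(identical(sort(boxplot.stats(a3)$out), outliers))

# Same rule, but starting from type = 6 quartiles: wider box, wider fences.
q6 <- quantile(a3, probs = c(0.25, 0.75), type = 6, names = FALSE)
fences6 <- c(q6[1] - 1.5 * diff(q6), q6[2] + 1.5 * diff(q6))
outliers6 <- a3[a3 < fences6[1] | a3 > fences6[2]]
print(outliers6)
```

With the hinges the box for a3 is narrow, so the two smallest and two largest values fall outside the fences. With the type = 6 quartiles the box is wide enough that no value does.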

Upvotes: 1

Daniel Valencia C.

Reputation: 2279

As @lmo says, R has several ways to calculate quantiles (the type argument of quantile()). By default, R uses type = 7. GraphPad uses a method equivalent to type = 6 in R. So the solution I found was

par(pty = "s", cex.axis = 1, las = 1, cex.lab = 1)
a1 <- c(22.02, 23.83, 26.67, 25.38, 25.49, 23.50, 25.90, 24.89, 25)
a2 <- c(21.49, 22.67, 24.62, 24.18, 22.78, 22.56, 24.46, 23.79, 25)
a3 <- c(20.33, 21.67, 24.67, 22.45, 22.29, 21.95, 20.49, 21.81, 25)
# Plot the five type = 6 quantiles of each sample instead of the raw data
boxplot(
  quantile(a1, type = 6),
  quantile(a2, type = 6),
  quantile(a3, type = 6),
  names = c("a1", "a2", "a3"), ylab = "Valor", ylim = c(20, 28))

[Image: boxplot drawn from the type = 6 quantiles]
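Passing the five quantiles to boxplot() works here because the five-number summary of five values is just those values, but the whiskers can then only reach the minimum and maximum. A more general sketch, assuming the goal is Tukey's 1.5 x IQR rule applied to type = 6 quartiles, is to compute the statistics yourself and hand them to bxp(). The helper tukey_stats6() is hypothetical, and I have not checked this rule against GraphPad's own documentation.

```r
a1 <- c(22.02, 23.83, 26.67, 25.38, 25.49, 23.50, 25.90, 24.89, 25)
a2 <- c(21.49, 22.67, 24.62, 24.18, 22.78, 22.56, 24.46, 23.79, 25)
a3 <- c(20.33, 21.67, 24.67, 22.45, 22.29, 21.95, 20.49, 21.81, 25)

# Hypothetical helper: Tukey-style box statistics built on type = 6 quartiles.
tukey_stats6 <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), type = 6, names = FALSE)
  step <- 1.5 * (q[3] - q[1])
  is_out <- x < q[1] - step | x > q[3] + step
  inside <- x[!is_out]
  # Whiskers end at the most extreme data points inside the fences.
  list(stats = c(min(inside), q, max(inside)), out = x[is_out])
}

samples <- list(a1 = a1, a2 = a2, a3 = a3)
per_sample <- lapply(samples, tukey_stats6)
out_list <- lapply(per_sample, function(s) s$out)

# bxp() draws a boxplot from precomputed statistics.
z <- list(
  stats = sapply(per_sample, function(s) s$stats),  # one column per sample
  n     = lengths(samples),
  out   = unlist(out_list, use.names = FALSE),
  group = rep(seq_along(out_list), lengths(out_list)),
  names = names(samples)
)
bxp(z, ylab = "Valor", ylim = c(20, 28))
```

For these three samples no value falls outside the fences, so the result matches the plot above, but the same code would also show outliers for data that has them.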

And the quantiles are

> quantile(a1,type=6)
    0%    25%    50%    75%   100% 
22.020 23.665 25.000 25.695 26.670 
> quantile(a2,type=6)
    0%    25%    50%    75%   100% 
21.490 22.615 23.790 24.540 25.000 
> quantile(a3,type=6)
   0%   25%   50%   75%  100% 
20.33 21.08 21.95 23.56 25.00

These are the same values that GraphPad reports.
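For anyone wondering where the difference comes from: both types interpolate linearly between sorted values and only differ in the position they interpolate at. For n sorted values and probability p, type = 6 uses position (n + 1) * p, while type = 7 uses (n - 1) * p + 1. A minimal sketch that checks this against quantile() (the helpers interp_at() and manual_quantile() are made up for illustration and only handle these two types):

```r
# Hypothetical helper: linear interpolation at fractional position h
# in an already sorted sample.
interp_at <- function(sorted_x, h) {
  n <- length(sorted_x)
  h <- min(max(h, 1), n)   # clamp to valid positions 1..n
  lo <- floor(h)
  hi <- ceiling(h)
  sorted_x[lo] + (h - lo) * (sorted_x[hi] - sorted_x[lo])
}

# Hypothetical helper: only types 6 and 7 are handled.
manual_quantile <- function(x, p, type) {
  s <- sort(x)
  n <- length(s)
  h <- if (type == 6) (n + 1) * p else (n - 1) * p + 1
  interp_at(s, h)
}

a3 <- c(20.33, 21.67, 24.67, 22.45, 22.29, 21.95, 20.49, 21.81, 25)
probs <- c(0.25, 0.5, 0.75)

manual6 <- sapply(probs, function(p) manual_quantile(a3, p, type = 6))
manual7 <- sapply(probs, function(p) manual_quantile(a3, p, type = 7))

# Both should agree with R's own implementation.
stopifnot(isTRUE(all.equal(manual6, quantile(a3, probs, type = 6, names = FALSE))))
stopifnot(isTRUE(all.equal(manual7, quantile(a3, probs, type = 7, names = FALSE))))

print(manual6)
print(manual7)
```

With n = 9, the 25% quantile sits at position 2.5 for type = 6 (halfway between the 2nd and 3rd sorted values) but at position 3 for type = 7 (exactly the 3rd value), which is why the quartiles of a3 differ while the median does not.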

Upvotes: 2
