Andrew C. Kee

Reputation: 79

boxplot for only the outliers

Greeting

I would like to plot only the outliers of a boxplot. My solution below works, but it is neither efficient nor elegant. Is there a package or a better way to do this? As you can see, I call boxplot twice, which will be slow if my dataset is very large.

Thanks

set.seed(1501)
y <- c(4, 0, 7, -5, rnorm(16))
x1 <- c("a", "a", "b", "b", sample(letters[1:5], 16, TRUE))

lab_y <- sample(letters, 20)

datxx <- as.matrix(cbind(y,x1,lab_y))

boxplot_outlier <- function(dat) {
  # First boxplot call: only used to find the outliers (bx$out)
  bx <- boxplot(as.numeric(dat[,"y"]) ~ dat[,"x1"])

  # Look up the label of every outlier
  out_label <- c()
  for (i in seq_along(bx$out)) {
    out_label[i] <- dat[which(dat[,"y"]==bx$out[i]),"lab_y"]
  }

  # Look up the group of every outlier
  out_g <- c()
  for (i in seq_along(bx$out)) {
    out_g[i] <- dat[which(dat[,"y"]==bx$out[i]),"x1"]
  }

  # Look up the value of every outlier
  out_y <- c()
  for (i in seq_along(bx$out)) {
    out_y[i] <- dat[which(dat[,"y"]==bx$out[i]),"y"]
  }

  out_all <- cbind(out_y,out_g,out_label)
  out_all <- as.matrix(out_all)

  out_g <- as.matrix(out_g)
  colnames(out_g)[1] <- "x1"

  # Unique groups that contain at least one outlier
  out_g_x <- out_g[which(!duplicated(out_g[,"x1"]))]
  out_g_x <- as.matrix(out_g_x)
  colnames(out_g_x)[1] <- "x1"

  # Keep only the rows that belong to those groups
  datsub <- merge(dat,out_g_x,by=c("x1"))
  datsub <- as.matrix(datsub)

  # Second boxplot call: draws the groups that have outliers
  bx2 <- boxplot(as.numeric(datsub[,"y"]) ~ datsub[,"x1"],data=datsub)

  # Map each outlier to the x position of its group and add the labels
  mynum <- cbind(as.numeric(c(1:nrow(out_g_x))),out_g_x)
  mynumxx <- merge(x=out_g,y=mynum,by=c("x1"))
  colnames(mynumxx)[2] <- "v1"
  text(as.numeric(mynumxx[,"v1"])+0.2,as.numeric(out_all[,"out_y"]),out_all[,"out_label"])
}

boxplot_outlier(datxx)

Upvotes: 1

Views: 102

Answers (1)

James

Reputation: 66874

You could use ggplot2 and set the box and whisker lines to a fully transparent colour, so that only the outliers remain visible. Note that you have to put your data into a data.frame for this, which is better anyway: in a matrix, y is converted to character along with the other variables.

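library(ggplot2)

dat <- data.frame(y,x1,lab_y)

# outlier.colour is set explicitly because recent ggplot2 versions let
# outliers inherit the (here transparent) box colour by default
ggplot(dat, aes(x=x1,y=y)) + geom_boxplot(fill="#00000000",colour="#00000000",outlier.colour="black")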
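
If you would rather stay in base R and avoid drawing a boxplot at all, something along these lines might work. It is only a minimal sketch: `plot_outliers` is a hypothetical helper name, it assumes a data.frame with the columns `y`, `x1` and `lab_y` from the question, and it finds the outliers per group with `boxplot.stats`, which applies the same default 1.5 * IQR rule as `boxplot`. Unlike the code in the question, it shows only the points and labels, with no boxes.

```r
# Example data from the question, kept in a data.frame so y stays numeric
set.seed(1501)
y <- c(4, 0, 7, -5, rnorm(16))
x1 <- c("a", "a", "b", "b", sample(letters[1:5], 16, TRUE))
lab_y <- sample(letters, 20)
dat <- data.frame(y, x1, lab_y, stringsAsFactors = FALSE)

# Hypothetical helper: plot only the labelled outliers of y within each x1 group
plot_outliers <- function(dat) {
  # TRUE for values beyond the whiskers of their own group
  is_outlier <- function(v) v %in% boxplot.stats(v)$out
  flag <- ave(dat$y, dat$x1, FUN = is_outlier) == 1

  out <- dat[flag, , drop = FALSE]
  if (nrow(out) == 0) {
    message("No outliers found")
    return(invisible(out))
  }

  # Only groups that actually contain outliers get a position on the x axis
  out$x1 <- factor(out$x1)
  pos <- as.integer(out$x1)

  plot(pos, out$y, xaxt = "n", xlab = "x1", ylab = "y",
       xlim = c(0.5, nlevels(out$x1) + 0.5))
  axis(1, at = seq_len(nlevels(out$x1)), labels = levels(out$x1))
  text(pos + 0.2, out$y, as.character(out$lab_y))

  invisible(out)
}

out <- plot_outliers(dat)
```

The outliers are computed once per group and nothing else is drawn, so there is no second `boxplot` call and none of the lookup loops. The helper returns the outlier rows invisibly in case you need them afterwards.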

Upvotes: 2
