Reputation: 169
I have generated a box plot for a large dataset to show the impact of genotype on splicing ratios. The plot contains many outliers, which squeeze the boxes. I can hide the outliers using (outlier.colour = NA), but when I try to reset the y limits using scale_y_continuous(limits = c(lower, upper)), the whole plot changes. Can someone please help me change the height of the boxplots so that I can see the differences clearly?
This post is relevant, but I wasn't able to fix my problem with it:
Ignore outliers in ggplot2 boxplot
I have used this code to plot:
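library(ggplot2)
# Read the tab-separated table of splicing ratios
Trans <- read.delim("EXAMPLES/AT1G04170_SR_2", header = TRUE, sep = "\t")
# Box plot of Ratio per Genotype, filled by isoform, with outlier points hidden.
# The `+` must end the line; a line starting with `+` is not attached to the plot.
Trans_1 <- ggplot(data = Trans, mapping = aes(x = Genotype, y = Ratio, fill = Isoforms)) +
  geom_boxplot(outlier.colour = NA)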
Data
sample Isoforms Ratio Genotype
108 AT1G04170_JC4 0.114555061397559 CC
139 AT1G04170_JC4 1.43188141139633E-07 CC
159 AT1G04170_JC4 0.974829214147311 CT
108 AT1G04170_P1 0.885444938602441 CC
139 AT1G04170_P1 0.980915433730349 CC
159 AT1G04170_P1 0.025170785852689 CT
108 AT1G04170_P2 0 CC
139 AT1G04170_P2 0 CC
159 AT1G04170_P2 0 CT
108 AT1G04170_c1 0 CC
139 AT1G04170_c1 0.01908442308151 CC
159 AT1G04170_c1 0 CT
I want the boxplots inside the ggplot to be less squeezed so that I can see the colors properly.
current image: https://ibb.co/S3gS2KR
Upvotes: 0
Views: 2657
Reputation: 502
First - Not sure why you're seeing so many outliers. When I run your code, I see none.
Second - It's not an outlier issue but rather a scaling issue. That is, your sample variability is small compared to your y minimum and maximum. You can make the graph bigger. If you're using R Markdown in RStudio, you can do this in the code chunk header, like so:
```{r, fig.height=8}
ggplot(data=Trans,mapping=aes(x=Genotype,y=Ratio,fill=Isoforms)) +
geom_boxplot(outlier.colour = NA)
```
Third - You won't be able to make the 5 boxes on the right bigger because all of those values are the same, i.e. all values = 0.
EDIT: Looking at your data more closely, those values aren't all 0, but it goes back to the underlying issue: the values are very close together compared to your y-min and y-max.
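On the `scale_y_continuous(limits = ...)` problem from the question: that call drops every point outside the limits before the box statistics are computed, which is likely why the boxes themselves changed. `coord_cartesian(ylim = ...)` only zooms the view and leaves the statistics alone. Below is a minimal sketch of that idea, not a tested fix for the full dataset. It assumes only the 12 sample rows posted in the question, and it approximates the whisker range with `boxplot.stats()`, which can differ slightly from the quantiles ggplot2 uses. The names `whiskers` and `y_range` are made up for illustration.

```r
library(ggplot2)

# Sample rows copied from the question; the full file is not reproduced here.
Trans <- data.frame(
  sample   = rep(c(108, 139, 159), times = 4),
  Isoforms = rep(c("AT1G04170_JC4", "AT1G04170_P1",
                   "AT1G04170_P2", "AT1G04170_c1"), each = 3),
  Ratio    = c(0.114555061397559, 1.43188141139633e-07, 0.974829214147311,
               0.885444938602441, 0.980915433730349, 0.025170785852689,
               0, 0, 0,
               0, 0.01908442308151, 0),
  Genotype = rep(c("CC", "CC", "CT"), times = 4)
)

# Approximate lower/upper whisker ends for each Genotype x Isoforms box.
whiskers <- tapply(
  Trans$Ratio,
  interaction(Trans$Genotype, Trans$Isoforms, drop = TRUE),
  function(v) boxplot.stats(v)$stats[c(1, 5)]
)

# Overall range covered by the whiskers of all boxes.
y_range <- range(unlist(whiskers))

# Hide outlier points and zoom to the whisker range without dropping any data.
Trans_1 <- ggplot(Trans, aes(x = Genotype, y = Ratio, fill = Isoforms)) +
  geom_boxplot(outlier.shape = NA) +
  coord_cartesian(ylim = y_range)
```

With only the sample rows the zoom makes little difference, since the whiskers already span almost the whole 0 to 1 range. On the full data, with many points beyond the whiskers, the same call should give the boxes more room.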
Upvotes: 0