user248237

scatter plot of same variable across different conditions with ggplot facet_grid?

I'd like to plot the same column of a data frame against itself for different groups of rows. For example, in the iris data frame, I'd like to make three scatter plots comparing Petal.Length of virginica with that of versicolor, setosa with virginica, and versicolor with setosa. I want the result to look just like a normal facet_grid or facet_wrap plot. For example, I can do:

ggplot(iris) + geom_point(aes(x=Petal.Length, y=Petal.Length)) + facet_grid(~Species)

This is not what I want, since it plots Petal.Length of each species against itself. I want the plot to look like this, except that I hand-code which species is compared with which. How can this be done in ggplot? Thanks.

Upvotes: 3

Views: 9530

Answers (2)

bdemarest

Reputation: 14667

Your question seems to be about comparing a single variable measured on many individuals that fall into multiple categories. Given your example using the iris dataset, a scatterplot is probably not a useful visualization.

Here I offer several univariate visualizations available in ggplot2. I hope one of these is helpful:

library(ggplot2)

plot_1 = ggplot(iris, aes(x=Petal.Length, colour=Species)) +
         geom_density() +
         labs(title="Density plots")

plot_2 = ggplot(iris, aes(x=Petal.Length, fill=Species)) +
         geom_histogram(colour="grey30", binwidth=0.15) +
         facet_grid(Species ~ .) +
         labs(title="Histograms")

plot_3 = ggplot(iris, aes(y=Petal.Length, x=Species)) +
         geom_point(aes(colour=Species),
                    position=position_jitter(width=0.05, height=0.05)) +
         geom_boxplot(fill=NA, outlier.colour=NA) +
         labs(title="Boxplots")

plot_4 = ggplot(iris, aes(y=Petal.Length, x=Species, fill=Species)) +
         geom_dotplot(binaxis="y", stackdir="center", binwidth=0.15) +
         labs(title="Dot plots")

library(gridExtra)
part_1 = arrangeGrob(plot_1, plot_2, heights=c(0.4, 0.6))
part_2 = arrangeGrob(plot_3, plot_4, nrow=2)
parts_12 = arrangeGrob(part_1, part_2, ncol=2, widths=c(0.6, 0.4))
ggsave(filename="plots.png", plot=parts_12, height=6, width=10, units="in")

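The code above builds the layout with arrangeGrob() and writes it to a file. To view a combined layout on screen instead, a minimal sketch (assuming gridExtra is installed; `p_density` and `p_box` are illustrative names, not from the answer) might look like this:

```r
library(ggplot2)
library(gridExtra)

# Two simplified versions of the plots above (names are illustrative)
p_density <- ggplot(iris, aes(x = Petal.Length, colour = Species)) +
  geom_density()
p_box <- ggplot(iris, aes(x = Species, y = Petal.Length)) +
  geom_boxplot()

# arrangeGrob() only builds the layout; grid.arrange() builds it and draws it
grid.arrange(p_density, p_box, ncol = 2)
```

Nesting arrangeGrob() calls, as in the answer, is still the way to get unequal heights and widths; grid.arrange() accepts the same layout arguments.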

Upvotes: 12

Arun

Reputation: 118799

It is better to group the data first. I'd do something like this:

# Get Petal.Length for each species separately
df1 <- subset(iris, Species == "virginica", select = c(Petal.Length, Species))
df2 <- subset(iris, Species == "versicolor", select = c(Petal.Length, Species))
df3 <- subset(iris, Species == "setosa", select = c(Petal.Length, Species))

# Construct species 1 vs 2, 2 vs 3 and 3 vs 1 data.
# Values are paired by row order; each species has 50 rows.
df <- data.frame(
  x = c(df1$Petal.Length, df2$Petal.Length, df3$Petal.Length),
  y = c(df2$Petal.Length, df3$Petal.Length, df1$Petal.Length),
  grp = rep(c("virginica.versicolor", "versicolor.setosa", "setosa.virginica"), each = 50)
)
df$grp <- factor(df$grp)

# Plot one facet per species pair
library(ggplot2)
ggplot(data = df, aes(x = x, y = y)) +
  geom_point(aes(colour = grp)) +
  facet_wrap(~ grp)

This results in a plot with three panels, one for each species pair.
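The same idea can be written without hard-coding the species names or the group size. The sketch below is one possible generalisation, not part of the original answer: `pair_groups` is a hypothetical helper that uses split() and combn() to build every pair of groups. Like the code above, it pairs values by row order, which is arbitrary here because the flowers of different species are not matched observations.

```r
# Hypothetical helper (not from the answer): build x/y pairs for every
# combination of two levels of a grouping column.
pair_groups <- function(data, value, group) {
  groups <- split(data[[value]], data[[group]])
  pairs <- combn(names(groups), 2, simplify = FALSE)
  do.call(rbind, lapply(pairs, function(p) {
    # Truncate to the shorter group so x and y have equal length
    n <- min(length(groups[[p[1]]]), length(groups[[p[2]]]))
    data.frame(
      x = groups[[p[1]]][seq_len(n)],
      y = groups[[p[2]]][seq_len(n)],
      grp = paste(p[1], p[2], sep = ".")
    )
  }))
}

pairs_df <- pair_groups(iris, "Petal.Length", "Species")

# Plot one facet per pair (skipped if ggplot2 is not installed)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  print(
    ggplot(pairs_df, aes(x = x, y = y)) +
      geom_point(aes(colour = grp)) +
      facet_wrap(~ grp)
  )
}
```

combn() returns unordered pairs, so which species lands on which axis follows the factor level order rather than the hand-picked order in the answer. To control the axes explicitly, pass your own list of name pairs in place of the combn() result.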

Upvotes: 5
