Adrian

Reputation: 9793

R: How to visualize large and clumped scatter plot

 status = sample(c(0, 1), 500, replace = TRUE)
 value = rnorm(500)

 plot(value)
 smoothScatter(value)

I'm trying to make a scatterplot of value, but when I just plot it, the points are all clumped together and the result isn't very presentable. I've tried smoothScatter(), which makes the plot look a bit nicer, but is there a way to color-code the values by their corresponding status?

I am trying to see whether there's a relationship between status and value. I've already tried a boxplot; can the smoothScatter() plot be improved, or are there other good ways to visualize this?

Upvotes: 1

Views: 874

Answers (1)

JasonAizkalns

Reputation: 20463

I'm assuming you meant to write plot(status, value) in your example? Either way, status and value are generated independently here, so the two groups won't look much different, but the examples that follow should give you an idea of what to try.
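A quick numeric check can go alongside the plots. This is only a sketch using the simulated data from the question; the set.seed() call and the names group_means and group_sds are my own additions for illustration, not part of the original example.

```r
# Recreate the question's simulated data (seed added only for reproducibility)
set.seed(1)
status <- sample(c(0, 1), 500, replace = TRUE)
value  <- rnorm(500)

# Per-status mean and standard deviation of value
group_means <- tapply(value, status, mean)
group_sds   <- tapply(value, status, sd)

# One column per status level, one row per statistic
print(rbind(mean = group_means, sd = group_sds))
```

If the two columns look alike, the plots are unlikely to show much separation between the groups either.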

Have you looked into jitter?

Some basics:

plot(jitter(status), value)

Simple Jitter

or perhaps plot(jitter(status, 0.5), value)

Tighter Jitter Plot
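On the color-coding part of the question, one possible base R approach is to pass a vector of colors to plot(). The sketch below assumes the simulated data from the question; the seed, the status_cols palette, and the point_cols name are hypothetical choices of mine.

```r
# Recreate the question's simulated data (seed added only for reproducibility)
set.seed(1)
status <- sample(c(0, 1), 500, replace = TRUE)
value  <- rnorm(500)

# Hypothetical palette: one color per status level
status_cols <- c("0" = "steelblue", "1" = "firebrick")

# Look up each point's color by its status
point_cols <- status_cols[as.character(status)]

# Plot value against its index, colored by status
plot(value, col = point_cols, pch = 16,
     xlab = "Index", ylab = "value")
legend("topright", legend = names(status_cols),
       col = status_cols, pch = 16, title = "status")
```

The same col = point_cols argument should also work with the jittered plots above, for example plot(jitter(status), value, col = point_cols).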

For something fancier, with the ggplot2 package you could do:

library(ggplot2)
df <- data.frame(value, status)
ggplot(data=df, aes(jitter(status, 0.10), value)) + 
  geom_point(alpha = 0.5)

ggplot 01

or this...

ggplot(data=df, aes(factor(status), value)) +
  geom_violin()

ggplot 02

or...

ggplot(data=df, aes(x=status, y=value)) +
  geom_density2d() + 
  scale_x_continuous(limits=c(-1,2))

ggplot 03

or...

ggplot(data=df, aes(x=status, y=value)) +
  geom_density2d() +
  # after_stat(density) replaces the deprecated ..density.. notation (ggplot2 >= 3.3.0)
  stat_density2d(geom="tile", aes(fill = after_stat(density)), contour=FALSE) +
  scale_x_continuous(limits=c(-1,2))

ggplot 04

or even this...

ggplot(data=df, aes(x=value, fill=factor(status))) +
  geom_density(alpha=0.2)

ggplot 05

Upvotes: 1
