ashids

Reputation: 1

Data point missing/overlapping in geom_point ggplot2

I have around 900 data points in my dataset, but the plot appears to show fewer than 100. Is that because the points overlap, or is there some other reason? I am not sure.

This is my plot:

Plot: the data points appear to number fewer than a hundred

My code:

ggplot(data, aes(x = as.numeric(`x1`), y=`x2`, color=`x3`)) +
  geom_point() +
  scale_x_continuous(breaks = seq(0,135,15))

Upvotes: 0

Views: 722

Answers (1)

r2evans

Reputation: 160792

Two techniques to deal with overlapping/coincident data:

  1. Jitter the points so that most overlapping points shift a little;
  2. Apply alpha to the color so that darker points indicate more-frequent data.

Data

set.seed(42)
dat <- data.frame(
  x = round(rnorm(100), 0),
  y = round(rnorm(100), 0)
)
head(dat)
#    x  y
# 1  1  1
# 2 -1  1
# 3  0 -1
# 4  1  2
# 5  0 -1
# 6  0  0

xtabs(~ x + y, data=dat)
#     y
# x    -2 -1  0  1  2  3
#   -3  0  1  0  0  0  1
#   -2  1  3  1  0  0  0
#   -1  1  1 11  7  1  0
#   0   1 13 13  8  1  0
#   1   2  6 17  4  1  0
#   2   0  0  5  1  0  0
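To quantify the overlap directly, one can compare the number of rows with the number of distinct coordinate pairs. A minimal sketch on the same `dat` as above; the names `n_total` and `n_distinct` are only illustrative:

```r
set.seed(42)
dat <- data.frame(
  x = round(rnorm(100), 0),
  y = round(rnorm(100), 0)
)

n_total    <- nrow(dat)                       # rows in the data
n_distinct <- nrow(unique(dat[c("x", "y")]))  # distinct (x, y) positions

n_total
n_distinct

# rows that share a position with an earlier row, i.e. hidden behind another point
sum(duplicated(dat[c("x", "y")]))
```

If `n_distinct` is much smaller than `n_total`, overplotting is the likely reason the plot looks sparse.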

The problem

library(ggplot2)
ggplot(dat, aes(x, y)) + geom_point()

ggplot problem overlaps

Transparency (alpha)

ggplot(dat, aes(x, y)) +
  geom_point(color = "#00000022")

ggplot with alpha
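The same effect can probably be had with the `alpha` argument rather than an alpha byte in the hex color, which may be easier to tune. A sketch, assuming an alpha of 0.13 as a rough equivalent of the `22` byte (34/255); the name `p_alpha` is illustrative:

```r
library(ggplot2)

set.seed(42)
dat <- data.frame(x = round(rnorm(100), 0), y = round(rnorm(100), 0))

# alpha = 0.13 is roughly the "22" (34/255) alpha byte in "#00000022"
p_alpha <- ggplot(dat, aes(x, y)) +
  geom_point(color = "black", alpha = 0.13)
p_alpha
```

Passing `alpha` separately also leaves `color` free to be mapped to a variable, as in the question's plot.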

Jittering

ggplot(dat, aes(x, y)) +
  # geom_jitter() draws the points itself; adding geom_point() would plot each one twice
  geom_jitter()

ggplot, too much jitter

That might be too much, so we can adjust how much things shift around.

ggplot(dat, aes(x, y)) +
  # width/height set the maximum shift in each direction, in data units
  geom_jitter(width = 0.1, height = 0.1)

ggplot, less jitter
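Jitter is random, so the points land in slightly different places each time the plot is drawn. If a repeatable figure matters, one option is `position_jitter()` with its `seed` argument (available in recent ggplot2 versions). A sketch; the seed value and the name `p_jit` are arbitrary:

```r
library(ggplot2)

set.seed(42)
dat <- data.frame(x = round(rnorm(100), 0), y = round(rnorm(100), 0))

# position_jitter() takes the same width/height, plus a seed for repeatable noise
p_jit <- ggplot(dat, aes(x, y)) +
  geom_point(position = position_jitter(width = 0.1, height = 0.1, seed = 1))
p_jit
```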

Both alpha and jitter

Not strictly required here, but it might be useful to do both:

ggplot(dat, aes(x, y)) +
  # apply the transparent color to the jittered points themselves
  geom_jitter(width = 0.1, height = 0.1, color = "#00000022")

ggplot, both jitter and alpha
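Applied to the plot in the question, this might look like the sketch below. The question's `data` is not shown, so the data frame here is a made-up stand-in with the same column names, and the jitter amounts and alpha are guesses that would need tuning against the real values:

```r
library(ggplot2)

# Hypothetical stand-in for the question's `data`; the real columns are not shown
set.seed(1)
data <- data.frame(
  x1 = as.character(sample(seq(0, 135, 15), 900, replace = TRUE)),
  x2 = sample(1:10, 900, replace = TRUE),
  x3 = sample(c("a", "b", "c"), 900, replace = TRUE)
)

p_q <- ggplot(data, aes(x = as.numeric(x1), y = x2, color = x3)) +
  # alpha as an argument keeps the color mapping from x3 intact
  geom_jitter(width = 2, height = 0.2, alpha = 0.4) +
  scale_x_continuous(breaks = seq(0, 135, 15))
p_q
```

Here `alpha` is used instead of a hex color because `color` is already mapped to `x3`.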

Upvotes: 1
