Roe

Reputation: 23

Is there a way to use R to break chart axis and break linear regression line?

I'm trying to figure out how to modify a scatter-plot that contains two groups of data along a continuum separated by a large gap. The graph needs a break on the x-axis as well as on the regression line.

This R code using the ggplot2 library accurately presents the data, but is unsightly due to the vast amount of empty space on the graph. Pearson's correlation is -0.1380438.

library(ggplot2)
p <- ggplot(mapping = aes(x = dis, y = result[, 1])) +
  geom_point(shape = 1) +
  xlab("X-axis") +
  ylab("Y-axis") +
  geom_smooth(color = "red", method = "lm", se = FALSE) +
  theme_classic()
p + theme(plot.title = element_text(hjust = 0.5, size = 14))


This R code uses gap.plot from the plotrix package to produce the breaks needed, but the regression line doesn't contain a break and doesn't reflect the slope properly. The slope of the regression line isn't as steep as in the ggplot2 graph, and there needs to be a visible break in the line between the two disparate groups.

library(plotrix)
gap.plot(
  x = dis,
  y = result[, 1],
  gap = c(700, 4700),
  gap.axis = "x",
  xlab = "X-Axis",
  ylab = "Y-Axis",
  xtics = seq(0, 5575, by = 200)
)
abline(v = seq(700, 733) , col = "white") 
abline(lm(result[, 1] ~ dis), col = "red", lwd = 2)
axis.break(1, 716, style = "slash")               


Using MS Paint, I created an approximation of what the graph should look like. Notice the break marks on the top as well as the discontinuity in the regression line between the two groups.


Upvotes: 2

Views: 652

Answers (1)

A. S. K.

Reputation: 2816

One solution is to plot the regression line in two pieces, using ablineclip to limit what's plotted each time. (Similar to @tung's suggestion, although it's clear that you want the appearance of a single graph rather than the appearance of facets.) Here's how that would work:

library(plotrix)

# Simulate some data that looks roughly like the original graph.
dis = c(rnorm(100, 300, 50), rnorm(100, 5000, 100))
result = c(rnorm(100, 0.6, 0.1), rnorm(100, 0.5, 0.1))

# Store the location of the gap so we can refer to it later.
x.axis.gap = c(700, 4700)

# gap.plot() works internally by shifting the location of the points to be
# plotted based on the gap size/location, and then adjusting the axis labels
# accordingly.  We'll re-compute the second half of the regression line in the
# same way; these are the new values for the x-axis.
dis.alt = dis - x.axis.gap[1]

# Plot (same as before).
gap.plot(
  x = dis,
  y = result,
  gap = x.axis.gap,
  gap.axis = "x",
  xlab = "X-Axis",
  ylab = "Y-Axis",
  xtics = seq(0, 5575, by = 200)
)
abline(v = seq(700, 733), col = "white")
axis.break(1, 716, style = "slash")

# Add regression line in two pieces: from 0 to the start of the gap, and from
# the end of the gap to infinity.
ablineclip(lm(result ~ dis), col = "red", lwd = 2, x2 = x.axis.gap[1])
ablineclip(lm(result ~ dis.alt), col = "red", lwd = 2, x1 = x.axis.gap[1] + 33)
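
To make the coordinate shift explicit, here is a minimal base-R sketch that compresses the x-axis by hand instead of going through gap.plot(). It is an illustration under stated assumptions, not a description of how plotrix works internally: the helper squash() and the variable gap.pad are hypothetical names, the data are simulated, and the 100-unit width of the visual break is an arbitrary choice. A single regression is fitted to all of the data and drawn as two segments, one on each side of the gap.

```r
# Simulated data, assumed to resemble the two clusters in the question.
set.seed(1)
dis    <- c(rnorm(100, 300, 50), rnorm(100, 5000, 100))
result <- c(rnorm(100, 0.6, 0.1), rnorm(100, 0.5, 0.1))

gap     <- c(700, 4700)  # range of x values to cut out of the axis
gap.pad <- 100           # hypothetical width of the visual break, in x units

# Map data x values to plotting positions: values right of the gap are pulled
# left so that only gap.pad units of empty space remain between the clusters.
squash <- function(x) ifelse(x > gap[1], x - (diff(gap) - gap.pad), x)

# One fit over all the data, in the original (unshifted) units.
fit <- lm(result ~ dis)

# Endpoints of the two visible pieces of the line, in original units.
left    <- c(0, gap[1])
right   <- c(gap[2], max(dis))
y.left  <- predict(fit, data.frame(dis = left))
y.right <- predict(fit, data.frame(dis = right))

# Scatter plot on the compressed axis, labelled with the original values.
plot(squash(dis), result, xaxt = "n", xlab = "X-Axis", ylab = "Y-Axis")
ticks <- c(seq(0, gap[1], by = 100), seq(gap[2], 5400, by = 100))
axis(1, at = squash(ticks), labels = ticks)

# Draw the fitted line as two segments, leaving the break empty.
segments(squash(left[1]),  y.left[1],  squash(left[2]),  y.left[2],
         col = "red", lwd = 2)
segments(squash(right[1]), y.right[1], squash(right[2]), y.right[2],
         col = "red", lwd = 2)
```

Because the predictions are computed in the original units and only the plotting positions are shifted, both segments belong to the same fitted line. The vertical jump between them shows how much the fit changes across the omitted range.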

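For comparison, the faceted alternative mentioned at the top of this answer might look roughly like the ggplot2 sketch below. This is an assumption about what a facet-based version could be, not a reproduction of @tung's suggestion. The data are simulated, and the group column and its "near"/"far" labels are hypothetical. Unlike the code above, geom_smooth() here fits a separate regression within each panel, so the two line pieces can have different slopes.

```r
library(ggplot2)

# Simulated data, assumed to resemble the two clusters in the question.
set.seed(1)
df <- data.frame(
  dis    = c(rnorm(100, 300, 50), rnorm(100, 5000, 100)),
  result = c(rnorm(100, 0.6, 0.1), rnorm(100, 0.5, 0.1))
)

# Label each point by the side of the gap it falls on (hypothetical labels).
df$group <- factor(ifelse(df$dis < 700, "near", "far"),
                   levels = c("near", "far"))

p <- ggplot(df, aes(x = dis, y = result)) +
  geom_point(shape = 1) +
  # One least-squares fit per panel, not a single fit over all the data.
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, color = "red") +
  # Free x scales let each panel zoom in on its own cluster.
  facet_grid(. ~ group, scales = "free_x", space = "free_x") +
  labs(x = "X-axis", y = "Y-axis") +
  theme_classic() +
  # Hide the panel labels so the result reads more like one broken axis.
  theme(strip.text = element_blank(), strip.background = element_blank())
p
```

The result still shows two panels side by side, so it does not give the appearance of a single graph with axis break marks.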

Upvotes: 3
