Blundering Ecologist

Reputation: 1315

Connecting points within nested group ggplot2

I am trying to connect multiple points to a single point within each group, where the groups are defined by a single variable. My question is similar to this OP, but that question involves multiple points rather than a single one.

Here is a dataframe to illustrate the type of data I am working with:

A <- data.frame(
  Stage = c("Juvenile", "Juvenile", "Yearling",
            "Juvenile", "Juvenile", "Yearling",
            "Juvenile", "Juvenile", "Yearling",
            "Juvenile", "Juvenile", "Yearling"),
  Individual = c("A", "A", "A",
                 "B", "B", "B",
                 "C", "C", "C",
                 "D", "D", "D"),
  Score = c( 1.4,  1.2,   NA,
             0.4,  0.6,  0.5,
            -0.3, -0.5, -0.4,
            -1.4, -1.2,   NA))

The closest graph I have been able to get is with this code (only showing barebones code for simplicity):

library(ggplot2)

ggplot(A, aes(x = Stage, y = Score, color = Individual, group = Individual)) +
  geom_point() +
  geom_line(aes(group = Individual)) +
  geom_smooth(aes(x = Stage),
              method = lm, se = FALSE, fullrange = TRUE, color = "black")


I instead need something more like this (hand drawn):

[Image: hand-drawn sketch of the desired plot]

How do I:

  1. Only connect the points in the Juvenile column with the single point in the Yearling column (when there is a point there)?
  2. Not connect the points within the Juvenile column to each other within Individual?

Upvotes: 0

Views: 102

Answers (2)

jay.sf

Reputation: 72848

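Here is a base R solution using points(), arrows(), and lines(). The fit line needs an if () guard to skip individuals whose scores include NA. Everything runs inside one sapply() call, once per individual. Note that the code uses the lowercase column names (stage, individual, score) from the Data block below, not the capitalized names in the question.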

plot(1:3, xaxt="n", xlab="Stage", xlim=c(.5, 2.5),
     ylab="Score", yaxt="n",
     ylim=c(-1.5, 1.5), type="n")
sapply(unique(A$individual), function(x) {
  points(A$stage[A$individual == x], 
         A$score[A$individual == x], col=x, pch=16)
  arrows(1, A$score[A$individual == x & A$stage == "Juvenile"], 
         2, A$score[A$individual == x & A$stage == "Yearling"], col=x, code=0)
  if (!any(is.na(A$score[A$individual == x]))) {
    fit <- lm(score ~ as.numeric(stage), A[A$individual == x, ])
    X <- c(1, 2)
    Yhat <- predict(fit, newdata=data.frame(stage=X))
    lines(X, Yhat, col=x)
  }
  })
axis(1, 1:2, unique(A$stage))
axis(2, (-2):1)
legend("bottomright", legend=levels(A$individual), lty=1, col=1:4)

Produces

[Image: resulting plot]

Data

A <- structure(list(stage = structure(c(1L, 1L, 2L, 1L, 1L, 2L, 1L, 
1L, 2L, 1L, 1L, 2L), .Label = c("Juvenile", "Yearling"), class = "factor"), 
    individual = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 
    3L, 4L, 4L, 4L), .Label = c("A", "B", "C", "D"), class = "factor"), 
    score = c(1.4, 1.2, NA, 0.4, 0.6, 0.5, -0.3, -0.5, -0.4, 
    -1.4, -1.2, NA)), class = "data.frame", row.names = c(NA, 
-12L))
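
The if () guard in the plotting code can be checked on its own. The following is a minimal sketch, not part of the original answer: it rebuilds the same values with factor() and rep() for brevity (assumed equivalent to the Data block above), reports which individuals pass the NA check, and fits one complete individual the same way the answer does. The names has_all_scores, fit_B, and pred_B are made up for illustration.

```r
# Same values as the Data block above, built with factor() for brevity
A <- data.frame(
  stage = factor(rep(c("Juvenile", "Juvenile", "Yearling"), times = 4)),
  individual = factor(rep(c("A", "B", "C", "D"), each = 3)),
  score = c(1.4, 1.2, NA, 0.4, 0.6, 0.5, -0.3, -0.5, -0.4, -1.4, -1.2, NA)
)

# TRUE where every score of an individual is present, i.e. where the
# if () condition in the plotting code lets lm() run
has_all_scores <- tapply(A$score, A$individual, function(s) !anyNA(s))
print(has_all_scores)

# Fit for one complete individual, mirroring the lm() call in the answer
fit_B <- lm(score ~ as.numeric(stage), A[A$individual == "B", ])

# Fitted values at the two x positions (1 = Juvenile, 2 = Yearling)
pred_B <- predict(fit_B, newdata = data.frame(stage = c(1, 2)))
print(pred_B)
```

Individuals A and D have no Yearling score, so they fail the check and get points but no fit line.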

Upvotes: 0

Jon Spring

Reputation: 66490

Here's an approach using a separate prepared table for the connections:

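library(dplyr)
library(ggplot2)

# One row per Juvenile point, paired with that individual's Yearling score
A_connections <- A %>%
  filter(Stage == "Juvenile") %>%
  left_join(A %>% filter(Stage == "Yearling") %>% select(Individual, Y_Score = Score),
            by = "Individual")

ggplot(A, aes(x = Stage, y = Score, color = Individual, group = Individual)) +
  geom_point() +
  # Draw a segment from each Juvenile point to the Yearling score
  geom_segment(data = A_connections, aes(xend = "Yearling", yend = Y_Score)) +
  geom_smooth(method = lm, se = FALSE, fullrange = TRUE)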
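
To see what the connections table holds, the sketch below rebuilds it from the question's data. It is an illustration rather than part of the answer: the intermediate table name yearling, the explicit by = "Individual", and the final filter() that drops individuals without a Yearling score are all assumptions added here.

```r
library(dplyr)

# Data from the question, written compactly with rep()
A <- data.frame(
  Stage = rep(c("Juvenile", "Juvenile", "Yearling"), times = 4),
  Individual = rep(c("A", "B", "C", "D"), each = 3),
  Score = c(1.4, 1.2, NA, 0.4, 0.6, 0.5, -0.3, -0.5, -0.4, -1.4, -1.2, NA)
)

# One row per individual holding the Yearling score
yearling <- A %>%
  filter(Stage == "Yearling") %>%
  select(Individual, Y_Score = Score)

# One row per Juvenile point, paired with that individual's Yearling score
A_connections <- A %>%
  filter(Stage == "Juvenile") %>%
  left_join(yearling, by = "Individual") %>%
  filter(!is.na(Y_Score))  # assumption: skip individuals with no Yearling point

print(A_connections)
```

Each remaining row describes one segment that starts at a Juvenile point and ends at the same individual's Yearling score, so the two Juvenile points of an individual are never joined to each other.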


Upvotes: 1
