Repeat loop 100 times adding regression lines to same plot for each iteration

Question

I have a while loop that repeats a sampling process until the values in the value column are less than those in columns rep1:rep4. I would like to repeat this loop a set number of times, say 100.

For each successful loop, I would like to add regression lines to two separate plots. In this example, the value column would be providing data for the x axis, and data for the y axis would be coming from y1 and y2. I've included the additional y column because there might be a number of variables that I would like to plot that would all be sharing the same x axis data. For this example, the end result would be two plots, one for y1 and one for y2, each containing 100 overlapping regression lines.

I haven't included the sampling code here because it is fairly involved and would probably distract from the main question.

The basic while loop and sample data are provided below.

This thread, "Use a for-loop of characters to plot several lines with specific colors", indicates that an additional for loop with seq_along might be the answer here. Different colors aren't an issue for me, however, so that example might be more complicated than what is needed.

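library(dplyr)  # provides %>%

for (i in 1:nrow(df)) {
  # keep resampling row i while its value is <= any non-NA rep1:rep4 entry
  while (any(df$value[i] <= as.numeric(df[i, 2:5]) %>% na.omit())) {

    # sampling procedure here

  }
}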

Here is an example of the df layout:


    ID    rep1   rep2   rep3   rep4  y1  y2  value
1   a     NA     NA     NA     NA    5   2   -400
2   b     -400   NA     NA     NA    7   5   -300
3   c     -400   -300   NA     NA    3   3   -200
4   d     -400   -300   -200   NA    4   6   -300
5   e     -400   -300   -200   -300  9   7   -400
6   f     NA     NA     NA     NA    2   3   -400
7   g     -400   NA     NA     NA    3   2   -400
8   h     NA     NA     NA     NA    6   4   -400
9   i     NA     NA     NA     NA    7   4   -200
10  j     -200   -300   NA     NA    7   6   -300
11  k     -300   NA     NA     NA    8   9   -200
12  l     NA     NA     NA     NA    3   7   -300
13  m     NA     NA     NA     NA    4   7   -300

df <- structure(list(ID = structure(1:13, .Label = c("a", "b", "c",
"d", "e", "f", "g", "h", "i", "j", "k", "l", "m"), class = "factor"), 
    rep1 = c(NA, -400L, -400L, -400L, -400L, NA, -400L, NA, NA, 
    -200L, -300L, NA, NA), rep2 = c(NA, NA, -300L, -300L, -300L, 
    NA, NA, NA, NA, -300L, NA, NA, NA), rep3 = c(NA, NA, NA, 
    -200L, -200L, NA, NA, NA, NA, NA, NA, NA, NA), rep4 = c(NA, 
    NA, NA, NA, -300L, NA, NA, NA, NA, NA, NA, NA, NA), y1 = c(5L, 
    7L, 3L, 4L, 9L, 2L, 3L, 6L, 7L, 7L, 8L, 3L, 4L), y2 = c(2L, 
    5L, 3L, 6L, 7L, 3L, 2L, 4L, 4L, 6L, 9L, 7L, 7L), value = c(-400L, 
    -300L, -200L, -300L, -400L, -400L, -400L, -400L, -200L, -300L, 
    -200L, -300L, -300L)), class = "data.frame", row.names = c(NA, 
-13L))

I'd imagine this should work for the basic plots:

ggplot(data = df, aes(x = value, y = y1)) +
  geom_smooth(method = lm, se = FALSE)
ggplot(data = df, aes(x = value, y = y2)) +
  geom_smooth(method = lm, se = FALSE)

Peter_Evan · Accepted Answer

Let's assume that your sampling procedure works as intended, and that the goal is merely to track the data frames from the while loop and plot this using geom_smooth (please clarify if I am misunderstanding). You could just save the variables of interest to a data frame, include an ID to track each data frame, and then group by these IDs when plotting. Below I am using the data you provided.

library(tidyverse)
set.seed(4)

#an empty data frame to save our output
#(starting truly empty avoids an NA placeholder row, so no na.omit() is
#needed later and genuine NA values in your data are not dropped)
toplot <- data.frame()

#loop 50 times for this example
for (i in 1:50) {

  #sampling the df. This would be your while loop.
  #Just save your output from the while loop as a data frame object
  d1 <- sample_n(df, 5)

  #save the values of interest
  toplot_TMP <- data.frame(value = d1$value, y1 = d1$y1, y2 = d1$y2)

  #create ID variable
  toplot_TMP$ID <- i

  #bind to our data frame for later
  toplot <- bind_rows(toplot, toplot_TMP)

}

#plotting with group = ID
ggplot(data = toplot, aes(x = value, y = y1, group = ID)) +
  geom_smooth(method = lm, se = FALSE)
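
The question also asks for a second plot for y2. One way to get both panels from a single data frame is to reshape it to long format and facet by outcome. This is only a minimal sketch, assuming dplyr, tidyr and ggplot2 are installed. The small toplot built here is a stand-in that resamples the question's value, y1 and y2 columns with plain sample(), and the names src, toplot_long, outcome, y and p are introduced purely for illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(4)

# Stand-in source data: the value, y1 and y2 columns from the question
src <- data.frame(
  value = c(-400, -300, -200, -300, -400, -400, -400,
            -400, -200, -300, -200, -300, -300),
  y1    = c(5, 7, 3, 4, 9, 2, 3, 6, 7, 7, 8, 3, 4),
  y2    = c(2, 5, 3, 6, 7, 3, 2, 4, 4, 6, 9, 7, 7)
)

# Stand-in for the loop output: 50 resamples of 5 rows, each tagged with an ID
toplot <- bind_rows(lapply(1:50, function(i) {
  cbind(ID = i, src[sample(nrow(src), 5), ])
}))

# Long format: one row per observation and outcome (y1 or y2)
toplot_long <- pivot_longer(toplot, cols = c(y1, y2),
                            names_to = "outcome", values_to = "y")

# One regression line per ID, one panel per outcome
p <- ggplot(toplot_long, aes(x = value, y = y, group = ID)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~ outcome)
p
```

If you prefer two fully separate plots, the same grouped geom_smooth call with y = y2 in place of y = y1 should work on the original wide data frame. Resamples in which every value happens to be identical cannot produce a slope, so ggplot2 may warn and skip those lines.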

Gussy up as you need to.
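
If you would rather add each line to the plot as the loop runs, as the question title describes, base graphics can do that with lm() and abline(). This is a rough sketch under assumptions, not your actual procedure: sample() on row indices stands in for the real sampling step, and src, n_iter and fits are illustrative names only.

```r
set.seed(4)

# Stand-in source data: the value and y1 columns from the question
src <- data.frame(
  value = c(-400, -300, -200, -300, -400, -400, -400,
            -400, -200, -300, -200, -300, -300),
  y1    = c(5, 7, 3, 4, 9, 2, 3, 6, 7, 7, 8, 3, 4)
)

n_iter <- 100

# Keep the intercept and slope of every fit for later inspection
fits <- matrix(NA_real_, nrow = n_iter, ncol = 2,
               dimnames = list(NULL, c("intercept", "slope")))

# Empty canvas spanning the full range of the data
plot(src$value, src$y1, type = "n", xlab = "value", ylab = "y1")

for (i in seq_len(n_iter)) {
  # Placeholder for the real sampling procedure
  d1 <- src[sample(nrow(src), 5), ]

  fit <- lm(y1 ~ value, data = d1)
  fits[i, ] <- coef(fit)

  # A resample with a constant value column has no estimable slope; skip it
  if (!anyNA(coef(fit))) abline(fit, col = "grey40")
}
```

Repeating the plot() call and the loop with y2 in place of y1 would give the second plot. Storing the coefficients in fits is optional, but it lets you summarise the slopes afterwards without refitting.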
