Corey

Reputation: 435

Repeat loop 100 times adding regression lines to same plot for each iteration

I have a while loop that repeats a sampling process until the entry in the value column is less than every non-NA entry in columns rep1:rep4 of the same row. I would like to repeat this whole loop a set number of times, say 100.

For each successful loop, I would like to add regression lines to two separate plots. In this example, the value column would be providing data for the x axis, and data for the y axis would be coming from y1 and y2. I've included the additional y column because there might be a number of variables that I would like to plot that would all be sharing the same x axis data. For this example, the end result would be two plots, one for y1 and one for y2, each containing 100 overlapping regression lines.

I haven't included the sampling process code here because it is fairly involved and would probably distract from the main question.

The basic while loop and sample data are provided below.

The thread "Use a for-loop of characters to plot several lines with specific colors" indicates that an additional for loop with seq_along might be the answer here. Different colors aren't an issue for me, however, so that example might be more complicated than what is needed.

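for (i in seq_len(nrow(df))) {
  # keep resampling while value is not strictly below every non-NA rep1:rep4 entry
  while (any(df$value[i] <= as.numeric(df[i, 2:5]), na.rm = TRUE)) {

    # sampling procedure here (it has to update df$value[i], or the loop never ends)

  }
}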

Here is an example of the df layout:


    ID    rep1   rep2   rep3   rep4  y1  y2  value
1   a     NA     NA     NA     NA    5   2   -400
2   b     -400   NA     NA     NA    7   5   -300
3   c     -400   -300   NA     NA    3   3   -200
4   d     -400   -300   -200   NA    4   6   -300
5   e     -400   -300   -200   -300  9   7   -400
6   f     NA     NA     NA     NA    2   3   -400
7   g     -400   NA     NA     NA    3   2   -400
8   h     NA     NA     NA     NA    6   4   -400
9   i     NA     NA     NA     NA    7   4   -200
10  j     -200   -300   NA     NA    7   6   -300
11  k     -300   NA     NA     NA    8   9   -200
12  l     NA     NA     NA     NA    3   7   -300
13  m     NA     NA     NA     NA    4   7   -300
df <- structure(list(ID = structure(1:13, .Label = c("a", "b", "c", 
"d", "e", "f", "g", "h", "i", "j", "k", "l", "m"), class = "factor"), 
    rep1 = c(NA, -400L, -400L, -400L, -400L, NA, -400L, NA, NA, 
    -200L, -300L, NA, NA), rep2 = c(NA, NA, -300L, -300L, -300L, 
    NA, NA, NA, NA, -300L, NA, NA, NA), rep3 = c(NA, NA, NA, 
    -200L, -200L, NA, NA, NA, NA, NA, NA, NA, NA), rep4 = c(NA, 
    NA, NA, NA, -300L, NA, NA, NA, NA, NA, NA, NA, NA), y1 = c(5L, 
    7L, 3L, 4L, 9L, 2L, 3L, 6L, 7L, 7L, 8L, 3L, 4L), y2 = c(2L, 
    5L, 3L, 6L, 7L, 3L, 2L, 4L, 4L, 6L, 9L, 7L, 7L), value = c(-400L, 
    -300L, -200L, -300L, -400L, -400L, -400L, -400L, -200L, -300L, 
    -200L, -300L, -300L)), class = "data.frame", row.names = c(NA, 
-13L))

I'd imagine something like this should work for the basic plots:

library(ggplot2)

ggplot(data = df, aes(x = value, y = y1)) +
  geom_smooth(method = "lm", se = FALSE)
ggplot(data = df, aes(x = value, y = y2)) +
  geom_smooth(method = "lm", se = FALSE)

Upvotes: 0

Views: 509

Answers (1)

Peter_Evan

Reputation: 947

Let's assume that your sampling procedure works as intended, and that the goal is merely to keep track of the data frames produced by the while loop and plot them using geom_smooth (please clarify if I am misunderstanding). You could save the variables of interest to a data frame, include an ID that identifies each iteration, and then group by this ID when plotting. Below I am using the data you provided.

library(tidyverse)
set.seed(4)

#an empty data frame to save our output
toplot <- data.frame(ID = NA, value = NA, y1 = NA, y2 = NA)

#loop 50 times for this example (use 1:100 for 100 regression lines)
for(i in 1:50){

  #sampling the df. This would be your while loop. 
  #Just save the output from the while loop as a data frame object
  d1 <- sample_n(df, 5)

  #save the values of interest
  toplot_TMP <- data.frame(value = d1$value, y1 = d1$y1, y2 = d1$y2)

  #create ID variable
  toplot_TMP$ID <- i

  #bind to our data frame for later
  toplot <- bind_rows(toplot,toplot_TMP)

}

#drop the all-NA placeholder row left over from initialising toplot
toplot <- na.omit(toplot)

#plotting with group = ID
ggplot(data = toplot, aes(x = value, y = y1, group = ID)) +
  geom_smooth(method = lm, se = FALSE)
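
As a variation on the loop above, the per-iteration samples can be collected in a list and bound once at the end, which avoids the NA placeholder row and the later na.omit(). This is only a minimal sketch: it rebuilds a reduced copy of your data (just value, y1 and y2) so that it runs on its own, sample_n(df, 5) is still a stand-in for your real sampling procedure, and n_iter, samples and p_y1 are names I made up for illustration.

```r
library(dplyr)
library(ggplot2)
set.seed(4)

# Reduced copy of the question's data (only the plotted columns),
# included here so the block is self-contained
df <- data.frame(
  value = c(-400, -300, -200, -300, -400, -400, -400,
            -400, -200, -300, -200, -300, -300),
  y1    = c(5, 7, 3, 4, 9, 2, 3, 6, 7, 7, 8, 3, 4),
  y2    = c(2, 5, 3, 6, 7, 3, 2, 4, 4, 6, 9, 7, 7)
)

n_iter <- 100  # hypothetical name: number of repetitions

# One data frame per iteration; sample_n() stands in for the real
# sampling procedure (the while loop in the question)
samples <- lapply(seq_len(n_iter), function(i) {
  sample_n(df, 5)[, c("value", "y1", "y2")]
})

# Stack everything; .id adds an ID column identifying the iteration
toplot <- bind_rows(samples, .id = "ID")

# One regression line per iteration, all on the same plot
p_y1 <- ggplot(toplot, aes(x = value, y = y1, group = ID)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
```

Printing p_y1 draws the plot. Note that a sample in which every row has the same value cannot produce a meaningful slope, so ggplot2 may warn about or skip that group.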


Gussy up as you need to.
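
Since you asked for one plot per response (y1 and y2), one option is to reshape the collected data to long format and facet by response, instead of writing a separate ggplot call for each y column. The sketch below is an assumption about what you want rather than a definitive solution: it rebuilds a small stand-in for toplot so it runs on its own, and toplot_long, response, y and p_both are names introduced only for this example.

```r
library(dplyr)
library(tidyr)
library(ggplot2)
set.seed(4)

# Stand-in for the collected samples: 50 samples of 5 rows each,
# drawn from a reduced copy of the question's data
df <- data.frame(
  value = c(-400, -300, -200, -300, -400, -400, -400,
            -400, -200, -300, -200, -300, -300),
  y1    = c(5, 7, 3, 4, 9, 2, 3, 6, 7, 7, 8, 3, 4),
  y2    = c(2, 5, 3, 6, 7, 3, 2, 4, 4, 6, 9, 7, 7)
)
toplot <- bind_rows(lapply(1:50, function(i) sample_n(df, 5)), .id = "ID")

# Long format: one row per (sample row, response variable)
toplot_long <- pivot_longer(
  toplot,
  cols      = c(y1, y2),
  names_to  = "response",
  values_to = "y"
)

# One panel per response, one regression line per iteration in each panel
p_both <- ggplot(toplot_long, aes(x = value, y = y, group = ID)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~ response)
```

Any further y columns that share the same x axis can be added to cols, and each one gets its own panel.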

Upvotes: 1
