LDT

Reputation: 3108

Match case between three columns in R

Thank you in advance for your time reading this post. I have a data frame that looks like this.

data = data.frame(time=c(rep(c(0,1,2),each=3)),
                  parent=c(1,2,3,1,1,2,1,4,5), offspring= c(NA,NA,NA, 4,5,6,7,8,9))

time parent offspring 
 0     1       NA
 0     2       NA
 0     3       NA
 1     1        4
 1     1        5
 1     2        6
 2     1        7
 2     4        8
 2     5        9

I want to create a new column, "alpha", and assign the value 1 to the offspring at the last time point (here, time point 2).

time parent offspring  alpha
 0     1       NA        NA
 0     2       NA        NA
 0     3       NA        NA
 1     1        4        NA
 1     1        5        NA
 1     2        6        NA
 2     1        7        1
 2     4        8        1
 2     5        9        1

The tricky part for me is the next step. I would also like to assign the value 1 to the parents of those offspring, and to their grandparents, so that my data frame looks like this.

time parent offspring  alpha
 0     1       NA        1
 0     2       NA        NA
 0     3       NA        NA
 1     1        4        1
 1     1        5        1
 1     2        6        NA
 2     1        7        1
 2     4        8        1
 2     5        9        1

I should add that I have thousands of generations. Any help or comments would be highly appreciated.

Upvotes: 0

Views: 120

Answers (2)

DaveArmstrong

Reputation: 22034

If you have lots of generations, you might need a loop:

## set alpha to 1 if time == max(time)
data$alpha <- ifelse(data$time == max(data$time), 1, NA)
## initialize inds
inds <- 1
## continue to loop while inds has any values in it
while(length(inds) > 0){
  ## parents of every row that already has alpha == 1
  marked_parents <- data$parent[which(data$alpha == 1)]
  ## rows where alpha is still NA and the parent or the offspring is among those parents
  inds <- which(is.na(data$alpha) & (data$parent %in% marked_parents | data$offspring %in% marked_parents))
  ## replace those observations' alpha values with 1
  data$alpha[inds] <- 1
  ## continue to loop back through generations
}

Upvotes: 0

Ronak Shah

Reputation: 389235

This might help:

# Start with `NA` in the new column
data$alpha <- NA
# Get the parent values at the last time point
parent <- unique(na.omit(data$parent[data$time == max(data$time)]))
# Set alpha to 1 wherever those values appear (as offspring or as parent)
data$alpha[data$parent %in% parent | data$offspring %in% parent] <- 1
data

#  time parent offspring alpha
#1    0      1        NA     1
#2    0      2        NA    NA
#3    0      3        NA    NA
#4    1      1         4     1
#5    1      1         5     1
#6    1      2         6    NA
#7    2      1         7     1
#8    2      4         8     1
#9    2      5         9     1
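
If the lineage goes back more than one generation, the same lookup can be repeated until no new parents turn up. Here is a minimal sketch, assuming the same matching rule as above (a row counts when its parent or offspring id is already on the lineage). `mark_lineage` is a hypothetical helper name, not part of base R or of the code above, and I have only checked it on the example data:

```r
# Example data from the question
data <- data.frame(time = rep(c(0, 1, 2), each = 3),
                   parent = c(1, 2, 3, 1, 1, 2, 1, 4, 5),
                   offspring = c(NA, NA, NA, 4, 5, 6, 7, 8, 9))

# Hypothetical helper: flags rows on the lineage of the last time point.
# Assumes the ids in `parent` and `offspring` share one numbering scheme.
mark_lineage <- function(df) {
  last <- df$time == max(df$time)
  alpha <- ifelse(last, 1, NA)
  # ids known to be on the lineage: parents recorded at the last time point
  ids <- setdiff(df$parent[last], NA)
  repeat {
    # rows whose parent or offspring is a known lineage id
    hit <- df$parent %in% ids | df$offspring %in% ids
    alpha[hit] <- 1
    # parents of the flagged rows that were not known before (NA dropped)
    new_ids <- setdiff(df$parent[hit], c(ids, NA))
    if (length(new_ids) == 0) break
    ids <- c(ids, new_ids)
  }
  df$alpha <- alpha
  df
}

mark_lineage(data)$alpha
# 1 NA NA 1 1 NA 1 1 1
```

Each pass only adds ids that were not seen before, so the loop stops once a pass finds no new parent ids.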

Upvotes: 1
