beginner

Reputation: 1069

Loop over specific columns data and add the result as a new column in R

I have a data frame df with the following information:

df <- structure(list(Samples = structure(c(1L, 3L, 4L, 5L, 6L, 7L, 
8L, 9L, 10L, 2L, 1L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 2L, 1L, 
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 2L, 1L, 3L, 4L, 5L, 6L, 7L, 
8L, 9L, 10L, 2L), .Label = c("Sample1", "Sample10", "Sample2", 
"Sample3", "Sample4", "Sample5", "Sample6", "Sample7", "Sample8", 
"Sample9"), class = "factor"), patient.vital_status = c(0L, 0L, 
0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 
0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 
0L, 0L, 0L, 1L, 0L, 1L), years = c(3.909589041, 1.457534247, 
2.336986301, 5.010958904, 1.665753425, 1.81369863, 1.191780822, 
4.687671233, 2.167123288, 1.95890411, 3.909589041, 1.457534247, 
2.336986301, 5.010958904, 1.665753425, 1.81369863, 1.191780822, 
4.687671233, 2.167123288, 1.95890411, 3.909589041, 1.457534247, 
2.336986301, 5.010958904, 1.665753425, 1.81369863, 1.191780822, 
4.687671233, 2.167123288, 1.95890411, 3.909589041, 1.457534247, 
2.336986301, 5.010958904, 1.665753425, 1.81369863, 1.191780822, 
4.687671233, 2.167123288, 1.95890411), Genes = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("A1BG", "A1CF", "A2M", 
"A2ML1"), class = "factor"), value = c(0.034459012, 0.017698878, 
0.023313851, 0.010456762, 0.032674019, 0.037561831, 0.03380681, 
0, 0.019954956, 0.012392427, 0.835801613, 2.265192447, 2.431409095, 
5.012117956, 2.139962802, 2.371946704, 4.555234385, 0.550293401, 
0.924012327, 2.274642129, 92.85639578, 79.50897642, 23.72187602, 
26.86025304, 32.80504253, 222.6449054, 71.78812505, 45.76371588, 
29.93976676, 22.97515484, 0.03780441, 0.005825143, 0, 0.002867985, 
0.011948708, 0.02060423, 0.004636111, 0.015903347, 0.005473063, 
0.033988816)), class = "data.frame", row.names = c(NA, -40L))

I want to process the data one gene at a time, using the columns Genes and value, and classify each row as low or high. The result should then be added to the data frame df as a new column.

I'm trying to do this with the following code, but it doesn't work:

genes <- as.character(unique(df$Genes))

library(survival)
library(survminer)

for(i in genes){
  surv_rnaseq.cut <- surv_cutpoint(
    df,
    time = "years",
    event = "patient.vital_status",
    variables = c("Genes","value"))

  df$cat <- surv_categorize(surv_rnaseq.cut)
}

In addition to the result above, I would also like the summary of surv_rnaseq.cut for each of the four genes, labelled with the gene name.

Any help would be appreciated. Thank you.

Upvotes: 1

Views: 309

Answers (1)

akrun

Reputation: 887541

One option is to split the data by 'Genes' (group_split), loop over the resulting list with map_dfr, apply the functions to each piece, add the result as a new column, and row-bind the pieces back together:

library(survminer)
library(survival)
library(dplyr)
library(purrr)

df %>% 
  group_split(Genes) %>%                 # one data frame per gene
  map_dfr(~ surv_cutpoint(.x,
                          time = "years",
                          event = "patient.vital_status",
                          variables = c("Genes", "value")) %>%
            surv_categorize() %>%        # label each value "low" or "high"
            pull(value) %>%              # keep only the labels
            mutate(.x, cat = .))         # attach them to that gene's rows
# A tibble: 40 x 6
#   Samples  patient.vital_status years Genes  value cat  
#   <fct>                   <int> <dbl> <fct>  <dbl> <chr>
# 1 Sample1                     0  3.91 A1BG  0.0345 high 
# 2 Sample2                     0  1.46 A1BG  0.0177 high 
# 3 Sample3                     0  2.34 A1BG  0.0233 high 
# 4 Sample4                     0  5.01 A1BG  0.0105 high 
# 5 Sample5                     0  1.67 A1BG  0.0327 high 
# 6 Sample6                     0  1.81 A1BG  0.0376 high 
# 7 Sample7                     0  1.19 A1BG  0.0338 high 
# 8 Sample8                     1  4.69 A1BG  0      low  
# 9 Sample9                     0  2.17 A1BG  0.0200 high 
#10 Sample10                    1  1.96 A1BG  0.0124 high 
# … with 30 more rows
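
The question also asks for the summary of the cutpoint object for each gene, labelled with the gene name. A minimal sketch of one way to get both results with a plain for loop is shown below. It rebuilds the question's data in compact form (with Genes as character rather than factor), passes only the numeric value column to surv_cutpoint, and assumes that summary() on a surv_cutpoint object returns a one-row data frame with cutpoint and statistic columns. The names fit, cut_list and cut_summary are illustrative and not part of survminer.

```r
library(survival)
library(survminer)

# Compact rebuild of the question's data: 10 samples x 4 genes.
df <- data.frame(
  Samples = rep(paste0("Sample", 1:10), times = 4),
  patient.vital_status = rep(c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L), times = 4),
  years = rep(c(3.909589041, 1.457534247, 2.336986301, 5.010958904,
                1.665753425, 1.81369863, 1.191780822, 4.687671233,
                2.167123288, 1.95890411), times = 4),
  Genes = rep(c("A1BG", "A1CF", "A2M", "A2ML1"), each = 10),
  value = c(0.034459012, 0.017698878, 0.023313851, 0.010456762, 0.032674019,
            0.037561831, 0.03380681, 0, 0.019954956, 0.012392427,
            0.835801613, 2.265192447, 2.431409095, 5.012117956, 2.139962802,
            2.371946704, 4.555234385, 0.550293401, 0.924012327, 2.274642129,
            92.85639578, 79.50897642, 23.72187602, 26.86025304, 32.80504253,
            222.6449054, 71.78812505, 45.76371588, 29.93976676, 22.97515484,
            0.03780441, 0.005825143, 0, 0.002867985, 0.011948708,
            0.02060423, 0.004636111, 0.015903347, 0.005473063, 0.033988816)
)

genes <- as.character(unique(df$Genes))
df$cat <- NA_character_                      # filled in gene by gene
cut_list <- vector("list", length(genes))    # one summary row per gene
names(cut_list) <- genes

for (g in genes) {
  rows <- df$Genes == g                      # rows belonging to this gene

  # Estimate the cutpoint from this gene's rows only.
  fit <- surv_cutpoint(df[rows, ],
                       time = "years",
                       event = "patient.vital_status",
                       variables = "value")

  # surv_categorize() returns the "low"/"high" labels in the value column.
  df$cat[rows] <- as.character(surv_categorize(fit)$value)

  # Keep the summary, tagged with the gene name.
  cut_list[[g]] <- data.frame(Genes = g, summary(fit), row.names = NULL)
}

# Stack the per-gene summaries into one table.
cut_summary <- do.call(rbind, cut_list)
rownames(cut_summary) <- NULL
cut_summary
```

After the loop, df has the new cat column and cut_summary has one row per gene. Subsetting inside the loop is the step missing from the question's code, which passes the whole df to surv_cutpoint on every iteration.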

Upvotes: 1
