wscampbell
wscampbell

Reputation: 379

Rowwise name of column where first non-zero value appears

I've got a bunch of columns all starting with the prefix wtp_ that occur in the midst of a wide dataframe (with several columns before and after the wtp_ columns). Mini example:

df <- tribble(~id, ~complete, ~wtp_20,~wtp_40,~wtp_60,~wtp_80,~wtp_100, ~sex,
          1, 1,  0,0,1,1,1,  "F",
          2, 0,  0,0,0,1,1,  "F",
          3, 0,  0,0,0,0,1,  "M",
          4, 1,  1,1,1,1,1,  "M",
          5, 1,  0,0,0,0,0,  "M",
          6, 0,  0,1,1,1,1,  "F"); df

What I'm looking for: I need to create a new variable (min_wtp) that returns the name of the column the first time that one of the wtp_ columns switches from 0 to 1. In other words, I need a solution to create the following:

df_needed <- tribble(~id, ~complete, ~wtp_20,~wtp_40,~wtp_60,~wtp_80,~wtp_100, ~sex, ~min_wtp,
          1, 1,  0,0,1,1,1,  "F", "wtp_60",
          2, 0,  0,0,0,1,1,  "F", "wtp_80",
          3, 0,  0,0,0,0,1,  "M", "wtp_100",
          4, 1,  1,1,1,1,1,  "M", "wtp_20",
          5, 1,  0,0,0,0,0,  "M", "NA",
          6, 0,  0,1,1,1,1,  "F", "wtp_40"); df_needed

Please note the following complications:

-Some people (like id==5) never change to 1 while others (like id==4) are 1 all along.
-There are some irrelevant columns occurring before the wtp_ columns that have 0s and 1s in them which should be ignored in the construction of min_wtp.
-There are way more columns (including wtp_ columns) than the minimal example I included above.

I've tried playing with which and colnames functions in combination with select(starts_with("wtp_")) but have been unsuccessful.

If anyone has a dplyr solution, that would be preferred.

Upvotes: 1

Views: 696

Answers (3)

thelatemail
thelatemail

Reputation: 93908

If it never goes backwards from 1 to 0, then you can find the change point very quickly with some basic sums:

sw  <- startsWith(names(df), "wtp_")
names(df[sw])[sum(sw) - rowSums(df[sw]) + 1]
#[1] "wtp_60"  "wtp_80"  "wtp_100" "wtp_20"  NA        "wtp_40"

Upvotes: 1

Ronak Shah
Ronak Shah

Reputation: 389135

It would be much easier if you get the data in long format :

library(dplyr)

df %>%
  tidyr::pivot_longer(cols = starts_with('wtp')) %>%
  group_by(id) %>%
  summarise(min_wtp = name[which(value == 1 & 
                           lag(value, default = 0) == 0)[1]]) %>%
  left_join(df, by = 'id')

# A tibble: 6 x 9
#     id min_wtp complete wtp_20 wtp_40 wtp_60 wtp_80 wtp_100 sex  
#  <dbl> <chr>      <dbl>  <dbl>  <dbl>  <dbl>  <dbl>   <dbl> <chr>
#1     1 wtp_60         1      0      0      1      1       1 F    
#2     2 wtp_80         0      0      0      0      1       1 F    
#3     3 wtp_100        0      0      0      0      0       1 M    
#4     4 wtp_20         1      1      1      1      1       1 M    
#5     5 NA             1      0      0      0      0       0 M    
#6     6 wtp_40         0      0      1      1      1       1 F   

Without reshaping the data you can use rowwise with c_across :

apply_fun <- function(x) {
    which(x == 1 & lag(x, default = 0) == 0)[1]
}

cols <- grep('^wtp', names(df), value = TRUE)

df %>%
  rowwise() %>%
  mutate(min_wtp = cols[apply_fun(c_across(cols))])

Upvotes: 2

eipi10
eipi10

Reputation: 93851

We can use apply to get, for each row, the number of first column that satisfies your conditions. Then we use that number as the index to get the column name.

df$min_wtp = apply(df[ , grepl("wtp", names(df))], 1, function(x) {
  names(x)[min(which(x > 0))]
  })

df
     id complete wtp_20 wtp_40 wtp_60 wtp_80 wtp_100 sex   min_wtp
  <dbl>    <dbl>  <dbl>  <dbl>  <dbl>  <dbl>   <dbl> <chr> <chr>  
1     1        1      0      0      1      1       1 F     wtp_60 
2     2        0      0      0      0      1       1 F     wtp_80 
3     3        0      0      0      0      0       1 M     wtp_100
4     4        1      1      1      1      1       1 M     wtp_20 
5     5        1      0      0      0      0       0 M     NA     
6     6        0      0      1      1      1       1 F     wtp_40

Upvotes: 2

Related Questions