Ryan

Reputation: 423

Conditionally replace the values in columns with the value in another column using dplyr

I tried really hard to find an answer to this and I apologize if it's a duplicate.

I'll make some dummy data to explain my question.

tibble(a=c(0.1, 0.2, 0.3), sample1 = c(0, 1, 1), sample2 = c(1, 1, 0))

# A tibble: 3 x 3
      a sample1 sample2
  <dbl>   <dbl>   <dbl>
1   0.1       0       1
2   0.2       1       1
3   0.3       1       0

How do I conditionally change the values in columns sample1 and sample2 so that, where they are equal to one, they take on the value of a?

The resulting tibble should look like this:

# A tibble: 3 x 3
      a sample1 sample2
  <dbl>   <dbl>   <dbl>
1   0.1       0     0.1
2   0.2     0.2     0.2
3   0.3     0.3       0

Ideally I don't want to do this for each individual sample column (I have >100 sample columns), so a way to loop over columns would be better (although I know loops are the devil).

Thanks for your help!

Upvotes: 1

Views: 3187

Answers (3)

Alan Gómez

Reputation: 378

A Base R way to do this:

DATA

df <- data.frame(a=c(0.1, 0.2, 0.3), sample1 = c(0, 1, 1), sample2 = c(1, 1, 0))

PROCEDURE

df[,2:ncol(df)] <- t(sapply(c(1:nrow(df)), function(x) ifelse(df[x,2:ncol(df)]==1, df[x,1],0)))

OR

df[,2:ncol(df)] <- ((df==1)*rep(df[,1],ncol(df)))[,2:ncol(df)]

OUTPUT

df
    a sample1 sample2
1 0.1     0.0     0.1
2 0.2     0.2     0.2
3 0.3     0.3     0.0
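
As a possible alternative to the row-wise sapply above, the same replacement can be sketched column by column with lapply. This is only an illustration on the dummy data from the question; the helper name sample_cols is made up here, and unlike the first procedure it leaves values other than 1 unchanged rather than setting them to 0.

```r
# Same dummy data as in the question.
df <- data.frame(a = c(0.1, 0.2, 0.3), sample1 = c(0, 1, 1), sample2 = c(1, 1, 0))

# Columns to update: every column except `a` (hypothetical helper name).
sample_cols <- setdiff(names(df), "a")

# For each sample column, take `a` where the value is 1
# and keep the original value everywhere else.
df[sample_cols] <- lapply(df[sample_cols], function(x) ifelse(x == 1, df$a, x))

df
```

Because the columns are selected by name rather than by position, this should carry over to a data frame with many sample columns, as long as `a` is the only column that must stay untouched.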

Upvotes: 0

Andrew Haynes

Reputation: 2640

Non-dplyr solution using which():

> t=tibble(a=c(0.1, 0.2, 0.3), sample1 = c(0, 1, 1), sample2 = c(1, 1, 0))

> for (col in c('sample1','sample2')) {
+   whichRows=which(t[[col]]==1)   # rows where this column equals 1
+   t[[col]][whichRows]<-t$a[whichRows]
+ }

> t
# A tibble: 3 x 3
      a sample1 sample2
  <dbl>   <dbl>   <dbl>
1   0.1     0.0     0.1
2   0.2     0.2     0.2
3   0.3     0.3     0.0

Upvotes: 0

akuiper

Reputation: 214957

You can use mutate_at with ifelse:

df %>% mutate_at(vars(starts_with('sample')), funs(ifelse(. == 1, a, .)))

# A tibble: 3 x 3
#      a sample1 sample2
#  <dbl>   <dbl>   <dbl>
#1   0.1     0.0     0.1
#2   0.2     0.2     0.2
#3   0.3     0.3     0.0

vars(starts_with('sample')) matches all columns whose names start with sample, and mutate_at applies the function funs(ifelse(. == 1, a, .)) to each of them; here . stands for the matched column.
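
Note that funs() has been deprecated since dplyr 0.8.0 and mutate_at() has been superseded by across() since dplyr 1.0.0. A rough equivalent of the call above for newer versions might look like the sketch below; it assumes dplyr 1.0.0 or later is installed, and the name result is only for illustration.

```r
library(dplyr)

# Same dummy data as in the question.
df <- tibble(a = c(0.1, 0.2, 0.3), sample1 = c(0, 1, 1), sample2 = c(1, 1, 0))

# across() selects the columns; inside the lambda, .x is the current
# column and `a` refers to the column `a` of the same data frame.
result <- df %>%
  mutate(across(starts_with("sample"), ~ ifelse(.x == 1, a, .x)))

result
```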


If you are sure all the sample columns contain only 1 and 0, it can be shortened to:

df %>% mutate_at(vars(starts_with('sample')), funs(. * a))

# A tibble: 3 x 3
#      a sample1 sample2
#  <dbl>   <dbl>   <dbl>
#1   0.1     0.0     0.1
#2   0.2     0.2     0.2
#3   0.3     0.3     0.0
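
The multiplication shortcut silently gives wrong results if a sample column holds anything other than 0 or 1, so it may be worth checking that assumption first. The sketch below does so before multiplying; it is written with across(), which assumes dplyr 1.0.0 or later, and the names sample_values and result are illustrative only.

```r
library(dplyr)

# Same dummy data as in the question.
df <- tibble(a = c(0.1, 0.2, 0.3), sample1 = c(0, 1, 1), sample2 = c(1, 1, 0))

# Collect all values from the sample columns and stop early
# if any of them is something other than 0 or 1.
sample_values <- unlist(select(df, starts_with("sample")))
stopifnot(all(sample_values %in% c(0, 1)))

# With only 0/1 values, multiplying by `a` gives 0 or `a`.
result <- df %>%
  mutate(across(starts_with("sample"), ~ .x * a))

result
```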

Upvotes: 5
