Paolo Lorenzini

Reputation: 725

Looping over columns in R data frame

I want to loop over specific columns in a data frame and create new columns.

I have a data frame which looks like this:

  a   b  c d
2.8   A  A T    
1.9   T  G T 
1.7   G  G A 
2.3   T  T G

I would like an output like this:

  a  b  c  c_1  d  d_1
2.8  A  A  2.8  T  0
1.9  T  G  0    T  1.9
1.7  G  G  1.7  A  0
2.3  T  T  2.3  G  0
1.2  C  G  0    C  1.2

Basically, it creates a new column c_1 or d_1 whose value equals a if the letter in c or d is the same as in b, and zero if the letter is different.

I can do that for only one single column:

df$c_1 <- ifelse(df$c == df$b, df$a, 0)

However, I have many columns (around 100). How can I do that for all of them?

Upvotes: 1

Views: 138

Answers (1)

iod

Reputation: 7592

Using dplyr (dd is the name of the dataframe):

library(dplyr)
bind_cols(dd, transmute_at(dd, 3:4, ~ ifelse(. == dd$b, dd$a, 0)))

transmute_at creates a data frame containing only the new columns. The 3:4 selects which columns get transformed, in this case simply by giving a vector of their indexes. Finally, bind_cols is the dplyr variation on cbind; it renames the new columns to avoid duplicates (the exact names it assigns depend on the dplyr version).
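On recent dplyr versions, where transmute_at() is superseded, the same idea can probably be written with mutate() and across(). This is only a sketch, assuming dplyr 1.0.0 or later; the sample dd is rebuilt from the question with character (not factor) columns, and the "_1" suffix is my choice to match the names asked for:

```r
library(dplyr)

# Sample data rebuilt from the question (character columns assumed)
dd <- data.frame(
  a = c(2.8, 1.9, 1.7, 2.3),
  b = c("A", "T", "G", "T"),
  c = c("A", "G", "G", "T"),
  d = c("T", "T", "A", "G"),
  stringsAsFactors = FALSE
)

# For columns 3 and 4, keep a where the letter matches b, otherwise 0.
# .names appends "_1", so the original columns are kept next to the new ones.
res <- dd %>%
  mutate(across(3:4, ~ ifelse(.x == b, a, 0), .names = "{.col}_1"))
```

With around 100 columns, the 3:4 would be swapped for whatever range or selection helper covers them; the new columns are appended after the existing ones.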

Result:

    a b c d  c1  d1
1 2.8 A A T 2.8 0.0
2 1.9 T G T 0.0 1.9
3 1.7 G G A 1.7 0.0
4 2.3 T T G 2.3 0.0

If you want the columns sorted like in your example you can add this:

%>% .[,sort(names(.))]

Which will give you:

    a b c  c1 d  d1
1 2.8 A A 2.8 T 0.0
2 1.9 T G 0.0 T 1.9
3 1.7 G G 1.7 A 0.0
4 2.3 T T 2.3 G 0.0

Upvotes: 3
