Reputation: 725
I want to loop over specific columns in a data frame and create new columns
I have a data frame which looks like this:
a b c d
2.8 A A T
1.9 T G T
1.7 G G A
2.3 T T G
I would like an output like this:
a b c c_1 d d_1
2.8 A A 2.8 T 0
1.9 T G 0 T 1.9
1.7 G G 1.7 A 0
2.3 T T 2.3 G 0
1.2 C G 0 C 1.2
Basically, it creates a new column c_1 or d_1 whose value equals a if the letter in c or d matches the letter in b, and zero if the letters differ.
I can do that for a single column:
df$c_1 <- ifelse(df$c == df$b, df$a, 0)
However, I have many columns (around 100). How can I do this for all of them?
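For reference, the single-column line above can be wrapped in a plain base R loop. This is only a minimal sketch: the vector cols is a hypothetical name holding the columns to compare against b, and the sample data is the four-row input shown earlier.

```r
# Sample data matching the four input rows in the question
df <- data.frame(
  a = c(2.8, 1.9, 1.7, 2.3),
  b = c("A", "T", "G", "T"),
  c = c("A", "G", "G", "T"),
  d = c("T", "T", "A", "G"),
  stringsAsFactors = FALSE
)

# Hypothetical name: the columns to compare against b
# (with ~100 columns this could be e.g. names(df)[-(1:2)])
cols <- c("c", "d")

# Apply the same ifelse() as the one-column version, once per column
for (col in cols) {
  df[[paste0(col, "_1")]] <- ifelse(df[[col]] == df$b, df$a, 0)
}

# Optional: place each new column right after its source column
# rbind() + as.vector() interleaves the names: c, c_1, d, d_1
df <- df[, c("a", "b", as.vector(rbind(cols, paste0(cols, "_1"))))]
```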
Upvotes: 1
Views: 138
Reputation: 7592
Using dplyr (dd is the name of the data frame):
bind_cols(dd, transmute_at(dd, 3:4, ~ ifelse(. == dd$b, dd$a, 0)))
transmute_at creates a data frame containing only the new columns. The 3:4 selects which columns get transformed, in this case simply by giving a vector of their indexes. Finally, bind_cols is the dplyr variant of cbind, which renames the new columns to avoid duplicates.
Result:
a b c d c1 d1
1 2.8 A A T 2.8 0.0
2 1.9 T G T 0.0 1.9
3 1.7 G G A 1.7 0.0
4 2.3 T T G 2.3 0.0
If you want the columns sorted like in your example you can add this:
%>% .[,sort(names(.))]
Which will give you:
a b c c1 d d1
1 2.8 A A 2.8 T 0.0
2 1.9 T G 0.0 T 1.9
3 1.7 G G 1.7 A 0.0
4 2.3 T T 2.3 G 0.0
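As a side note, transmute_at has been superseded by across() since dplyr 1.0.0, and newer versions of bind_cols may repair duplicate names differently from the c1/d1 shown above. The sketch below assumes dplyr >= 1.0.0 and uses the .names argument of across() to get the c_1/d_1 names from the question directly. The names cols and out are hypothetical, chosen only for illustration.

```r
library(dplyr)

# Sample data matching the four input rows in the question
dd <- data.frame(
  a = c(2.8, 1.9, 1.7, 2.3),
  b = c("A", "T", "G", "T"),
  c = c("A", "G", "G", "T"),
  d = c("T", "T", "A", "G"),
  stringsAsFactors = FALSE
)

# Hypothetical name: the columns to compare against b
cols <- c("c", "d")

# For each selected column, keep a where the letter matches b, else 0;
# .names appends "_1" to the source column name, so the originals are kept
out <- dd %>%
  mutate(across(all_of(cols), ~ ifelse(.x == b, a, 0), .names = "{.col}_1"))
```

Because mutate() keeps the original columns and the new names are set explicitly, no bind_cols step is needed here. The new columns are appended at the end, so the sorting step above still applies if the interleaved order is wanted.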
Upvotes: 3