user5316628
user5316628

Reputation: 357

Passing function to multiple columns based on column name

I have a data frame (DF) with 3 columns with values and 200 columns with only a heading and NA values. As an an example:

  Row Price  Qty   2.10  2.15  2.20  2.25  ....
    1  2.10  100      0     0     0     0
    2  2.15  200      0     0     0     0   
    3  2.25  100      0     0     0     0    
    4  2.10  100      0     0     0     0  
    5  2.25  300      0     0     0     0  

I am attempting to use dplyr and mutate_each to pass a function to all columns 4:n that uses the column headers name. The function will mutate each column so that for in the row where Price = Column Name, then the Qty amount is applied. See desired results bellow:

  Row Price  Qty   2.10  2.15  2.20  2.25  ....
    1  2.10  100    100     0     0     0
    2  2.15  200      0   200     0     0   
    3  2.25  100      0     0     0   100    
    4  2.10  100    100     0     0     0  
    5  2.25  300      0     0     0   300  

Any thoughts on how to execute this?


# input data
DF <- structure(list(Row = 1:5, Price = c(2.1, 2.15, 2.25, 2.1, 2.25
), Qty = c(100L, 200L, 100L, 100L, 300L), X2.10 = c(0L, 0L, 0L, 
0L, 0L), X2.15 = c(0L, 0L, 0L, 0L, 0L), X2.20 = c(0L, 0L, 0L, 
0L, 0L), X2.25 = c(0L, 0L, 0L, 0L, 0L)), .Names = c("Row", "Price", 
"Qty", "2.10", "2.15", "2.20", "2.25"), class = "data.frame", row.names = c(NA, 
-5L))   

Upvotes: 1

Views: 124

Answers (2)

Arun
Arun

Reputation: 118779

Another way is to do a join as follows:

require(data.table) # v1.9.6+
setDT(DF)
for (p in tail(names(DF), -3L))
    DF[.(Price=as.numeric(p)), (p) := Qty, on="Price"]
DF[]
#    Row Price Qty 2.10 2.15 2.20 2.25
# 1:   1  2.10 100  100    0    0    0
# 2:   2  2.15 200    0  200    0    0
# 3:   3  2.25 100    0    0    0  100
# 4:   4  2.10 100  100    0    0    0
# 5:   5  2.25 300    0    0    0  300

Joins can be combined with aggregations and updates (just like subsets are) with data.table's syntax. The idea is to extract the rows that match using the i argument, and update those rows for that column with the corresponding value of Qty.

Note that there are no copies made here. The original data.table is updated by reference.

Upvotes: 1

Frank
Frank

Reputation: 66819

You can create the price cols by reshaping:

library(reshape2)
dcast(DF[1:3], Row+Price+Qty ~ Price, value.var = "Qty", fill = 0) 

#   Row Price Qty 2.1 2.15 2.25
# 1   1  2.10 100 100    0    0
# 2   2  2.15 200   0  200    0
# 3   3  2.25 100   0    0  100
# 4   4  2.10 100 100    0    0
# 5   5  2.25 300   0    0  300

Prices that don't show up in the second column will not get their own columns using the approach above. If that edge case was important, I would use data.table:

library(data.table)
setDT(DF)
for (j in names(DF)[-(1:3)]){  
  ii = which( DF$Price == as.numeric(j) )
  set(DF, i=ii, j=j, v=DF$Qty[ii]) }

#   Row Price Qty 2.10 2.15 2.20 2.25
# 1   1  2.10 100  100    0    0    0
# 2   2  2.15 200    0  200    0    0
# 3   3  2.25 100    0    0    0  100
# 4   4  2.10 100  100    0    0    0
# 5   5  2.25 300    0    0    0  300

I'm sure something similar can be done with mutate_each.

Upvotes: 6

Related Questions