Reputation: 357
I have a data frame (DF) with 3 columns with values and 200 columns with only a heading and NA values. As an an example:
Row Price Qty 2.10 2.15 2.20 2.25 ....
1 2.10 100 0 0 0 0
2 2.15 200 0 0 0 0
3 2.25 100 0 0 0 0
4 2.10 100 0 0 0 0
5 2.25 300 0 0 0 0
I am attempting to use dplyr
and mutate_each
to pass a function to all columns 4:n
that uses the column headers name. The function will mutate each column so that for in the row where Price = Column Name
, then the Qty
amount is applied. See desired results bellow:
Row Price Qty 2.10 2.15 2.20 2.25 ....
1 2.10 100 100 0 0 0
2 2.15 200 0 200 0 0
3 2.25 100 0 0 0 100
4 2.10 100 100 0 0 0
5 2.25 300 0 0 0 300
Any thoughts on how to execute this?
# input data
DF <- structure(list(Row = 1:5, Price = c(2.1, 2.15, 2.25, 2.1, 2.25
), Qty = c(100L, 200L, 100L, 100L, 300L), X2.10 = c(0L, 0L, 0L,
0L, 0L), X2.15 = c(0L, 0L, 0L, 0L, 0L), X2.20 = c(0L, 0L, 0L,
0L, 0L), X2.25 = c(0L, 0L, 0L, 0L, 0L)), .Names = c("Row", "Price",
"Qty", "2.10", "2.15", "2.20", "2.25"), class = "data.frame", row.names = c(NA,
-5L))
Upvotes: 1
Views: 124
Reputation: 118779
Another way is to do a join as follows:
require(data.table) # v1.9.6+
setDT(DF)
for (p in tail(names(DF), -3L))
DF[.(Price=as.numeric(p)), (p) := Qty, on="Price"]
DF[]
# Row Price Qty 2.10 2.15 2.20 2.25
# 1: 1 2.10 100 100 0 0 0
# 2: 2 2.15 200 0 200 0 0
# 3: 3 2.25 100 0 0 0 100
# 4: 4 2.10 100 100 0 0 0
# 5: 5 2.25 300 0 0 0 300
Joins can be combined with aggregations and updates (just like subsets are) with data.table's syntax. The idea is to extract the rows that match using the i
argument, and update those rows for that column with the corresponding value of Qty
.
Note that there are no copies made here. The original data.table is updated by reference.
Upvotes: 1
Reputation: 66819
You can create the price cols by reshaping:
library(reshape2)
dcast(DF[1:3], Row+Price+Qty ~ Price, value.var = "Qty", fill = 0)
# Row Price Qty 2.1 2.15 2.25
# 1 1 2.10 100 100 0 0
# 2 2 2.15 200 0 200 0
# 3 3 2.25 100 0 0 100
# 4 4 2.10 100 100 0 0
# 5 5 2.25 300 0 0 300
Prices that don't show up in the second column will not get their own columns using the approach above. If that edge case was important, I would use data.table:
library(data.table)
setDT(DF)
for (j in names(DF)[-(1:3)]){
ii = which( DF$Price == as.numeric(j) )
set(DF, i=ii, j=j, v=DF$Qty[ii]) }
# Row Price Qty 2.10 2.15 2.20 2.25
# 1 1 2.10 100 100 0 0 0
# 2 2 2.15 200 0 200 0 0
# 3 3 2.25 100 0 0 0 100
# 4 4 2.10 100 100 0 0 0
# 5 5 2.25 300 0 0 0 300
I'm sure something similar can be done with mutate_each
.
Upvotes: 6