Reputation: 447
I have a data frame like below with lots of columns and data in
Label 2015-08-24 2015-08-25
1 AxG 179238.8 174160.3
2 CHaF 203544.2 199411.9
3 HHHH 130138.9 129182.2
4 fLS 146367.0 148708.0
5 KSU 194584.7 188508.0
6 MqCC 272922.7 273369.5
7 BEPU 0.0 0.0
8 SSEX 107566.4 111552.8
9 QQQ 248380.4 257064.2
10 AAA 258735.4 260415.5
I have another dataframe that has row index the same as the column names in the first (in this case dates). I want to divide every value in the column of the first dataframe with the corresponding value in the second
totals
2015-08-24 1741479
2015-08-25 199411.9
My first idea was to do something like below (I come from a pythonic background, albeit a while ago)
> for(i in names(df1)){
+ df1[[df1[,i]/df2[i,]]]
But this is the error I receive:
Error in .subset2(x, i, exact = exact) : no such index at level 1 In addition: Warning message: In Ops.factor(weights[, i], weight_totals[i, ]) : ‘/’ not meaningful for factors
Any ideas?
Upvotes: 3
Views: 1056
Reputation: 24178
You could transpose your second data.frame
and multiply the matching columns using intersect()
dft <- data.frame(t(df2), check.names = FALSE) # transpose df2
mapply("/", df[intersect(names(df), names(dft))],
dft[intersect(names(df), names(dft))]) # perfrom division
# 2015-08-24 2015-08-25
# [1,] 0.10292332 0.8733696
# [2,] 0.11688008 1.0000000
# [3,] 0.07472895 0.6478159
# [4,] 0.08404753 0.7457328
# [5,] 0.11173531 0.9453197
# [6,] 0.15671892 1.3708786
# [7,] 0.00000000 0.0000000
# [8,] 0.06176727 0.5594089
# [9,] 0.14262612 1.2891116
#[10,] 0.14857222 1.3059176
Data used:
df <- structure(list(Label = structure(c(2L, 4L, 6L, 5L, 7L, 8L, 3L,
10L, 9L, 1L), .Label = c("AAA", "AxG", "BEPU", "CHaF", "fLS",
"HHHH", "KSU", "MqCC", "QQQ", "SSEX"), class = "factor"), `2015-08-24` = c(179238.8,
203544.2, 130138.9, 146367, 194584.7, 272922.7, 0, 107566.4,
248380.4, 258735.4), `2015-08-25` = c(174160.3, 199411.9, 129182.2,
148708, 188508, 273369.5, 0, 111552.8, 257064.2, 260415.5)), .Names = c("Label",
"2015-08-24", "2015-08-25"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"))
df2 <- structure(list(totals = c(1741479, 199411.9)), .Names = "totals", row.names = c("2015-08-24",
"2015-08-25"), class = "data.frame")
Upvotes: 4
Reputation: 83215
Altough @PierreLafortune has the shortest solution, using sweep
requires that the order of the column names of df
and the rownames of df2
has to be the same. When using for
(and I suppose also with the intersect
method of @mtoto) the column names of df
and the rownames of df2
don't have to be in the same order to get the desired results. Another advantage is that df
is updated immediately:
for(i in names(df)[-1])
df[, i] <- df[, i] / df2$totals[match(i, row.names(df2))]
which gives:
> df
Label 2015-08-24 2015-08-25
1 AxG 0.10292332 0.8733696
2 CHaF 0.11688008 1.0000000
3 HHHH 0.07472895 0.6478159
4 fLS 0.08404753 0.7457328
5 KSU 0.11173531 0.9453197
6 MqCC 0.15671892 1.3708786
7 BEPU 0.00000000 0.0000000
8 SSEX 0.06176727 0.5594089
9 QQQ 0.14262612 1.2891116
10 AAA 0.14857222 1.3059176
Upvotes: 2
Reputation: 28441
We can also use sweep
. It is a useful function for any case where a vector of values must be applied over an array. We choose margin 2
to sweep over the columns (we could have chosen 1
for rows):
sweep(df[-1], 2, df2$totals, '/')
# 2015-08-24 2015-08-25
# [1,] 0.10292332 0.8733696
# [2,] 0.11688008 1.0000000
# [3,] 0.07472895 0.6478159
# [4,] 0.08404753 0.7457328
# [5,] 0.11173531 0.9453197
# [6,] 0.15671892 1.3708786
# [7,] 0.00000000 0.0000000
# [8,] 0.06176727 0.5594089
# [9,] 0.14262612 1.2891116
# [10,] 0.14857222 1.3059176
Upvotes: 5