Reputation: 1088
I have the following example data frame x.
x <- data.frame(c1=c(1,2,3,2,1,3),
c2=c(4,5,6,2,3,4),
c3=c(7,8,9,7,1,6),
c4=c(4,0,9,1,5,0),
c5=c(3,8,0,7,3,6),
c6=c(2,8,5,0,5,7),
row.names = c("r1","r2","r3","r4","r5","r6"))
I need to apply a function f to each column, where cMin and cMax are the vectors of column minima and column maxima.
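library(matrixStats)  # assumed source of colMaxs() and colMins()
mat <- as.matrix(x)   # mat is the matrix version of x
cMax <- colMaxs(mat)
cMin <- colMins(mat)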
I am trying to use apply(mat, 2, f) with the function below, but I get warnings and the result is not correct either.
f <- function(x) (x - cMin[])/(cMax - cMin)
Warning messages:
1: In x - cMin[] :
longer object length is not a multiple of shorter object length
2: In (x - cMin[])/(cMax - cMin) :
longer object length is not a multiple of shorter object length
3: In x - cMin[] :
longer object length is not a multiple of shorter object length
4: In (x - cMin[])/(cMax - cMin) :
longer object length is not a multiple of shorter object length
Can someone explain how to use apply with a function that depends on a vector (cMin or cMax)?
Upvotes: 0
Views: 83
Reputation: 887881
We can replicate cMin and cMax by indexing them with col(mat), so that every cell of mat is paired with the value for its own column, and then do the calculation:
(mat - cMin[col(mat)])/(cMax[col(mat)] - cMin[col(mat)])
# c1 c2 c3 c4 c5 c6
#r1 0.0 0.50 0.750 0.4444444 0.375 0.250
#r2 0.5 0.75 0.875 0.0000000 1.000 1.000
#r3 1.0 1.00 1.000 1.0000000 0.000 0.625
#r4 0.5 0.00 0.750 0.1111111 0.875 0.000
#r5 0.0 0.25 0.000 0.5555556 0.375 0.625
#r6 1.0 0.50 0.625 0.0000000 0.750 0.875
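To see why indexing with col(mat) lines things up, here is a minimal sketch on a small made-up matrix (m and mins are illustrative names and values, not from the question). col(m) gives the column number of every cell, so mins[col(m)] has one entry per cell in the same column-major order as m.

```r
# Small illustrative matrix (hypothetical values, not the question's data)
m <- matrix(1:6, nrow = 2, dimnames = list(c("r1", "r2"), c("a", "b", "c")))

# col(m) holds the column index of each cell: every row reads 1 2 3
col(m)

mins <- c(a = 1, b = 3, c = 5)   # column minima of m
expanded <- mins[col(m)]         # values 1 1 3 3 5 5, one per cell of m

# Same length as m, so the subtraction is cell by cell with no recycling
shifted <- m - expanded
shifted
```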
Upvotes: 1
Reputation: 6222
As I understand from the other solutions, the aim is to scale each column linearly to the range 0 to 1, with the column minimum mapping to 0 and the maximum to 1.
This can be done in one line, without having to calculate cMin and cMax beforehand:
apply(x, 2,
function(each_col) (each_col - min(each_col))/diff(range(each_col)))
# c1 c2 c3 c4 c5 c6
# r1 0.0 0.50 0.750 0.4444444 0.375 0.250
# r2 0.5 0.75 0.875 0.0000000 1.000 1.000
# r3 1.0 1.00 1.000 1.0000000 0.000 0.625
# r4 0.5 0.00 0.750 0.1111111 0.875 0.000
# r5 0.0 0.25 0.000 0.5555556 0.375 0.625
# r6 1.0 0.50 0.625 0.0000000 0.750 0.875
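If the precomputed cMin and cMax do need to be used, one possible sketch is to walk over the columns and the two vectors in parallel with mapply, so each call receives one column together with its own minimum and maximum. Inside apply(mat, 2, f) the function only sees a single column, which is why the whole cMin vector gets recycled against it there. The example below uses only the first two columns of the question's data to keep it short; the argument names v, lo and hi are just illustrative.

```r
# First two columns of the question's data
x <- data.frame(c1 = c(1, 2, 3, 2, 1, 3),
                c2 = c(4, 5, 6, 2, 3, 4))

cMin <- sapply(x, min)
cMax <- sapply(x, max)

# Each call gets one column (v) plus the matching scalars lo and hi
scaled <- mapply(function(v, lo, hi) (v - lo) / (hi - lo), x, cMin, cMax)
scaled
```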
Upvotes: 1
Reputation: 1187
library(magrittr)
x <- data.frame(c1=c(1,2,3,2,1,3),
c2=c(4,5,6,2,3,4),
c3=c(7,8,9,7,1,6),
c4=c(4,0,9,1,5,0),
c5=c(3,8,0,7,3,6),
c6=c(2,8,5,0,5,7),
row.names = c("r1","r2","r3","r4","r5","r6"))
cMin <- apply(x, MARGIN = 2, FUN = min)
cMax <- apply(x, MARGIN = 2, FUN = max)
sweep(x, MARGIN = 2, STATS = cMin, FUN = "-") %>%
sweep(., MARGIN = 2, STATS = (cMax - cMin), FUN = "/")
c1 c2 c3 c4 c5 c6
r1 0.0 0.50 0.750 0.4444444 0.375 0.250
r2 0.5 0.75 0.875 0.0000000 1.000 1.000
r3 1.0 1.00 1.000 1.0000000 0.000 0.625
r4 0.5 0.00 0.750 0.1111111 0.875 0.000
r5 0.0 0.25 0.000 0.5555556 0.375 0.625
r6 1.0 0.50 0.625 0.0000000 0.750 0.875
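As a possible shortcut, base R's scale() performs the same two column-wise steps (subtract center, then divide by scale) in one call. This is a sketch on the first two columns of the data above, not a claim that it is preferable; note that scale() returns a matrix and attaches the centering and scaling vectors as attributes.

```r
# First two columns of the data above
x <- data.frame(c1 = c(1, 2, 3, 2, 1, 3),
                c2 = c(4, 5, 6, 2, 3, 4))

cMin <- sapply(x, min)
cMax <- sapply(x, max)

# center is subtracted from each column, then each column is divided by scale
scaled <- scale(x, center = cMin, scale = cMax - cMin)
scaled
```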
Upvotes: 1
Reputation: 215117
When a vector is subtracted from a matrix, the vector is recycled down the columns, because a matrix is stored in column-major order. So you can transpose the matrix, do the calculations with cMin and cMax, and then transpose the result back:
t((t(mat) - cMin)/(cMax - cMin))
# c1 c2 c3 c4 c5 c6
#r1 0.0 0.50 0.750 0.4444444 0.375 0.250
#r2 0.5 0.75 0.875 0.0000000 1.000 1.000
#r3 1.0 1.00 1.000 1.0000000 0.000 0.625
#r4 0.5 0.00 0.750 0.1111111 0.875 0.000
#r5 0.0 0.25 0.000 0.5555556 0.375 0.625
#r6 1.0 0.50 0.625 0.0000000 0.750 0.875
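A small sketch of the recycling behaviour, using a made-up 2 x 3 matrix (m and v are illustrative names, not from the question). Subtracting v directly pairs its elements with the cells in column-major order, while the double transpose pairs v[j] with column j.

```r
m <- matrix(1:6, nrow = 2)   # columns are (1,2), (3,4), (5,6)
v <- c(10, 20, 30)           # one value per column of m

# Direct subtraction: v is recycled as 10, 20, 30, 10, 20, 30 down the columns,
# so cells in the same column get different values subtracted
wrong <- m - v

# After transposing, each column of t(m) has length 3, so v lines up with it
right <- t(t(m) - v)

wrong
right
```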
Upvotes: 2