Reputation: 63
I have a matrix of 1500 rows and 20 columns and would like to calculate the Coefficient of Divergence between columns (ie. sites). Each row is my hourly observation per site.
Calculation for COD between two sites is as stated in https://www.scribd.com/document/364585909/Coefficient-of-Divergence-COD
I have created a simpler dataset to work with:
x1 <- c(10, 20, 30, 40, 50)
x2 <- c(11, 21, 31, 41, 51)
x3 <- c(12, 22, 32, 42, 52)
x4 <- c(13, 23, 33, 43, 53)
x5 <- c(14, 24, 34, 44, 54)
x6 <- c(15, 25, 35, 44, 55)
xmat <- cbind(x1,x2,x3,x4,x5,x6)
site1 <- xmat
for (i in 2:ncol(site1))
site1[,i] = (xmat[,i]-xmat[,1])/(xmat[,i]+xmat[,1])
site1 <- site1[,2:6]
site1 <- sapply(site1, function(x) x^2)
COD-site1 <- data.frame(sqrt(colMeans(site1, na.rm=TRUE)))
site2 <- xmat
for (i in 3:ncol(site2))
site2[,i] = (xmat[,i]-xmat[,2])/(xmat[,i]+xmat[,2])
site2 <- site2[,3:6]
site2 <- sapply(site2, function(x) x^2)
COD-site2 <- data.frame(sqrt(colMeans(site2, na.rm=TRUE)))
etc....
I'm having a problem on how to loop through columns and create a new data frame for CODs. In my simple dataset, it's over 6 columns but in my actual dataset, I have 20 columns and don't really want to copy and paste 20 times. I'm a novice in writing function and loop for R and would appreciate a lending hand. Many thanks
Upvotes: 1
Views: 435
Reputation: 107652
With matrix objects, elemetwise operations of same sizes can be handled without loops such as for
and sapply
where you can run arithmetic directly on whole objects in vectorized processes:
Site 1
site1 <- xmat
site1[,2:ncol(site1)] <- (xmat[,2:ncol(site1)]-xmat[,1])/(xmat[,2:ncol(site1)]+xmat[,1])
site1 <- site1[,2:6]^2
COD_site1 <- data.frame(result=sqrt(colMeans(site1, na.rm=TRUE)))
COD_site1
Site 2
site2 <- xmat
site2[,3:ncol(site2)] <- (xmat[,3:ncol(site2)]-xmat[,2])/(xmat[,3:ncol(site2)]+xmat[,2])
site2 <- site2[,3:6]^2
COD_site2 <- data.frame(result=sqrt(colMeans(site2, na.rm=TRUE)))
COD_site2
To iterate across, generalize your processing and run in a loop like lapply
to build a list of return values. Have the loop iterate across the number of columns of original matrix minus first item (i.e., 2, 3, 4, 5, 6):
result_list <- lapply(seq(ncol(xmat))[-1], function(i) {
tryCatch({
site <- xmat
site[, i:ncol(site)] <- (xmat[,i:ncol(site)]-xmat[,i-1])/(xmat[,i:ncol(site)]+xmat[,i-1])
site <- as.matrix(site[,i:ncol(xmat)]^2)
COD_site <- data.frame(result=sqrt(colMeans(site, na.rm=TRUE)))
# RETURN LIST OF BOTH ITEMS
return(list(site, COD_site))
}, error = function(e) list(NA, NA)
)
})
Even name list elements according to site:
names(df_list) <- paste0("COD_site", seq(ncol(xmat)-1))
Even extract lists into their own types:
mat_lists <- lapply(result_list, `[[`, 1)
df_lists <- lapply(result_list, `[[`, 2)
mat_lists
mat_lists
# $COD_site1
# x2 x3 x4 x5 x6
# [1,] 0.0022675737 0.0082644628 0.0170132325 0.027777778 0.040000000
# [2,] 0.0005948840 0.0022675737 0.0048674959 0.008264463 0.012345679
# [3,] 0.0002687450 0.0010405827 0.0022675737 0.003906250 0.005917160
# [4,] 0.0001524158 0.0005948840 0.0013064305 0.002267574 0.002267574
# [5,] 0.0000980296 0.0003844675 0.0008483363 0.001479290 0.002267574
# $COD_site2
# x3 x4 x5 x6
# [1,] 8.218953e-05 5.787037e-04 1.728000e-03 3.641329e-03
# [2,] 1.257751e-05 9.391435e-05 2.962963e-04 6.575162e-04
# [3,] 3.999248e-06 3.051758e-05 9.831589e-05 2.226118e-04
# [4,] 1.748903e-06 1.349746e-05 4.396499e-05 4.396499e-05
# [5,] 9.151417e-07 7.111971e-06 2.332362e-05 5.373563e-05
# $COD_site3
# x4 x5 x6
# [1,] 2.560000e-06 3.501278e-05 1.524158e-04
# [2,] 2.438653e-07 3.573458e-06 1.659945e-05
# [3,] 5.602045e-08 8.432265e-07 4.019627e-06
# [4,] 1.915686e-08 2.925002e-07 2.925002e-07
# [5,] 8.227025e-09 1.267350e-07 6.179451e-07
# $COD_site4
# x5 x6
# [1,] 6.969172e-08 1.859344e-06
# [2,] 4.360243e-09 1.255867e-07
# [3,] 7.406721e-10 2.200926e-08
# [4,] 2.006336e-10 2.006336e-10
# [5,] 7.129862e-11 2.177866e-09
# $COD_site5
# [,1]
# [1,] 1.681171e-09
# [2,] 7.224762e-11
# [3,] 9.266281e-12
# [4,] 0.000000e+00
# [5,] 5.962673e-13
df_lists
df_lists
# $COD_site1
# result
# x2 0.02600634
# x3 0.05010383
# x4 0.07253009
# x5 0.09348300
# x6 0.11206961
# $COD_site2
# result
# x3 0.004504006
# x4 0.012031168
# x5 0.020927975
# x6 0.030394597
# $COD_site3
# result
# x4 0.0007599039
# x5 0.0028230728
# x6 0.0058982253
# $COD_site4
# result
# x5 0.0001225272
# x6 0.0006339273
# $COD_site5
# result
# 1 1.877915e-05
Upvotes: 1