iskandarblue

Reputation: 7526

Divide the max value in each column by the sum of values in its row

I have a matrix. For each column, I want to find the maximum and then divide it by the sum of all values in the row that contains that maximum. In other words:

max(y) / sum of values in the row where y is the max

How would I apply this formula to every column in R?

> the_matrix
Source: local data frame [20 x 10]

     type   100  100F  100I  100X   101   102 1028P   103  103D
   (fctr) (int) (int) (int) (int) (int) (int) (int) (int) (int)
1       0    NA    NA    NA    NA    NA    NA    NA    NA    NA
2      0A     2    NA    NA    NA    NA    NA    NA    NA    NA
3      0B    NA    NA    NA    NA    NA    NA    NA    NA    NA
4      0C    NA    NA    NA    NA    NA    NA    NA    NA    NA
5      0E    NA    NA    NA    NA    NA    NA    NA    NA    NA
6      0G    NA    NA    NA    NA    NA    NA    NA    NA    NA
7      0O    NA    NA    NA    NA    NA    NA    NA    NA    NA
8      0Z    NA    NA    NA    NA    NA    NA    NA    NA    NA
9       1     2    NA    NA    NA    NA    NA    NA    NA    NA
10     1A  3968    NA   214    26     4   289     8 56030  7484
11     1B   172    NA   107    NA    NA     2    NA   372  3829
12     1C   584    NA    19    NA    NA     1    NA 72951   363
13     1D    27    NA    NA    NA    NA    NA    NA   365    22
14     1E 27944    16    68    NA    NA    NA     1    62    12
15     1F     1    NA     1    NA    NA     1    NA   368    27
16     1G     4    NA    NA    NA    NA    NA    NA     7    NA
17     1H    65    NA     6    21     1     6     3   714    59
18     1M    NA    NA    NA    NA    NA    NA    NA     1    NA
19     1N    NA    NA    NA    NA    NA    NA    NA    NA    NA
20     1Q    NA    NA    NA    NA    NA    NA    NA    NA    NA
> dput(the_matrix)
structure(list(type = structure(1:20, .Label = c("0", "0A", "0B", 
"0C", "0E", "0G", "0O", "0Z", "1", "1A", "1B", "1C", "1D", "1E", 
"1F", "1G", "1H", "1M", "1N", "1Q", "1S", "1X", "1Z", "2", "2A", 
"2B", "2C", "2D", "2E", "2F", "2G", "2H", "2I", "2J", "2M", "2S", 
"2T", "2X", "2Z", "3", "3B", "3C", "3E", "4B", "5H", "8Z", "0H", 
"1I", "1R", "2N", "3H", "5D", "0D", "1K", "1P", "1T", "1U", "1V", 
"1W", "1Y", "2U", "3A", "4A", "5C", "7H", "9", "0F", "0T", "1J", 
"2L", "0W", "2Q", "3G"), class = "factor"), `100` = c(NA, 2L, 
NA, NA, NA, NA, NA, NA, 2L, 3968L, 172L, 584L, 27L, 27944L, 1L, 
4L, 65L, NA, NA, NA), `100F` = c(NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, 16L, NA, NA, NA, NA, NA, NA), `100I` = c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, 214L, 107L, 19L, NA, 68L, 1L, 
NA, 6L, NA, NA, NA), `100X` = c(NA, NA, NA, NA, NA, NA, NA, NA, 
NA, 26L, NA, NA, NA, NA, NA, NA, 21L, NA, NA, NA), `101` = c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, 4L, NA, NA, NA, NA, NA, NA, 1L, 
NA, NA, NA), `102` = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 289L, 
2L, 1L, NA, NA, 1L, NA, 6L, NA, NA, NA), `1028P` = c(NA, NA, 
NA, NA, NA, NA, NA, NA, NA, 8L, NA, NA, NA, 1L, NA, NA, 3L, NA, 
NA, NA), `103` = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, 56030L, 
372L, 72951L, 365L, 62L, 368L, 7L, 714L, 1L, NA, NA), `103D` = c(NA, 
NA, NA, NA, NA, NA, NA, NA, NA, 7484L, 3829L, 363L, 22L, 12L, 
27L, NA, 59L, NA, NA, NA)), .Names = c("type", "100", "100F", 
"100I", "100X", "101", "102", "1028P", "103", "103D"), class = c("tbl_df", 
"data.frame"), row.names = c(NA, -20L))

Upvotes: 2

Views: 265

Answers (1)

Gregor Thomas

Reputation: 145755

Going step-by-step:

# let's not call a data frame a matrix:
# drop the `type` column and convert the rest to a numeric matrix
real_matrix = as.matrix(the_matrix[, -1])

# max of each column, ignoring NAs
col_max = apply(real_matrix, 2, max, na.rm = TRUE)
# which row contains the max (which.max skips NAs; first row wins on ties)
col_which_max = apply(real_matrix, 2, which.max)
# row totals, ignoring NAs
row_total = rowSums(real_matrix, na.rm = TRUE)

# col max divided by row total for corresponding row
col_max / row_total[col_which_max]

Rounded to 3 decimals, this yields the following:

  100  100F  100I  100X   101   102 1028P   103  103D 
0.994 0.001 0.003 0.000 0.000 0.004 0.000 0.987 0.110 
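
If you need this more than once, the same steps can be wrapped in a function. The following is only a sketch: `max_share` and `toy` are names made up for illustration, and returning `NA` for a column that is entirely `NA` is an assumption about what you would want in that case (the sample data has no such column).

```r
# Hypothetical helper, not part of the answer above: the same steps in one function.
max_share <- function(m) {
  # Row index of each column's max; which.max() returns integer(0)
  # for an all-NA column, so map that case to NA explicitly.
  idx <- apply(m, 2, function(col) {
    i <- which.max(col)
    if (length(i) == 0) NA_integer_ else i
  })
  # Pick each column's max with (row, col) matrix indexing;
  # an NA row index yields NA in the result.
  col_max <- m[cbind(idx, seq_len(ncol(m)))]
  # Row totals, ignoring NAs
  row_total <- rowSums(m, na.rm = TRUE)
  setNames(col_max / row_total[idx], colnames(m))
}

# Small made-up matrix that can be checked by hand
toy <- matrix(c( 1, 4, NA,
                 6, 2, NA,
                NA, 3, NA),
              nrow = 3, byrow = TRUE,
              dimnames = list(NULL, c("a", "b", "c")))

max_share(toy)
# a: max 6 is in row 2, row sum 8 -> 0.75
# b: max 4 is in row 1, row sum 5 -> 0.80
# c: all NA                       -> NA
```

As a hand check on the question's data: column `100` has its max, 27944, in row 14 (type `1E`), and that row sums to 28103, so 27944 / 28103 ≈ 0.994, which matches the first value above.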

Upvotes: 3
