user18224546
user18224546

Reputation:

Create percentage column of the number that appears the most times

The code below generates the mode between columns df1 to df3of my database. Now what I would like to do is the following: I would like to insert one more column, which can be called a Percentage, which will show the percentage of the number that appears more often between the columns df1 to df3.

Example: For the first row there will be no percentage, since I have 3 different numbers. For the second, third and fourth rows, the percentage is going to be 100%, since I have 3 numbers 29, 3 numbers 28 and 3 numbers 27, respectively. However, for the fifth row, I have 2 numbers 25 and 1 number 26, so the percentage is going to be 66.66%, and so on.

library(dplyr)

result<-structure(list(n = c(7, 8, 9, 10, 11, 12, 13, 14, 15, 
    16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 
    32, 33, 34, 35), df1 = c(9L, 29L, 28L, 27L, 25L, 26L, 24L, 20L, 
    21L, 22L, 23L, 15L, 12L, 17L, 18L, 19L, 16L, 13L, 14L, 5L, 6L, 
    7L, 8L, 1L, 10L, 11L, 4L, 2L, 3L), df2 = c(3, 29, 28, 27, 26, 
    25, 24, 23, 22, 21, 20, 15, 12, 19, 18, 17, 16, 14, 13, 11, 10, 
    9, 8, 7, 6, 5, 4, 1, 2), df3 = c(1L, 29L, 28L, 27L, 25L, 26L, 
    24L, 20L, 21L, 22L, 23L, 15L, 12L, 17L, 18L, 19L, 16L, 13L, 14L, 
    5L, 6L, 7L, 8L, 9L, 10L, 11L, 4L, 2L, 3L)), row.names = c(NA, 
    -29L), class = "data.frame")

ModeFunc <- function(Vec) {
   tmp <- sort(table(Vec),decreasing = TRUE)
   Nms <- names(tmp)
   if(max(tmp) > 1) {
     as.numeric(Nms[1])
   } else NA
 }
result <- result |> rowwise() |> 
   mutate(Mode = ModeFunc(c_across(df1:df3)))
data.frame(result)

    n df1 df2 df3 Mode
1   7   9   3   1   NA
2   8  29  29  29   29
3   9  28  28  28   28
4  10  27  27  27   27
5  11  25  26  25   25
6  12  26  25  26   26
7  13  24  24  24   24
8  14  20  23  20   20
9  15  21  22  21   21
10 16  22  21  22   22
11 17  23  20  23   23
12 18  15  15  15   15
13 19  12  12  12   12
14 20  17  19  17   17
15 21  18  18  18   18
16 22  19  17  19   19
17 23  16  16  16   16
18 24  13  14  13   13
19 25  14  13  14   14
20 26   5  11   5    5
21 27   6  10   6    6
22 28   7   9   7    7
23 29   8   8   8    8
24 30   1   7   9   NA
25 31  10   6  10   10
26 32  11   5  11   11
27 33   4   4   4    4
28 34   2   1   2    2
29 35   3   2   3    3

Upvotes: 2

Views: 41

Answers (2)

rdelrossi
rdelrossi

Reputation: 1154

How's this?

result %>% 
  rowwise() %>% 
  mutate(Percentage = sum(c_across(df1:df3) == Mode) / 3 * 100) %>% 
  mutate(Percentage = ifelse(
    is.na(Percentage), "", sprintf("%0.2f%%", Percentage))
  )

Output:

# A tibble: 29 × 6
# Rowwise: 
       n   df1   df2   df3  Mode Percentage
   <dbl> <int> <dbl> <int> <dbl> <chr>     
 1     7     9     3     1    NA ""        
 2     8    29    29    29    29 "100.00%" 
 3     9    28    28    28    28 "100.00%" 
 4    10    27    27    27    27 "100.00%" 
 5    11    25    26    25    25 "66.67%"  
 6    12    26    25    26    26 "66.67%"  
 7    13    24    24    24    24 "100.00%" 
 8    14    20    23    20    20 "66.67%"  
 9    15    21    22    21    21 "66.67%"  
10    16    22    21    22    22 "66.67%"  
# … with 19 more rows

Upvotes: 1

zephryl
zephryl

Reputation: 17134

Test whether each column == Mode using across(), then count up TRUEs using rowSums(); then divide by the number of columns to convert to percentage:

result <- result |> 
   rowwise() |> 
   mutate(
     Mode = ModeFunc(c_across(df1:df3)),
     Percentage = rowSums(across(df1:df3, ~ .x == Mode) / 3)
   )

head(data.frame(result))

Output:

   n df1 df2 df3 Mode Percentage
1  7   9   3   1   NA         NA
2  8  29  29  29   29  1.0000000
3  9  28  28  28   28  1.0000000
4 10  27  27  27   27  1.0000000
5 11  25  26  25   25  0.6666667
6 12  26  25  26   26  0.6666667

Upvotes: 0

Related Questions