alex
alex

Reputation: 844

Contingency table with ranks and percentages in R

I have data like the following:

ID    Flower   Season
1     tulip    Spring
3     rose     Summer
5     rose     Summer
9     tulip    Spring
2     daisy    Spring
12    violet   Summer

I want to get a table like the following, with ranks and percentages:

                             Season
                Spring                  Summer 
          Freq    %     Rank      Freq    %     Rank       Total
Flower

Tulip      2     66.7    1        0       0       3          2
Daisy      1     33.3    2        0       0       3          1
Rose       0     0       3        2       66.7    1          2
Violet     0     0       3        1       33.3    2          1

Upvotes: 0

Views: 158

Answers (1)

lovalery
lovalery

Reputation: 4662

Please find below one possible solution using dplyr and tidyr libraries

Reprex

  • Your data
df <- read.table(text="ID    Flower   Season
1     tulip    Spring
3     rose     Summer
5     rose     Summer
9     tulip    Spring
2     daisy    Spring
12    violet   Summer", header = TRUE)
  • Code
library(dplyr)
library(tidyr)

df <- df %>% 
  mutate(dummy = 1) %>% 
  xtabs(dummy ~ Flower + Season, .) %>% 
  as.data.frame() %>% 
  pivot_wider(., names_from = Season, values_from = Freq) %>% 
  rename(Spring_Freq = Spring, Summer_Freq = Summer) %>% 
  mutate(Spring_Perc = round(Spring_Freq/sum(Spring_Freq)*100,1),
         Summer_Perc = round(Summer_Freq/sum(Summer_Freq)*100,1),
         Spring_Rank = dense_rank(desc(Spring_Freq)),
         Summer_Rank = dense_rank(desc(Summer_Freq)),
         Total = Spring_Freq + Summer_Freq) %>% 
  arrange(., desc(Spring_Freq)) %>% 
  select(order(colnames(.)))
  • Output
df
#> A tibble: 4 x 8
#>  Flower Spring_Freq Spring_Perc Spring_Rank Summer_Freq Summer_Perc Summer_Rank Total
#>  <fct>        <dbl>       <dbl>       <int>       <dbl>       <dbl>       <int> <dbl>
#> 1 tulip            2        66.7           1           0         0             3     2
#> 2 daisy            1        33.3           2           0         0             3     1
#> 3 rose             0         0             3           2        66.7           1     2
#> 4 violet           0         0             3           1        33.3           2     1
>

And, please find one possible solution to format the table using the flextable library

library(flextable)

flextable(df) %>% 
  set_header_labels(., Spring_Freq = "Freq", Spring_Perc = "%", Spring_Rank = "Rank", 
                    Summer_Freq = "Freq", Summer_Perc = "%", Summer_Rank ="Rank") %>% 
  add_header_row(., values = c("","Spring", "Summer",""), colwidths = c(1, 2, 3, 2))

enter image description here

Created on 2022-01-04 by the reprex package (v2.0.1)

Upvotes: 2

Related Questions