Mec-Eng

Reputation: 199

How to count occurrences in a cross table in R?

How can I create a cross table in R (RStudio) that counts occurrences?

I have this sample input:

Technology <- c("A", "A", "B", "C", "C", "C")
Development <- c(1, 0, 1, 1, 1, 1)
Production <- c(1, 1, 0, 0, 0, 1)
Sales <- c(0, 0, 1, 1, 0, 1)
DF <- data.frame(Technology, Development, Production, Sales)

I want to know which technology is used most often in each domain.

The result should look like the table in the picture.

[image: expected cross table]

Upvotes: 3

Views: 744

Answers (3)

Dan Chaltiel

Reputation: 8523

Since you asked for a crosstable, you can also use the package crosstable for that:

library(crosstable)
library(dplyr)  # loaded here so that %>% is available
crosstable(DF, by = Technology) %>%
  as_flextable()

[image: rendered flextable output]

However, in your case, you don't care about proportions and you only need the numbers when each variable is 1, so you might want to run instead:

library(dplyr)
crosstable(DF, by=Technology, percent_pattern="{n}") %>% 
    filter(variable==1) %>% select(-variable) %>% 
    as_flextable()

[image: rendered flextable output]

More info about the package at https://danchaltiel.github.io/crosstable/.

Upvotes: 0

TarJae

Reputation: 79246

Here is a tidyverse approach, to get your desired output:

  1. We group by Technology and summarise with across().
  2. Then we build the row names with paste() and apply column_to_rownames() from tibble.
  3. Finally, we transpose the result with t().

library(dplyr)
library(tibble)
DF %>% 
  group_by(Technology) %>% 
  summarise(across(c(Development, Production, Sales), sum)) %>% 
  mutate(Technology = paste("Technology", Technology, sep = " ")) %>% 
  column_to_rownames("Technology") %>% 
  t()
#>             Technology A Technology B Technology C
#> Development            1            1            3
#> Production             2            0            1
#> Sales                  0            1            2
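
As a side note for readers, the same group-wise sums can be cross-checked without any packages. The sketch below is not part of the answer above; it assumes the sample DF from the question and swaps in base R's rowsum() for group_by() and summarise(). The names sums_by_tech and counts are made up for illustration.

```r
Technology <- c("A", "A", "B", "C", "C", "C")
Development <- c(1, 0, 1, 1, 1, 1)
Production <- c(1, 1, 0, 0, 0, 1)
Sales <- c(0, 0, 1, 1, 0, 1)
DF <- data.frame(Technology, Development, Production, Sales)

# rowsum() adds up every numeric column within each Technology group
sums_by_tech <- rowsum(DF[-1], group = DF$Technology)

# transpose so that domains are rows and technologies are columns
counts <- t(sums_by_tech)
colnames(counts) <- paste("Technology", colnames(counts))
counts
```

Because the columns only hold 0 and 1, summing them is the same as counting the rows where a domain is active.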

Upvotes: 1

Rui Barradas

Reputation: 76663

Problems like this are often data format problems, and the solution is to reshape from wide to long format first; see this question.


Here is a base R solution with reshape and cross tabulation with xtabs.

Technology <- c("A", "A", "B", "C", "C", "C")
Development <- c(1, 0, 1, 1, 1, 1)
Production <- c(1, 1, 0, 0, 0, 1)
Sales <- c(0, 0, 1, 1, 0, 1)
DF <- data.frame(Technology, Development, Production, Sales)

reshape(
  DF,
  direction = "long",
  varying = list(names(DF[-1])),
  v.names = "Active",
  times = names(DF[-1]),
  timevar = "Phase"
) |>
  (\(x) xtabs(Active ~ Phase + Technology, x))()
#>              Technology
#> Phase         A B C
#>   Development 1 1 3
#>   Production  2 0 1
#>   Sales       0 1 2

Created on 2022-04-18 by the reprex package (v2.0.1)
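
To make the wide-to-long idea more concrete, here is a minimal sketch that is not from the original answer. It assumes the same sample DF and uses base R's stack() in place of reshape(); the name long is hypothetical, while values and ind are the column names that stack() produces.

```r
Technology <- c("A", "A", "B", "C", "C", "C")
Development <- c(1, 0, 1, 1, 1, 1)
Production <- c(1, 1, 0, 0, 0, 1)
Sales <- c(0, 0, 1, 1, 0, 1)
DF <- data.frame(Technology, Development, Production, Sales)

# stack() puts the three domain columns on top of each other:
# 'values' holds the 0/1 flags, 'ind' holds the original column name
long <- data.frame(
  Technology = rep(DF$Technology, times = ncol(DF) - 1),
  stack(DF[-1])
)

# one row per (Technology, domain) observation
head(long, 3)

# sum the flags for every domain/technology combination
tab <- xtabs(values ~ ind + Technology, data = long)
tab
```

In long format each row is a single observation, which is the shape xtabs() expects in its formula interface.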


And a tidyverse solution.

suppressPackageStartupMessages({
  library(magrittr)
  library(tidyr)
})

DF %>%
  pivot_longer(-Technology) %>%
  xtabs(value ~ name + Technology, .)
#>              Technology
#> name          A B C
#>   Development 1 1 3
#>   Production  2 0 1
#>   Sales       0 1 2

Created on 2022-04-18 by the reprex package (v2.0.1)
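
The question also asks which technology is used most often in each domain. One way to read that off a count table is sketched below. It is an illustration only, assuming the sample DF from the question; counts and most_used are hypothetical names, and the table is built here with split() and colSums() so the block runs on its own.

```r
Technology <- c("A", "A", "B", "C", "C", "C")
Development <- c(1, 0, 1, 1, 1, 1)
Production <- c(1, 1, 0, 0, 0, 1)
Sales <- c(0, 0, 1, 1, 0, 1)
DF <- data.frame(Technology, Development, Production, Sales)

# domains in rows, technologies in columns
counts <- sapply(split(DF[-1], DF$Technology), colSums)

# for each domain (row), pick the column with the largest count;
# which.max() returns the first maximum, so ties go to the earlier technology
most_used <- setNames(
  colnames(counts)[apply(counts, 1, which.max)],
  rownames(counts)
)
most_used
```

If ties matter, comparing each row against its maximum (for example with counts == apply(counts, 1, max)) would show every technology that shares the top count.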

Upvotes: 3
