Reputation: 199
How can I create a cross table in R (RStudio), where I count occurrences.
I have this sample input:
Technology <- c("A", "A", "B", "C", "C", "C")
Development <- c(1, 0, 1, 1, 1, 1)
Production <- c(1, 1, 0, 0, 0, 1)
Sales <- c(0, 0, 1, 1, 0, 1)
DF <- data.frame(Technology, Development, Production, Sales)
I want to know in which domain which technology is used most often.
The result should look like in the picture.
Upvotes: 3
Views: 744
Reputation: 8523
Since you asked for a crosstable, you can also use the package crosstable
for that:
library(crosstable)
crosstable(DF, by=Technology)%>%
as_flextable()
However, in your case, you don't care about proportions and you only need the numbers when each variable is 1
, so you might want to run instead:
library(dplyr)
crosstable(DF, by=Technology, percent_pattern="{n}") %>%
filter(variable==1) %>% select(-variable) %>%
as_flextable()
More info about the package at https://danchaltiel.github.io/crosstable/.
Upvotes: 0
Reputation: 79246
Here is a tidyverse
approach, to get your desired output:
across
paste
and apply column_to_rownames
from tibble
t()
library(dplyr)
library(tibble)
DF %>%
group_by(Technology) %>%
summarise(across(c(Development, Production, Sales), sum)) %>%
mutate(Technology = paste("Technology", Technology, sep = " ")) %>%
column_to_rownames("Technology") %>%
t()
Technology A Technology B Technology C
Development 1 1 3
Production 2 0 1
Sales 0 1 2
Upvotes: 1
Reputation: 76663
These problems are many times a data format problem and the solution is to reshape from wide to long format first, see this question.
Here is a base R solution with reshape
and cross tabulation with xtabs
.
Technology <- c("A", "A", "B", "C", "C", "C")
Development <- c(1, 0, 1, 1, 1, 1)
Production <- c(1, 1, 0, 0, 0, 1)
Sales <- c(0, 0, 1, 1, 0, 1)
DF <- data.frame(Technology, Development, Production, Sales)
reshape(
DF,
direction = "long",
varying = list(names(DF[-1])),
v.names = "Active",
times = names(DF[-1]),
timevar = "Phase"
) |>
(\(x) xtabs(Active ~ Phase + Technology, x))()
#> Technology
#> Phase A B C
#> Development 1 1 3
#> Production 2 0 1
#> Sales 0 1 2
Created on 2022-04-18 by the reprex package (v2.0.1)
And a tidyverse
solution.
suppressPackageStartupMessages({
library(magrittr)
library(tidyr)
})
DF %>%
pivot_longer(-Technology) %>%
xtabs(value ~ name + Technology, .)
#> Technology
#> name A B C
#> Development 1 1 3
#> Production 2 0 1
#> Sales 0 1 2
Created on 2022-04-18 by the reprex package (v2.0.1)
Upvotes: 3