Reputation: 87
I have a table (df) whose columns are categorical variables stored as factors with different levels:
A_ID | B_ID | C_ID |
---|---|---|
valid number | valid number | invalid number |
valid number | valid number | invalid number |
invalid number | invalid number | too short |
too short | too short | too short |
valid number | too long | too short |
too long | too long | valid number |
invalid number | valid number | too long |
too long | invalid number | too long |
too short | too short | valid number |
too short | valid number | too long |
too long | invalid number | too long |
valid number | invalid number | valid number |
I want to summarize each column by counting how many times each level occurs in it. The result should look like the table below:
Variable | Count_valid | Count_Invalid | Count_Short | Count_Long |
---|---|---|---|---|
A_ID | 4 | 2 | 3 | 3 |
B_ID | 4 | 4 | 2 | 2 |
C_ID | 3 | 2 | 3 | 4 |
I have tried using an apply function:
t(sapply(names(df), function(x)
c(count_Valid=count(df[x])== "valid value",
count_Invalid=count(df[x]) == "invalid value",
count_Short=count(df[x] == "too short",
count_Long=count(df[x] == "too long")))))
Upvotes: 0
Views: 189
Reputation: 11584
Does this work:
library(dplyr)
library(tidyr)
df %>%
  # stack all columns into two columns: name (the column) and value (the level)
  pivot_longer(cols = everything()) %>%
  # count each level within each original column
  count(name, value) %>%
  # spread the levels back out, one count column per level
  pivot_wider(id_cols = name, names_from = value, values_from = n) %>%
  select(Variable = name, Count_valid = `valid number`, Count_Invalid = `invalid number`,
         Count_Short = `too short`, Count_Long = `too long`)
# A tibble: 3 x 5
Variable Count_valid Count_Invalid Count_Short Count_Long
<chr> <int> <int> <int> <int>
1 A_ID 4 2 3 3
2 B_ID 4 4 2 2
3 C_ID 3 2 3 4
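For comparison, here is a minimal base R sketch that stays closer to the sapply attempt in the question. It is only one possible approach, not a replacement for the pipeline above. The names `lvls`, `counts` and `result` are illustrative, and the data frame is rebuilt by hand from the question's table, assuming "too shot" was meant as "too short".

```r
# Rebuild the example data from the question (12 rows, 3 factor columns).
# Assumption: "too shot" in the question is a typo for "too short".
df <- data.frame(
  A_ID = c("valid number", "valid number", "invalid number", "too short",
           "valid number", "too long", "invalid number", "too long",
           "too short", "too short", "too long", "valid number"),
  B_ID = c("valid number", "valid number", "invalid number", "too short",
           "too long", "too long", "valid number", "invalid number",
           "too short", "valid number", "invalid number", "invalid number"),
  C_ID = c("invalid number", "invalid number", "too short", "too short",
           "too short", "valid number", "too long", "too long",
           "valid number", "too long", "too long", "valid number"),
  stringsAsFactors = TRUE
)

# Fix the level order so every column yields its counts in the same order,
# and so a level missing from a column is reported as 0 rather than dropped.
lvls <- c("valid number", "invalid number", "too short", "too long")

# table() counts each level; sapply() returns a levels-by-columns matrix,
# which t() flips so that each row is one variable.
counts <- t(sapply(df, function(col) table(factor(col, levels = lvls))))
colnames(counts) <- c("Count_valid", "Count_Invalid", "Count_Short", "Count_Long")

# Turn the row names into a regular "Variable" column.
result <- data.frame(Variable = rownames(counts), counts, row.names = NULL)
result
```

Passing an explicit `levels` vector to `factor()` is the design choice that matters here: without it, columns that lack a level would return tables of different lengths and `sapply()` could not simplify them into a matrix.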
Data used:
df
# A tibble: 12 x 3
A_ID B_ID C_ID
<chr> <chr> <chr>
1 valid number valid number invalid number
2 valid number valid number invalid number
3 invalid number invalid number too short
4 too short too short too short
5 valid number too long too short
6 too long too long valid number
7 invalid number valid number too long
8 too long invalid number too long
9 too short too short valid number
10 too short valid number too long
11 too long invalid number too long
12 valid number invalid number valid number
Upvotes: 1