Adamm

Reputation: 2306

Number of occurrences of each column (summary) in custom formatted way

I need help summarising my data.

Example:

Input  =("  v1  v2  v3
1   a   a   b
2   a   a   a
3   b   s   <NA>
4   b   f   s
5   c   s   b
6   c   p   b
7   d   b   c
8   d   g   g
")
df = as.data.frame(read.table(textConnection(Input), header = T, row.names = 1))

I'd like to achieve something like this:

    v1  v2  v3
1   a   a   b
2   a   a   a
3   b   s   <NA>
4   b   f   s
5   c   s   b
6   c   p   b
7   d   b   c
8   d   g   g
Summary 2:a;2:b;2:c;2:d 2:a;2:s;1:f;1:p;1:b;1:g 3:b;1:a;1:NA;1:s;1:c;1:g

That is, I want the summary from table() appended at the end of each column, in this or a similar format. Unfortunately, when I call table() inside sapply() over several columns, I get an awkward list of tables that is hard to convert to anything else, so I tried:

p <- as.data.frame(sapply(1:ncol(df), function(x) plyr::count(df[, x])))

But that also gives an inconvenient format. Could you give me a hint on how to do this in an easy and elegant way?

Upvotes: 0

Views: 47

Answers (2)

Sotos

Reputation: 51592

I'm not sure exactly what format you want, but here is a dplyr/tidyr idea:

library(dplyr)
library(tidyr)

df %>% 
 pivot_longer(everything()) %>% 
 count(name, value) %>% 
 group_by(name) %>% 
 summarise(res = paste0(value, ':', n, collapse = ' ')) %>% 
 ungroup() %>% 
 pivot_wider(names_from = name, values_from = res) %>% 
 bind_rows(df, .)

#`summarise()` ungrouping output (override with `.groups` argument)
#                  v1                      v2                         v3
#1                  a                       a                          b
#2                  a                       a                          a
#3                  b                       s                       <NA>
#4                  b                       f                          s
#5                  c                       s                          b
#6                  c                       p                          b
#7                  d                       b                          c
#8                  d                       g                          g
#9    a:2 b:2 c:2 d:2 a:2 b:1 f:1 g:1 p:1 s:2 <NA>:1 a:1 b:3 c:1 g:1 s:1

Upvotes: 1

s_baldur

Reputation: 33508

paste(
  sapply(
    df,
    function(x) {
      y <- table(x)
      paste(paste(names(y), y, sep = ":"), collapse=";")
    }
  ),
  collapse = " "
)
# [1] "a:2;b:2;c:2;d:2 a:2;b:1;f:1;g:1;p:1;s:2 <NA>:1;a:1;b:3;c:1;g:1;s:1"
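
If you would rather have the counts as an extra "Summary" row in the question's count:value order, the same table() idea can be extended. This is only a minimal sketch under a few assumptions: summarise_column() is a hypothetical helper name, not a function from any package; I read "<NA>" as a real missing value via na.strings, which the original read.table() call does not do; and the counts come out in table()'s sorted order rather than the order shown in the question.

```r
# Example data from the question. Assumption: "<NA>" is treated as a
# real missing value here, which the original read.table() call does not do.
Input <- "  v1  v2  v3
1   a   a   b
2   a   a   a
3   b   s   <NA>
4   b   f   s
5   c   s   b
6   c   p   b
7   d   b   c
8   d   g   g
"
df <- read.table(text = Input, header = TRUE, row.names = 1,
                 na.strings = c("NA", "<NA>"), stringsAsFactors = FALSE)

# Hypothetical helper: collapse one column's table() into "count:value"
# pairs separated by ";". useNA = "ifany" keeps missing values in the count.
summarise_column <- function(x) {
  counts <- table(x, useNA = "ifany")
  paste(counts, names(counts), sep = ":", collapse = ";")
}

# One summary string per column, returned as a named character vector.
summary_row <- vapply(df, summarise_column, character(1))

# Append the summaries as a new row named "Summary".
out <- rbind(df, Summary = summary_row)
print(out)
```

The columns have to be character (hence stringsAsFactors = FALSE) for rbind() to accept the summary strings; with factor columns the new values would not match any existing level.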

Upvotes: 1
