user17882399

Looping a pipe through columns of a tibble

I have a tibble with 20 variables. So far I've been using this pipe to find out which values appear more than once in a single column:

as_tibble(iris) %>% group_by(Petal.Length) %>% summarise(n=sum(n())) %>% filter(n>1)

I was wondering if I could write a line that loops this through all the columns and returns 20 different tibbles (or as many as I need in the future), in the same way the pipe above returns one tibble. I have tried writing my own loops but have had no success. I am quite new to R.

The iris example dataset has 5 columns so feel free to give an answer with 5 columns.

Thank you!

Upvotes: 0

Views: 259

Answers (2)

G. Grothendieck

Reputation: 269905

In base R 4.1+ (needed for the \(x) lambda syntax) we have this one-liner. For each column it applies table and then keeps only those elements whose count exceeds 1. Finally it converts what remains of the table to a data frame. Omit stack if it is ok to return a list of table objects instead of a list of data frames.

lapply(iris, \(x) stack(Filter(function(x) x > 1, table(x))))

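A variation of that is to keep only the duplicated items and then add 1, which takes slightly fewer keystrokes. duplicated marks every occurrence after the first, so the table of those items is one short of the true count for each value; adding 1 restores it. Again we can omit stack if returning a list of table objects is ok.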

lapply(iris, \(x) stack(table(x[duplicated(x)]) + 1))
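To make the steps in the one-liners easier to follow, here is a spelled-out sketch of the same idea in base R. The helper name repeated_values and the output column names value and n are assumptions chosen for illustration; they are not part of the answer above, and the helper builds the data frame directly rather than calling stack.

```r
# Hypothetical helper (name is illustrative): for one column, return a data
# frame of the values that occur more than once, with their counts.
repeated_values <- function(x) {
  counts <- table(x)                 # frequency of each distinct value
  keep <- as.vector(counts > 1)      # TRUE where a value occurs more than once
  data.frame(
    value = names(counts)[keep],     # the repeated values, as character
    n = as.vector(counts)[keep]      # how often each one occurs
  )
}

# Apply the helper to every column; lapply keeps the column names,
# so the result is a named list with one data frame per column.
res <- lapply(iris, repeated_values)

names(res)
res$Petal.Length
```

Because lapply iterates over the data frame's columns, the same call should work unchanged on a data frame with 20 columns or more.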

Upvotes: 0

ozanstats

Reputation: 2864

library(dplyr)

col_names <- colnames(iris)

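# Run the same pipe once per column name; the result is a list of tibbles.
lapply(
  col_names,
  function(col) {
    iris %>%
      group_by_at(col) %>%     # group by the column whose name is stored in col
      summarise(n = n()) %>%   # count the rows for each distinct value
      filter(n > 1)            # keep values that appear more than once
  }
)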
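A possible variation, sketched under the assumption that a reasonably recent dplyr (1.0 or later) is installed: group_by_at is superseded there in favour of across, and passing a named vector to lapply gives a named list so that each tibble can be looked up by column. The object name dup_tables is an assumption for illustration.

```r
library(dplyr)

# Name the vector by itself so that lapply returns a named list.
col_names <- setNames(names(iris), names(iris))

# dup_tables is an illustrative name; one tibble per column of iris.
dup_tables <- lapply(
  col_names,
  function(col) {
    iris %>%
      group_by(across(all_of(col))) %>%          # group by the column named in col
      summarise(n = n(), .groups = "drop") %>%   # count rows per distinct value
      filter(n > 1)                              # keep values seen more than once
  }
)

names(dup_tables)
dup_tables$Petal.Length
```

Each element can then be accessed by name, for example dup_tables$Sepal.Width or dup_tables[["Species"]].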

Upvotes: 1
