Creating Groups based on Column Position

Question

Good afternoon!

I think this is a pretty straightforward question, but I think I am missing a couple of steps. I would like to create groups based on column position.

I am working with a dataframe / tibble that is 33 rows long and 66 columns wide. However, every sequence of 6 columns should really be separated into its own sub-dataframe / tibble.

The number of columns per group is arbitrary and depends on the dataframe. Below is an attempt with mtcars, where I am trying to group every 2 columns into its own sub-dataframe.

mtcars %>% 
  tibble() %>% 
  group_by(across(seq(1,2, length.out = 11))) %>% 
  nest()

However, that method generates errors. Something similar applies when working just within nest() as well.

Using mtcars, would like to create groups using a sequence for every 3 columns, or some other number.

Would ultimately like the mtcars dataframe to be...

Columns 1:3 to be group 1,
Columns 4:6 to be group 2,
Columns 7:9 to be group 3, etc... while retaining the information for the rows in each column.

Also considered something with pivot_longer...

mtcars %>% 
  tibble() %>% 
  pivot_longer(cols = seq(1,3, by = 1))

...but that did not generate defined groups, or continue the sequencing along all columns of the dataframe.

Hope one of you can help me with this! Would make certain tasks for work much easier.

PS - A plus if you can keep the workflow to tidyverse centric code :)

AndS. · Accepted Answer

You could try this. It splits the dataframe into a list of dataframes based on the number of columns you want (3 in your example):

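library(tidyverse)

list_of_dataframes <- mtcars %>%
  tibble() %>%
  # keep a row id so the rows can be reassembled after pivoting
  mutate(row = row_number()) %>%
  # one line per (row, column) pair, in `name` and `value`
  pivot_longer(-row) %>%
  group_by(row) %>%
  # within each row, number the columns and bin them in threes
  mutate(group = ceiling(row_number()/ 3)) %>%
  ungroup() %>%
  # one long dataframe per column group
  group_split(group) %>%
  # drop the helper column and pivot each piece back to wide
  map(
    ~select(.x, row, name, value) %>%
      pivot_wider()
    )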
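
In case the `ceiling(row_number()/ 3)` step above looks opaque, here is a small sketch of what it computes, using base R only. The names `group_size`, `group_id` and `col_names_by_group` are just illustrative and are not part of the answer's code.

```r
# Illustrative sketch: map column positions to group numbers
group_size <- 3
n_cols <- ncol(mtcars)  # mtcars has 11 columns

# Positions 1..11 divided by 3 and rounded up give the group index
group_id <- ceiling(seq_len(n_cols) / group_size)
print(group_id)
# 1 1 1 2 2 2 3 3 3 4 4

# Which column names land in which group
col_names_by_group <- split(names(mtcars), group_id)
print(col_names_by_group)
```

Because 11 is not a multiple of 3, the last group simply ends up with the two leftover columns (`gear` and `carb`).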

EDIT

Here, based on comments from the question asker, we will avoid pivoting the data. Instead, we map the groups across the dataframe.

list_of_dataframes <- map(seq(1, ncol(mtcars), by = 3),
     ~mtcars %>%
       as_tibble() %>%
       select(all_of(.x:min(c(.x+2, ncol(mtcars))))))

We can then wrap this in a function to make it a little easier to use and change group sizes and dataframes:


group_split_cols <- function(.data, ncols_per_group){
  map(seq(1, ncol(.data), by = ncols_per_group),
     ~.data %>%
       as_tibble() %>%
       select(all_of(.x:min(c(.x+ncols_per_group-1, ncol(.data))))))
}


list_of_dataframes <- group_split_cols(.data = mtcars, ncols_per_group = 3)
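
Since the question started from `nest()`, here is a possible variation that keeps the pieces in a single tibble with a list-column instead of a plain list. This is only a sketch under my own assumptions: `col_groups` and `nested` are made-up names, and the grouping uses the same `ceiling()` idea rather than anything built into `nest()`.

```r
library(tidyverse)

# Assumption: groups of 3 columns, as in the question's example
col_groups <- ceiling(seq_along(mtcars) / 3)

# One row per column group; `data` holds the matching sub-tibble
nested <- tibble(
  group = unique(col_groups),
  data = map(group, ~ as_tibble(mtcars[, col_groups == .x, drop = FALSE]))
)

nested
nested$data[[1]]  # columns mpg, cyl, disp with all 32 rows
```

Each element of `nested$data` keeps every row of the original dataframe, so it should be straightforward to `map()` further work over the groups.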
