Ed_Gravy

Reputation: 2033

Split a dataframe into smaller dataframes in R using dplyr

I have a dataframe with 118 observations and three columns. I would like to split it into two dataframes with 59 observations each. I tried two different approaches, but neither returned what I wanted.

How can I do this using dplyr in R?

Sample dataframe:

Code Count_2020 Count_2021
A    1          2
B    2          4
C    3          6
D    4          8
E    5          10
F    6          12

Desired output:

DF1

Code Count_2020 Count_2021
A    1          2
B    2          4
C    3          6

DF2

Code Count_2020 Count_2021
D    4          8
E    5          10
F    6          12

1st Approach

Based on this answer

library(tidyverse)
df= df %>% group_split(Code)

This returns a list of 118 elements, one per observation, each with 3 columns.

2nd Approach

Based on this answer

library(tidyverse)
df= df %>% sample_n(size = 59) %>% 
  split(f = as.factor(.$Code))

This returns a list of 59 elements rather than two dataframes.

Upvotes: 5

Views: 4144

Answers (3)

Ronak Shah

Reputation: 389325

Here is an option using base R's split(), where n is the number of rows per chunk:

n <- 3
split(df, ceiling(seq(nrow(df))/n))

#$`1`
#  Code Count_2020 Count_2021
#1    A          1          2
#2    B          2          4
#3    C          3          6

#$`2`
#  Code Count_2020 Count_2021
#4    D          4          8
#5    E          5         10
#6    F          6         12
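
For the 118-row case in the question, the same idea should work with n set to 59. Below is a minimal sketch; the data frame is a hypothetical stand-in built only for illustration (the real Code values and counts are not shown in the question), and the name chunks is likewise made up.

```r
# Hypothetical stand-in for the 118-row data frame from the question
df <- data.frame(
  Code       = paste0("C", seq_len(118)),
  Count_2020 = seq_len(118),
  Count_2021 = 2 * seq_len(118)
)

n <- 59  # rows per chunk

# Rows 1-59 get group 1, rows 60-118 get group 2
chunks <- split(df, ceiling(seq_len(nrow(df)) / n))

length(chunks)        # number of chunks
sapply(chunks, nrow)  # rows in each chunk

DF1 <- chunks[[1]]
DF2 <- chunks[[2]]
```

If the row count is not a multiple of n, the last chunk simply ends up shorter.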

Upvotes: 1

TarJae

Reputation: 79286

We could use slice() to select rows by position:

library(dplyr)

DF1 <- DF %>% 
    slice(1:3)

DF2 <- DF %>% 
    slice(4:6)

Output:

> DF1
  Code Count_2020 Count_2021
1    A          1          2
2    B          2          4
3    C          3          6
> DF2
  Code Count_2020 Count_2021
1    D          4          8
2    E          5         10
3    F          6         12
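
To avoid hard-coding the row ranges, the split point could be computed from the number of rows. This is a sketch under the assumption that the first half of the rows should go to DF1 and the remainder to DF2; the variable half is a name introduced here for illustration.

```r
library(dplyr)

# Sample data from the question
DF <- data.frame(
  Code       = LETTERS[1:6],
  Count_2020 = 1:6,
  Count_2021 = 2 * (1:6)
)

half <- nrow(DF) %/% 2  # 59 for the 118-row data frame

DF1 <- DF %>% slice(seq_len(half))   # keep the first `half` rows
DF2 <- DF %>% slice(-seq_len(half))  # drop the first `half` rows
```

With an odd number of rows, DF2 would receive the extra row.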

Upvotes: 3

akrun

Reputation: 887951

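We may use gl to create the grouping column in group_split. Here gl(n(), 59, n()) generates a factor in which each level repeats 59 times, cut to n() values, so rows 1-59 fall into group 1 and rows 60-118 into group 2: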

library(dplyr)
df1 %>%
      group_split(grp = as.integer(gl(n(), 59, n())), .keep = FALSE)
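
The same call can be tried on the six-row sample from the question by using a chunk size of 3 instead of 59. This is only a sketch: chunk_size and parts are names introduced here for illustration.

```r
library(dplyr)

# Sample data from the question
df1 <- data.frame(
  Code       = LETTERS[1:6],
  Count_2020 = 1:6,
  Count_2021 = 2 * (1:6)
)

chunk_size <- 3  # would be 59 for the 118-row data frame

# gl() yields 1,1,1,2,2,2 here; .keep = FALSE drops the helper column grp
parts <- df1 %>%
  group_split(grp = as.integer(gl(n(), chunk_size, n())), .keep = FALSE)

DF1 <- parts[[1]]
DF2 <- parts[[2]]
```

group_split() returns a list of tibbles, so the pieces are extracted with [[ ]].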

Upvotes: 8
