Reputation: 75
I need help splitting one data frame dynamically into multiple smaller data frames based on intervals of a column's values, and saving each of them as well. Example:
x = data.frame(num = 1:26, let = letters, LET = LETTERS)
The data frame x above needs to be split into smaller data frames based on the value in num, in intervals of 5. The result would be 6 data frames:
> 1. 0 – 5
> 2. 6 – 10
> 3. 11 – 15
> 4. 16 – 20
> 5. 21 – 25
> 6. 26 – 30
Upvotes: 2
Views: 335
Reputation: 1963
This seems to be a neater way. You can easily adjust the names of the output files and the number of splits.
library(tidyverse)
df <- data.frame(num = 1:26, let = letters, LET = LETTERS)
# split data frame into 6 pieces of roughly equal row count
split_df <- split(df, ceiling(seq_len(nrow(df)) / nrow(df) * 6))
# save each of them in turn, one CSV file per piece
split_df %>%
  names(.) %>%
  walk(~ write_csv(split_df[[.]], paste0("part_", ., ".csv")))
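As a quick check of what the grouping expression above does, the following sketch prints the range of num and the row count for each piece. It uses base R only; the names by_rows, num_ranges and piece_sizes are made up for this illustration.

```r
df <- data.frame(num = 1:26, let = letters, LET = LETTERS)

# Same grouping expression as above: six pieces by row position
by_rows <- split(df, ceiling(seq_len(nrow(df)) / nrow(df) * 6))

# Range of num covered by each piece (row 1 = min, row 2 = max)
num_ranges <- sapply(by_rows, function(piece) range(piece$num))
print(num_ranges)

# Number of rows in each piece
piece_sizes <- sapply(by_rows, nrow)
print(piece_sizes)
```

With 26 rows this gives pieces of 4, 4, 5, 4, 4 and 5 rows, so the boundaries follow row position rather than multiples of 5 in num. If fixed intervals of 5 are required, the grouping would need to be based on the num values themselves.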
Upvotes: 2
Reputation: 24069
You can use the split function and the cut function to perform the operation:
x = data.frame(num = 1:26, let = letters, LET = LETTERS)
answer <- split(x, cut(x$num, breaks = c(0, 5, 10, 15, 20, 25, 30)))
You can then pass this list to lapply for further processing.
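Since the question also asks to save the pieces, here is a minimal sketch of one possible lapply step that writes each piece to its own CSV file. The output directory (tempdir()) and the part_N.csv file names are assumptions for illustration, not part of the answer above, and seq(0, 30, by = 5) is just a shorter way to write the same breaks.

```r
x <- data.frame(num = 1:26, let = letters, LET = LETTERS)
pieces <- split(x, cut(x$num, breaks = seq(0, 30, by = 5)))

# Hypothetical output location and file names (adjust as needed)
out_dir <- tempdir()
paths <- file.path(out_dir, paste0("part_", seq_along(pieces), ".csv"))

# Write each piece to its own CSV file, without row names
invisible(lapply(seq_along(pieces), function(i) {
  write.csv(pieces[[i]], paths[i], row.names = FALSE)
}))
```

Looping over seq_along(pieces) rather than over the list itself keeps the index available for building the file name.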
Upvotes: 5
Reputation: 107567
Consider also tagging records by multiples of 5 and then running by, the function that splits a data frame by one or more factors:
df <- data.frame(num = 1:26, let = letters, LET = LETTERS)
df$grp <- ceiling(df$num / 5)
df_list <- by(df, df$grp, function(sub) transform(sub, grp=NULL))
Output
df_list
# df$grp: 1
# num let LET
# 1 1 a A
# 2 2 b B
# 3 3 c C
# 4 4 d D
# 5 5 e E
# -------------------------------------------------------------------------------------------
# df$grp: 2
# num let LET
# 6 6 f F
# 7 7 g G
# 8 8 h H
# 9 9 i I
# 10 10 j J
# -------------------------------------------------------------------------------------------
# df$grp: 3
# num let LET
# 11 11 k K
# 12 12 l L
# 13 13 m M
# 14 14 n N
# 15 15 o O
# -------------------------------------------------------------------------------------------
# df$grp: 4
# num let LET
# 16 16 p P
# 17 17 q Q
# 18 18 r R
# 19 19 s S
# 20 20 t T
# -------------------------------------------------------------------------------------------
# df$grp: 5
# num let LET
# 21 21 u U
# 22 22 v V
# 23 23 w W
# 24 24 x X
# 25 25 y Y
# -------------------------------------------------------------------------------------------
# df$grp: 6
# num let LET
# 26 26 z Z
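To see why the ceiling tag produces these groups, the sketch below compares it with the interval codes from cut on the same breaks. The variable names are made up for this illustration, and the check only covers the num values 1 to 26 from the example.

```r
num <- 1:26

# Group tag from the arithmetic approach: 1-5 -> 1, 6-10 -> 2, ...
grp_ceiling <- ceiling(num / 5)

# Group code from cut with right-closed intervals (0,5], (5,10], ...
grp_cut <- as.integer(cut(num, breaks = seq(0, 30, by = 5)))

# Do both approaches assign every value to the same group?
same_groups <- all(grp_ceiling == grp_cut)
print(same_groups)

# Number of rows per group
print(table(grp_ceiling))
```

For this data both approaches agree, giving five groups of 5 rows and a final group holding only num 26.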
Upvotes: 2
Reputation: 3060
Using tidyverse
library(tidyverse)
x = data.frame(num = 1:26, let = letters, LET = LETTERS)
## Break the data frame into groups of width 5
y <- x %>%
  mutate(group = cut_width(num, 5, boundary = 0, closed = "right"))
## Put them into a list
list_1 <- lapply(seq_along(unique(y$group)),
                 function(i) filter(y, group == unique(y$group)[i]))
Upvotes: 2