Vikram

Reputation: 75

Break a dataframe into smaller dataframes and save them

I need help splitting one dataframe dynamically into multiple smaller dataframes based on intervals of a column, and then saving each of them. Example:

x = data.frame(num = 1:26, let = letters, LET = LETTERS)

The dataframe x above needs to be split into smaller dataframes based on the value of num, in intervals of 5. The result would be 6 dataframes:

> 1.    0 – 5
> 2.    6 – 10
> 3.    11 – 15
> 4.    16 – 20
> 5.    21 – 25
> 6.    26 – 30

Upvotes: 2

Views: 335

Answers (4)

Shinobi_Atobe

Reputation: 1963

This seems to be a neater way. You can easily adjust the names of the output files and the number of splits. Note that it splits by row position into pieces of roughly equal size, not by the value of num.

library(tidyverse)

df <- data.frame(num = 1:26, let = letters, LET = LETTERS)

# split the data frame into 6 pieces of roughly equal row count
split_df <- split(df, ceiling(seq_len(nrow(df)) / nrow(df) * 6))

# save each piece in turn as part_1.csv, part_2.csv, ...
split_df %>%
  names() %>%
  walk(~ write_csv(split_df[[.]], paste0("part_", ., ".csv")))

Upvotes: 2

Dave2e

Reputation: 24069

You can use the split and cut functions to perform the operation:

x = data.frame(num = 1:26, let = letters, LET = LETTERS)

answer <- split(x, cut(x$num, breaks = c(0, 5, 10, 15, 20, 25, 30)))

You can then pass this list to lapply for further processing.
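As a minimal sketch of that further processing, the list can be written out one CSV file per piece. The breaks here are computed from the data rather than typed by hand, which is my assumption about what "dynamically" means in the question. The names pieces, out_dir, and paths, and the part_N.csv file naming, are hypothetical.

```r
# Same data as in the question
x <- data.frame(num = 1:26, let = letters, LET = LETTERS)

# Assumption: the upper break is the maximum of num rounded up to a multiple of 5
width <- 5
upper <- ceiling(max(x$num) / width) * width
pieces <- split(x, cut(x$num, breaks = seq(0, upper, by = width)))

# Hypothetical output location and file names; adjust as needed
out_dir <- tempdir()
paths <- file.path(out_dir, sprintf("part_%d.csv", seq_along(pieces)))

# Write each piece to its own CSV file
invisible(lapply(seq_along(pieces), function(i) {
  write.csv(pieces[[i]], paths[i], row.names = FALSE)
}))
```

Because cut uses intervals that are open on the left and closed on the right by default, num = 5 lands in the first piece and num = 26 ends up alone in the last one.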

Upvotes: 5

Parfait

Reputation: 107567

Consider also tagging each record with a group number, so that every run of 5 values of num shares one tag, and then running by, the function that splits a data frame by one or more factors:

df <- data.frame(num = 1:26, let = letters, LET = LETTERS)

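# tag each row with its group number: 1 for num 1 to 5, 2 for 6 to 10, ...
df$grp <- ceiling(df$num / 5)

# split by group and drop the helper column from each piece
df_list <- by(df, df$grp, function(sub) transform(sub, grp = NULL))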

Output

df_list

# df$grp: 1
#   num let LET
# 1   1   a   A
# 2   2   b   B
# 3   3   c   C
# 4   4   d   D
# 5   5   e   E
# ------------------------------------------------------------------------------------------- 
# df$grp: 2
#    num let LET
# 6    6   f   F
# 7    7   g   G
# 8    8   h   H
# 9    9   i   I
# 10  10   j   J
# ------------------------------------------------------------------------------------------- 
# df$grp: 3
#    num let LET
# 11  11   k   K
# 12  12   l   L
# 13  13   m   M
# 14  14   n   N
# 15  15   o   O
# ------------------------------------------------------------------------------------------- 
# df$grp: 4
#    num let LET
# 16  16   p   P
# 17  17   q   Q
# 18  18   r   R
# 19  19   s   S
# 20  20   t   T
# ------------------------------------------------------------------------------------------- 
# df$grp: 5
#    num let LET
# 21  21   u   U
# 22  22   v   V
# 23  23   w   W
# 24  24   x   X
# 25  25   y   Y
# ------------------------------------------------------------------------------------------- 
# df$grp: 6
#    num let LET
# 26  26   z   Z
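
If a plain named list is preferable to the by object, the same grouping expression can probably be handed straight to split, which also avoids adding and then dropping the helper column. This is a sketch of an alternative, not part of the approach above; the name pieces is hypothetical.

```r
df <- data.frame(num = 1:26, let = letters, LET = LETTERS)

# Group rows by the same ceiling(num / 5) tag, without storing it as a column
pieces <- split(df, ceiling(df$num / 5))

# The list is named after the group numbers "1" to "6"
names(pieces)

# Row counts per piece: 5, 5, 5, 5, 5, 1
sapply(pieces, nrow)
```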

Upvotes: 2

Henry Cyranka

Reputation: 3060

Using tidyverse

library(tidyverse)

x = data.frame(num = 1:26, let = letters, LET = LETTERS)

## Break the data frame into groups of width 5: [0,5], (5,10], ...
y <- x %>%
  mutate(group = cut_width(num, 5, boundary = 0, closed = "right"))

## Put each group into a list
groups <- unique(y$group)
list_1 <- lapply(seq_along(groups),
                 function(i) filter(y, group == groups[i]))

Upvotes: 2
