Chema Sarmiento

Reputation: 309

Dynamic subsetting depending on values in R

I have a data frame with the following structure:

    Id    Flag    value1   value2
    123    1       10        3.4
    124    1        5        1.2
    125    0       19        8.4
    126    1        8        1.2
    127    0       17        6.5
    128    2        1       -6.5

I need to split the data frame into n subsets based only on the name of a column, where n is the number of distinct values in that column. I would expect the following:

    dataframe1
    Id    Flag    value1   value2
    123    1       10        3.4
    124    1        5        1.2
    126    1        8        1.2

    dataframe2
    Id    Flag    value1   value2
    125    0       19        8.4
    127    0       17        6.5

    dataframe3
    Id    Flag    value1   value2
    128    2        1       -6.5

Since this is going inside a function, I only know the name of the column and the distinct values it can take. I've tried:

    dataFrame$column==value

but I would need to do this for every value, and the number of values varies depending on the column.

Thanks in advance

Upvotes: 2

Views: 309

Answers (2)

Jilber Urbina

Reputation: 61164

Another approach that avoids the for loop:

List <- split(df, df$Flag)                            # split by the distinct values of Flag
names(List) <- paste0("dataframe", seq_along(List))   # name the pieces dataframe1, dataframe2, ...
list2env(List, envir = .GlobalEnv)                    # create one variable per list element

dataframe1
#   Id Flag value1 value2
#3 125    0     19    8.4
#5 127    0     17    6.5
dataframe2
#   Id Flag value1 value2
#1 123    1     10    3.4
#2 124    1      5    1.2
#4 126    1      8    1.2
dataframe3
#   Id Flag value1 value2
#6 128    2      1   -6.5
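
Note that split() orders the pieces by the sorted values of Flag, so dataframe1 above holds the Flag == 0 rows rather than the Flag == 1 rows. If positional names are confusing, one option is to put the value itself in the name. A minimal sketch, assuming the question's data is rebuilt as df; the "Flag_" prefix and the name pieces are only illustrative choices, not part of the original answer:

```r
# Rebuild the example data from the question (assumed column types)
df <- data.frame(
  Id     = c(123, 124, 125, 126, 127, 128),
  Flag   = c(1, 1, 0, 1, 0, 2),
  value1 = c(10, 5, 19, 8, 17, 1),
  value2 = c(3.4, 1.2, 8.4, 1.2, 6.5, -6.5)
)

pieces <- split(df, df$Flag)

# split() already names each piece after its value ("0", "1", "2"),
# so the prefix is simply pasted in front of those names
names(pieces) <- paste0("Flag_", names(pieces))

names(pieces)
pieces$Flag_1
```

With names like these, pieces$Flag_1 is always the subset where Flag equals 1, regardless of which other values happen to be present.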

Upvotes: 1

gagolews

Reputation: 13056

Here, split is your friend.

splitbycol <- function(df, colname) {
   split(df, df[[colname]])
}

splitbycol(df, "Flag")
## $`0`
##    Id Flag value1 value2
## 3 125    0     19    8.4
## 5 127    0     17    6.5
## 
## $`1`
##    Id Flag value1 value2
## 1 123    1     10    3.4
## 2 124    1      5    1.2
## 4 126    1      8    1.2
## 
## $`2`
##    Id Flag value1 value2
## 6 128    2      1   -6.5
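
The expected output in the question lists the groups in order of first appearance (1, 0, 2), whereas split() sorts them (0, 1, 2). If that order matters, one possible variation is to pass a factor whose levels follow the order of appearance. This is only a sketch; split_in_order is a hypothetical helper name, and df is rebuilt from the question's data:

```r
# Rebuild the example data from the question (assumed column types)
df <- data.frame(
  Id     = c(123, 124, 125, 126, 127, 128),
  Flag   = c(1, 1, 0, 1, 0, 2),
  value1 = c(10, 5, 19, 8, 17, 1),
  value2 = c(3.4, 1.2, 8.4, 1.2, 6.5, -6.5)
)

# Hypothetical helper: split by a column, keeping groups in the
# order in which their values first appear in the data
split_in_order <- function(df, colname) {
  key <- df[[colname]]
  # unique() keeps first-appearance order; split() follows the factor levels
  split(df, factor(key, levels = unique(key)))
}

res <- split_in_order(df, "Flag")
names(res)
res[[1]]
```

Here res[[1]] holds the Flag == 1 rows, matching dataframe1 in the question.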

Then, if you'd like to make each of the data frames a separate variable, you can call, e.g.:

subdf <- splitbycol(df, "Flag")
for (i in seq_along(subdf))
   assign(paste0("df", i), subdf[[i]])
df1
##    Id Flag value1 value2
## 3 125    0     19    8.4
## 5 127    0     17    6.5
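
Since the question mentions that this runs inside a function, keep in mind that assign() with its default arguments creates the variables in the function's own environment, so they are gone once the function returns. It is often simpler to keep the list and process each piece in place. A rough sketch, assuming you want the mean of one column per group; summarise_by and its valuecol parameter are hypothetical names introduced for illustration:

```r
# Rebuild the example data from the question (assumed column types)
df <- data.frame(
  Id     = c(123, 124, 125, 126, 127, 128),
  Flag   = c(1, 1, 0, 1, 0, 2),
  value1 = c(10, 5, 19, 8, 17, 1),
  value2 = c(3.4, 1.2, 8.4, 1.2, 6.5, -6.5)
)

# Hypothetical helper: split by one column, then compute the mean
# of another column within each piece
summarise_by <- function(df, colname, valuecol) {
  pieces <- split(df, df[[colname]])
  # vapply() returns one number per piece, named after the group value
  vapply(pieces, function(piece) mean(piece[[valuecol]]), numeric(1))
}

means <- summarise_by(df, "Flag", "value1")
means
```

The result is a named numeric vector with one entry per distinct value of Flag, so no per-group variables are needed.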

Upvotes: 3
