matteo

Reputation: 4873

create data frames for each ID of another data frame

I searched the existing questions but didn't find an answer, so apologies if I missed something. The question is very simple: how can I create new data frames based on the ID column of another one?

If this is the original df:

structure(list(ID = structure(c(12L, 12L, 12L, 12L, 12L, 12L, 
12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 13L, 13L, 13L, 
13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 
13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L), .Label = c("B0F", 
"B12T", "B1T", "B21T", "B22F", "B26T", "B2F", "B33F", "B3F", 
"B4F", "B7F", "P1", "P21", "P24", "P25", "P27", "P28", "P29"), class = "factor"), 
    EC = c(953L, 838L, 895L, 2170L, 2140L, 1499L, 2120L, 881L, 
    902L, 870L, 541L, 891L, 876L, 860L, 868L, 877L, 3630L, 3400L, 
    2470L, 2330L, 1810L, 2190L, 2810L, 2200L, 2440L, 1111L, 2460L, 
    2210L, 2340L, 1533L, 880L, 2475L, 2350L, 2440L, 1456L, 2320L, 
    2220L, 2990L, 2240L, 2210L, 2630L)), .Names = c("ID", "EC"
), row.names = 40:80, class = "data.frame")

How can I create two new data frames based on the ID, so that I end up with two new data frames named P1 and P21, for example? I know I can do it with subset, but with many IDs that would take a lot of time.

So I think what I'm looking for is a way to automate the subset function.

Upvotes: 1

Views: 84

Answers (2)

alexis_laz

Reputation: 13122

You can put all the different subsets you want in a list and then extract them from there:

#DF <- structure(list(ID = structure(c(12L ...

#all different "ID"s
ids <- as.character(unique(DF$ID))

#create empty list to insert all different subsets
myls <- vector("list", length(ids))

#insert the different subsets
for (i in seq_along(ids)) {
  myls[[i]] <- DF[DF$ID == ids[i], ]
}

#name each list element after its "ID"
names(myls) <- ids

You can then access the data frame you want by name:

> myls$P21
    ID   EC
56 P21 3630
57 P21 3400
...

> myls$P1
   ID   EC
40 P1  953
41 P1  838
...

This might take some time, though, if you have a very large number of "ID"s.

EDIT: Jilber's answer is much better than the for loop. Applied here, it becomes split(DF, DF$ID, drop = TRUE).
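
To illustrate what drop = TRUE changes in split(), here is a minimal sketch on a toy data frame. The values and the names with_empty and myls are only for illustration; the ID factor deliberately carries an unused level, as in the question's data.

```r
# Toy data frame (illustrative values): the ID factor has an unused level "B21T"
DF <- data.frame(
  ID = factor(c("P1", "P1", "P21"), levels = c("B21T", "P1", "P21")),
  EC = c(953L, 838L, 3630L)
)

# Without drop = TRUE, the unused level produces an empty data frame
with_empty <- split(DF, DF$ID)
names(with_empty)       # "B21T" "P1" "P21"
nrow(with_empty$B21T)   # 0

# With drop = TRUE, only the IDs that actually occur are kept
myls <- split(DF, DF$ID, drop = TRUE)
names(myls)             # "P1" "P21"
```

The result is the same kind of named list as the one built by the loop, so myls$P1 and myls$P21 work in the same way.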

Upvotes: 0

Jilber Urbina

Reputation: 61154

Assuming df is your data.frame, just do:

df$ID <- droplevels(df$ID)  #  drop unused levels from `ID`
list2env(split(df, df$ID), envir=.GlobalEnv)
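
If you would rather not create many objects in the global environment, a possible variation (a sketch only, using toy values and the hypothetical names pieces, mean_ec and subsets_env) is to keep the result of split() as a list, or to pass list2env() a dedicated environment:

```r
# Toy data frame (illustrative values) with an unused factor level "B21T"
df <- data.frame(
  ID = factor(c("P1", "P1", "P21"), levels = c("B21T", "P1", "P21")),
  EC = c(953L, 838L, 3630L)
)

df$ID <- droplevels(df$ID)   # drop unused levels from `ID`
pieces <- split(df, df$ID)   # named list: one data frame per ID

# Option 1: keep the list and work on all subsets at once
mean_ec <- sapply(pieces, function(d) mean(d$EC))

# Option 2: unpack into a separate environment instead of .GlobalEnv
subsets_env <- list2env(pieces, envir = new.env())
ls(subsets_env)                   # "P1" "P21"
get("P21", envir = subsets_env)   # the P21 subset
```

Keeping the subsets in a list tends to be easier to loop over with lapply or sapply, while separate objects are convenient for interactive work.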

Upvotes: 1
