user3689870
user3689870

Reputation: 37

Creating multiple dataframes simultaneously based on different values in one column

I want to create multiple dataframes based on values in a column.

sample data

df
Index Product ID Amount
200   Prod1   01 100
201   Prod1   01 150
202   Prod1   01 123
203   Prod1   01 123
204   Prod1   02 110
205   Prod1   02 175
206   Prod1   02 190
207   Prod2   03 120
208   Prod2   03 135
209   Prod2   03 150

I would like to add a column as Base for each ID. The value of Base is the first amount value in each ID.

>df1
Index Product ID Amount Base
200   Prod1   01 100    100
201   Prod1   01 150    100
202   Prod1   01 123    100
203   Prod1   01 123    100
204   Prod1   02 110    110
205   Prod1   02 175    110
206   Prod1   02 190    110
207   Prod2   03 120    120
208   Prod2   03 135    120
209   Prod2   03 150    120

I am thinking of subset the df by ID first. Just wondering if any way to do this?

Upvotes: 0

Views: 167

Answers (5)

David Arenburg
David Arenburg

Reputation: 92292

You could create a list of data frames and then embed them into any environment you want using list2env

SubData <- lapply(unique(df$ID), function(x) cbind(df[df$ID == x, ], Base = df$Amount[df$ID == x][1]))


# [[1]]
#   Index Product ID Amount Base
# 1   200   Prod1  1    100  100
# 2   201   Prod1  1    150  100
# 3   202   Prod1  1    123  100
# 4   203   Prod1  1    123  100
# 
# [[2]]
#   Index Product ID Amount Base
# 5   204   Prod1  2    110  110
# 6   205   Prod1  2    175  110
# 7   206   Prod1  2    190  110
# 
# [[3]]
#    Index Product ID Amount Base
# 8    207   Prod2  3    120  120
# 9    208   Prod2  3    135  120
# 10   209   Prod2  3    150  120

Now give your data frames whatever names you want and use list2env in order to create them in the environment

names(SubData) <- c("df1", "df2", "df3")
list2env(SubData, envir = .GlobalEnv)

Now you have these datasets in the global enviroment, e.g.

df1
##   Index Product ID Amount Base
## 1   200   Prod1  1    100  100
## 2   201   Prod1  1    150  100
## 3   202   Prod1  1    123  100
## 4   203   Prod1  1    123  100

Upvotes: 7

agstudy
agstudy

Reputation: 121568

Using ave:

dat$Base <- ave(dat$Amount,dat$ID,FUN=min)

# Index Product ID Amount Base
# 1    200   Prod1  1    100  100
# 2    201   Prod1  1    150  100
# 3    202   Prod1  1    123  100
# 4    203   Prod1  1    123  100
# 5    204   Prod1  2    110  110
# 6    205   Prod1  2    175  110
# 7    206   Prod1  2    190  110
# 8    207   Prod2  3    120  120
# 9    208   Prod2  3    135  120
# 10   209   Prod2  3    150  120

EDIT

In case you want the first value and the minium one:

dat$Base <- ave(dat$Amount,dat$ID,FUN=function(x)x[1])

Upvotes: 4

Arun
Arun

Reputation: 118799

Assuming your data.frame is called dat, here's a data.table solution:

require(data.table)
setDT(dat)[, Base := Amount[1L], by=ID]
#    Index Product ID Amount Base
#  1:   200   Prod1  1    100  100
#  2:   201   Prod1  1    150  100
#  3:   202   Prod1  1    123  100
#  4:   203   Prod1  1    123  100
#  5:   204   Prod1  2    110  110
#  6:   205   Prod1  2    175  110
#  7:   206   Prod1  2    190  110
#  8:   207   Prod2  3    120  120
#  9:   208   Prod2  3    135  120
# 10:   209   Prod2  3    150  120

Upvotes: 3

lukeA
lukeA

Reputation: 54237

Or using dplyr:

library(dplyr)
df1 <- df %>% 
  arrange(ID, Amount) %>%
  group_by(ID) %>% 
  mutate(Base = Amount[1])

Upvotes: 1

talat
talat

Reputation: 70266

You could use dplyr to create the Base column, but just to be clear, this does not yet create different data.frames (as indicated in your question).

require(dplyr)

df <- df %.% group_by(ID) %.% mutate(Base = first(Amount))

Upvotes: 2

Related Questions