Reputation: 37
I want to create multiple dataframes based on values in a column.
sample data
df Index Product ID Amount 200 Prod1 01 100 201 Prod1 01 150 202 Prod1 01 123 203 Prod1 01 123 204 Prod1 02 110 205 Prod1 02 175 206 Prod1 02 190 207 Prod2 03 120 208 Prod2 03 135 209 Prod2 03 150
I would like to add a column as Base for each ID. The value of Base is the first amount value in each ID.
>df1
Index Product ID Amount Base
200 Prod1 01 100 100
201 Prod1 01 150 100
202 Prod1 01 123 100
203 Prod1 01 123 100
204 Prod1 02 110 110
205 Prod1 02 175 110
206 Prod1 02 190 110
207 Prod2 03 120 120
208 Prod2 03 135 120
209 Prod2 03 150 120
I am thinking of subset the df by ID first. Just wondering if any way to do this?
Upvotes: 0
Views: 167
Reputation: 92292
You could create a list of data frames and then embed them into any environment you want using list2env
SubData <- lapply(unique(df$ID), function(x) cbind(df[df$ID == x, ], Base = df$Amount[df$ID == x][1]))
# [[1]]
# Index Product ID Amount Base
# 1 200 Prod1 1 100 100
# 2 201 Prod1 1 150 100
# 3 202 Prod1 1 123 100
# 4 203 Prod1 1 123 100
#
# [[2]]
# Index Product ID Amount Base
# 5 204 Prod1 2 110 110
# 6 205 Prod1 2 175 110
# 7 206 Prod1 2 190 110
#
# [[3]]
# Index Product ID Amount Base
# 8 207 Prod2 3 120 120
# 9 208 Prod2 3 135 120
# 10 209 Prod2 3 150 120
Now give your data frames whatever names you want and use list2env
in order to create them in the environment
names(SubData) <- c("df1", "df2", "df3")
list2env(SubData, envir = .GlobalEnv)
Now you have these datasets in the global enviroment, e.g.
df1
## Index Product ID Amount Base
## 1 200 Prod1 1 100 100
## 2 201 Prod1 1 150 100
## 3 202 Prod1 1 123 100
## 4 203 Prod1 1 123 100
Upvotes: 7
Reputation: 121568
Using ave
:
dat$Base <- ave(dat$Amount,dat$ID,FUN=min)
# Index Product ID Amount Base
# 1 200 Prod1 1 100 100
# 2 201 Prod1 1 150 100
# 3 202 Prod1 1 123 100
# 4 203 Prod1 1 123 100
# 5 204 Prod1 2 110 110
# 6 205 Prod1 2 175 110
# 7 206 Prod1 2 190 110
# 8 207 Prod2 3 120 120
# 9 208 Prod2 3 135 120
# 10 209 Prod2 3 150 120
In case you want the first value and the minium one:
dat$Base <- ave(dat$Amount,dat$ID,FUN=function(x)x[1])
Upvotes: 4
Reputation: 118799
Assuming your data.frame is called dat
, here's a data.table
solution:
require(data.table)
setDT(dat)[, Base := Amount[1L], by=ID]
# Index Product ID Amount Base
# 1: 200 Prod1 1 100 100
# 2: 201 Prod1 1 150 100
# 3: 202 Prod1 1 123 100
# 4: 203 Prod1 1 123 100
# 5: 204 Prod1 2 110 110
# 6: 205 Prod1 2 175 110
# 7: 206 Prod1 2 190 110
# 8: 207 Prod2 3 120 120
# 9: 208 Prod2 3 135 120
# 10: 209 Prod2 3 150 120
Upvotes: 3
Reputation: 54237
Or using dplyr
:
library(dplyr)
df1 <- df %>%
arrange(ID, Amount) %>%
group_by(ID) %>%
mutate(Base = Amount[1])
Upvotes: 1
Reputation: 70266
You could use dplyr
to create the Base
column, but just to be clear, this does not yet create different data.frames (as indicated in your question).
require(dplyr)
df <- df %.% group_by(ID) %.% mutate(Base = first(Amount))
Upvotes: 2