Reputation: 867
I am trying to split a dataframe, create a new variable in each dataframe list object, and reassemble (unsplit) the original dataframe.
The new variable I am trying to create scales the variable B.2 from 0 to 1 within each factor level of the variable Type.
BWRX$B.2 <- BWRX$B #Create a new version of B
BWRX.Split <- split(BWRX, BWRX$Type) #Split by Type
BWRX.Split.BScaled <-lapply(BWRX.Split, function(df){df$B.3 <- (df$B.2-min(df$B.2))/(max(df$B.2)-min(df$B.2))}) #Scale B.2
The above code returns a list with the values of B.2 correctly scaled within each factor level. The tricky part is that I cannot figure out how to add this variable to each dataframe in BWRX.Split.
I thought df$B.3 would take care of this, but it has not. Once B.3 is a part of each dataframe, can unsplit(, Type) be used to reassemble the dataframes, or would do.call be better? I was trying to combine unsplit and split so everything would be in one line of code. Is there a more efficient method?
Upvotes: 0
Views: 114
Reputation: 19544
As you mentioned and MrFlick confirmed, you can simply unsplit() it:
BWRX$B.3 <- unsplit(BWRX.Split.BScaled,BWRX$Type)
To do this in a single line:
BWRX$B.3 <- unsplit(lapply(split(BWRX$B.2, BWRX$Type), function(x)(x-min(x))/(max(x)-min(x))),BWRX$Type)
But Akrun's solutions are both quicker.
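On why df$B.3 never showed up in each dataframe: an R function returns the value of its last expression, and the value of an assignment is the assigned vector, so lapply() collects the scaled vectors rather than the modified dataframes. A minimal sketch, assuming a small made-up BWRX (the toy values and the names BWRX.Split.Scaled and BWRX.New are illustrative, not from the question), returns df explicitly and then reassembles with unsplit():

```r
# Toy stand-in for BWRX; the values are invented for illustration
BWRX <- data.frame(Type = factor(c("a", "b", "a", "b", "a")),
                   B    = c(1, 10, 3, 30, 5))
BWRX$B.2 <- BWRX$B

BWRX.Split <- split(BWRX, BWRX$Type)

# Return df itself so each list element is still a full dataframe
BWRX.Split.Scaled <- lapply(BWRX.Split, function(df) {
  df$B.3 <- (df$B.2 - min(df$B.2)) / (max(df$B.2) - min(df$B.2))
  df
})

# unsplit() puts the rows back in their original order
BWRX.New <- unsplit(BWRX.Split.Scaled, BWRX$Type)
```

With the dataframes intact, unsplit() is enough; do.call(rbind, ...) would also stack them, but grouped by Type rather than in the original row order.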
Upvotes: 0
Reputation: 887148
We don't really need to split it; this can be done using ave from base R. The advantage is that the new column will be added in the same row order as the original dataset.
transform(BWRX, BScaled = ave(B.2, Type,
FUN = function(x) (x- min(x))/(max(x)- min(x))))
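To see the row-order point concretely, here is a small sketch on made-up data (the dataframe toy, its values, and the helper name rescale01 are assumptions for illustration). It also shows one edge case worth knowing: a group whose values are all identical gives 0/0, which R evaluates to NaN.

```r
# Invented data: groups "a" and "b" vary, group "c" is constant
toy <- data.frame(Type = c("a", "b", "a", "b", "a", "c", "c"),
                  B.2  = c(1, 10, 3, 30, 5, 7, 7))

# Hypothetical helper: min-max scaling of a numeric vector
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

# ave() applies the helper per Type and returns values in the original row order
toy <- transform(toy, BScaled = ave(B.2, Type, FUN = rescale01))
toy$BScaled
```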
This is a group-by operation, so it can be done efficiently with data.table or dplyr:
library(data.table)
setDT(BWRX)[, BScaled := (B.2 - min(B.2))/(max(B.2) - min(B.2)), by = Type]
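For dplyr, a rough equivalent might look like the sketch below. It assumes the dplyr package is installed and uses an invented dataframe toy in place of BWRX; the column name BScaled follows the examples above.

```r
library(dplyr)

# Invented data standing in for BWRX
toy <- data.frame(Type = c("a", "b", "a", "b", "a"),
                  B.2  = c(1, 10, 3, 30, 5))

# Scale B.2 to [0, 1] within each Type; the row order is unchanged
toy_scaled <- toy %>%
  group_by(Type) %>%
  mutate(BScaled = (B.2 - min(B.2)) / (max(B.2) - min(B.2))) %>%
  ungroup()
```

Unlike the data.table version, which adds the column to BWRX by reference, this returns a new object and leaves toy as it was.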
Upvotes: 1