Edward

Reputation: 4623

How to apply a scaling rule to many columns in a new dataset

I have the following task:

a = data.frame(a = c(1,2,3,4,5,6)) # original dataset
range01 <- function(x){(x-min(a$a))/(max(a$a)-min(a$a))} # scaling rule based on the min and max of a$a
b = data.frame(a = 6) # new dataset
lapply(b$a, range01) # works for the new dataset because the rule uses min(a$a) and max(a$a)

But how can I apply this when I have many columns in my dataset, as below?

a = data.frame(a= c(1,2,3,4,5,6))
b = data.frame(b= c(1,2,3,3,2,1))
c = data.frame(c= c(6,2,4,4,5,6))
df = cbind(a,b,c)
df
new = data.frame(a = 1, b = 2, c = 3)

Of course, I could write a separate rule for every variable:

range01a <- function(x){(x-min(df$a))/(max(df$a)-min(df$a))}

But that is very tedious. How can I make this more convenient?

Upvotes: 0

Views: 406

Answers (2)

d.b

Reputation: 32558

You can exploit the fact that the column names of new and df are the same. This is helpful when the order of the columns in the two data frames is not the same.

sapply(names(new), function(x) (new[x]-min(df[x]))/(max(df[x])-min(df[x])))
#$a.a
#[1] 0

#$b.b
#[1] 0.5

#$c.c
#[1] 0.25

To get the result as a data frame:

data.frame(lapply(names(new), function(x) (new[x]-min(df[x]))/(max(df[x])-min(df[x]))))
#  a   b    c
#1 0 0.5 0.25
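The name-based lookup above can also be sketched with `[[` and `vapply`, which returns a plain named numeric vector instead of a list of one-cell data frames. This is only an illustration, not part of the original answer: it assumes `new` has exactly one row, and the shuffled column order of `new` is made up to show that matching happens by name.

```r
# Reference data from the question
df  <- data.frame(a = c(1, 2, 3, 4, 5, 6),
                  b = c(1, 2, 3, 3, 2, 1),
                  c = c(6, 2, 4, 4, 5, 6))

# Assumption: a one-row new dataset, columns deliberately in a different order
new <- data.frame(c = 3, a = 1, b = 2)

# Look up each reference column by name, so column order does not matter
scaled <- vapply(names(new), function(x) {
  ref <- df[[x]]                                  # reference column as a plain vector
  (new[[x]] - min(ref)) / (max(ref) - min(ref))   # min-max rule taken from df
}, numeric(1))

scaled
```

The result keeps the column order of `new` and carries the column names, so it can be indexed as `scaled["a"]`.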

Upvotes: 1

akuiper

Reputation: 215107

You can redefine your scale function so that it takes two arguments, one for the values to be scaled and one for the scaler (the reference column), and then use Map on the two data frames:

scale_custom <- function(x, scaler) (x - min(scaler)) / (max(scaler) - min(scaler))

Map(scale_custom, new, df)
#$a
#[1] 0

#$b
#[1] 0.5

#$c
#[1] 0.25

If you need the data frame as result:

as.data.frame(Map(scale_custom, new, df))
#  a   b    c
#1 0 0.5 0.25
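Note that Map pairs the columns of the two data frames by position, not by name. If the column order might differ, one possible sketch is to reorder df by the names of the new data first. The data frame `new2` below is hypothetical (two rows, shuffled columns) and only serves to illustrate the idea.

```r
scale_custom <- function(x, scaler) (x - min(scaler)) / (max(scaler) - min(scaler))

# Reference data from the question
df <- data.frame(a = c(1, 2, 3, 4, 5, 6),
                 b = c(1, 2, 3, 3, 2, 1),
                 c = c(6, 2, 4, 4, 5, 6))

# Hypothetical new data: two rows, columns in a different order than df
new2 <- data.frame(b = c(2, 3), a = c(1, 6), c = c(3, 6))

# df[names(new2)] reorders the reference columns to line up with new2
scaled <- as.data.frame(Map(scale_custom, new2, df[names(new2)]))
scaled
```

Because scale_custom is vectorised over x, the same call handles any number of rows in the new data.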

Upvotes: 1
