Reputation: 15
I have fitted a linear model of the number of people booked against the time of day. I normalized both variables to the 0-1 range, as is commonly done, and fitted the model using lm(), with bookings as the response I want to predict from the times.
Now I want to predict what bookings might be for later times of day. I need to normalize those new times too, but I'm not sure which way. Do I normalize them on their own, or should I add them to the original time data and normalize everything together before predicting? I think the two approaches will give different normalized values, which would affect my prediction.
So basically, which way should the new times be normalized, on their own or as part of the original time data?
Upvotes: 0
Views: 401
Reputation: 226557
Normalize the data according to the min and max of the original data. In order to do this, you must have retained the min and max values of the original data; if you transformed the original variables to a [0,1] scale in place, discarding the original data, you're stuck.
To incorporate your comment: if your original predictor was x0 and you used
(x0-min(x0))/(max(x0)-min(x0))
to transform the data for analysis, you would use
(x1-min(x0))/(max(x0)-min(x0))
to transform your new variable x1 for prediction (assuming you didn't replace x0 with its scaled version!).
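As a rough illustration of the formula above, here is a minimal sketch with made-up data. The names time0, bookings0, time1, t_min and t_range are hypothetical, and for simplicity the response is left on its original scale; if you scaled bookings as well, the predictions would need to be back-transformed with the original bookings min and range.

```r
# Hypothetical data: times of day (in hours) and bookings; the values are invented.
time0     <- c(9, 10, 11, 12, 13, 14)
bookings0 <- c(5, 8, 12, 15, 19, 22)

# Keep the original min and range so that new data can reuse them.
t_min   <- min(time0)
t_range <- diff(range(time0))

train <- data.frame(time = (time0 - t_min) / t_range,
                    bookings = bookings0)
fit <- lm(bookings ~ time, data = train)

# Later times: scale with the ORIGINAL min and range, not their own.
time1 <- c(15, 16, 17)
new   <- data.frame(time = (time1 - t_min) / t_range)

new$time                      # falls outside [0,1] because these times are past max(time0)
predict(fit, newdata = new)   # predictions for the later times
```

Scaled values above 1 are expected here: they just mean the new times lie beyond the range of the original data, so the prediction is an extrapolation.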
The built-in scale() function attaches attributes of the original data to the transformed data, which are helpful in similarly transforming other data sets (or back-transforming). (Confusingly, the function labels the value that was subtracted from the original value "center"; "scale" is the value by which the shifted value was divided. In your case, center is min(x), while scale is max(x)-min(x) = diff(range(x)).)
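To see what scale() stores, a small sketch with an arbitrary vector (x0 and s are illustrative names only) can be inspected like this:

```r
# Arbitrary example vector; the values are chosen only for illustration.
x0 <- c(2, 4, 6, 10)

# Min-max scaling via scale(): subtract min(x0), divide by the range.
s <- scale(x0, center = min(x0), scale = diff(range(x0)))

attr(s, "scaled:center")   # the value that was subtracted, i.e. min(x0)
attr(s, "scaled:scale")    # the divisor, i.e. max(x0) - min(x0)
range(s)                   # the scaled values span 0 to 1
```

Note that scale() returns a one-column matrix here, which is why the code in this answer wraps it in drop().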
dd <- data.frame(x=1:10)
scalefun <- function(x) drop(scale(x,center=min(x),scale=diff(range(x))))
dd <- transform(dd,x=scalefun(x))
Function to back-transform:
unscalefun <- function(x,orig=x) {
    c(x*attr(orig,"scaled:scale") + attr(orig,"scaled:center"))
}
Function to transform according to another data set:
rescalefun <- function(x,orig=x) {
    scale(x,scale=attr(orig,"scaled:scale"),center=attr(orig,"scaled:center"))
}
## pass the scaled original as orig; a plain vector like 1:20 carries no scaling attributes
rescalefun(1:20,orig=dd$x)
Upvotes: 1