Reputation: 115
ddd = lm(`USER ID` ~ `CREATED ON`, data = Data1)
summary(ddd)
The slope of the line in the second image should be approximately (6000 - 0)/(2017 - 2016) = 6000, but the slope shown in the first image is 2.204e-04. How does this make sense?
(USER ID and CREATED ON are the number of users and the time shown in the plot.)
Upvotes: 0
Views: 462
Reputation: 73415
I generated the plot using
plot(Data1$`CREATED ON`, Data1$`USER ID`, cex = 0.5, xlab = "Time", ylab = "No.Of Users")
and then
abline(lm(`USER ID` ~ `CREATED ON`, Data1), col = 4)
At time = 2017, No. of Users ~ 6000, and at time = 2016, No. of Users ~ 0, so the slope must be
(6000 - 0)/(2017 - 2016) = 6000,
but the slope shown is of magnitude 10^-4.
The CREATED ON column is a date-time type:
class(Data1$`CREATED ON`)
gives the output "POSIXct" "POSIXt".
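Check as.integer(Data1$`CREATED ON`).
Date and date-time objects are stored as numbers that can be large: a POSIXct value is the number of seconds since 1970-01-01 UTC. The fitted slope is therefore in users per second, not users per year. A year is about 3.15e7 seconds, so 6000 users per year is about 1.9e-04 users per second, which is in line with the reported 2.204e-04.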
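The unit issue can be reproduced on made-up data. The sketch below does not use the asker's Data1; `created_on` and `user_id` are hypothetical stand-ins, built so that roughly 6000 users appear evenly over 2016.

```r
# Hypothetical stand-in for Data1: 6000 sign-ups spread evenly over 2016.
created_on <- seq(as.POSIXct("2016-01-01", tz = "UTC"),
                  as.POSIXct("2017-01-01", tz = "UTC"),
                  length.out = 6000)
user_id <- seq_along(created_on)

# POSIXct values are stored as seconds since 1970-01-01 UTC,
# so the predictor that lm() sees is on the order of 1.45e9.
range(as.numeric(created_on))

fit <- lm(user_id ~ created_on)
slope_per_second <- unname(coef(fit)[2])   # tiny: users per second

# Convert to users per year (2016 was a leap year, 366 days).
seconds_per_year <- 366 * 24 * 60 * 60
slope_per_year <- slope_per_second * seconds_per_year
slope_per_year                             # close to 6000
```

Multiplying the per-second slope by the number of seconds in a year recovers the slope one would read off the plot by eye.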
In general, why not just extract the model matrix to see what the columns are?
model.matrix.lm(ddd)
This immediately exposes the problem. Regression coefficients are computed using this model matrix.
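As a minimal sketch of that check, again on hypothetical data rather than Data1, the model matrix shows the raw seconds in the predictor column. Rescaling the predictor to years (the `years` variable here is an illustrative choice, not something from the question) gives a slope in users per year.

```r
# Hypothetical data: 100 events, one per day, starting 2016-01-01 UTC.
created_on <- as.POSIXct("2016-01-01", tz = "UTC") + (0:99) * 86400
user_id <- seq_along(created_on)

fit <- lm(user_id ~ created_on)
X <- model.matrix(fit)
head(X, 3)   # the created_on column holds seconds since 1970-01-01

# Express time as years since the first event, then refit.
days_elapsed <- as.numeric(difftime(created_on, created_on[1], units = "days"))
years <- days_elapsed / 365.25
fit_years <- lm(user_id ~ years)
coef(fit_years)   # slope is now in users per year
```

With one event per day, the refitted slope comes out at about 365 users per year, which is the scale the plot suggests.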
Upvotes: 1