Reputation: 202
I'm analyzing data from a solar power plant. I want to adjust the plant's estimated hourly production for each following day. The data available are the weather forecasts for the next three days, so I know in advance what kind of day tomorrow will be (on a scale of 1 to 5, with 1 being sunny and 5 cloudy).
So the idea is to multiply the capacity by a factor, so that the result is an estimate of what will occur that does not deviate much from the actual measurement.
I figured that fitting a linear model, using the day-type variable as a factor, could be the best way to approximate the real production.
Today the criterion is:
I studied these coefficients and found that the plant's production is being underestimated: the energy actually produced is almost 50% higher. Using Excel's Solver to find the coefficients, I get:
The trouble is that this only holds for this particular data set and I cannot generalize from it, which is why I wanted to build a model.
This is what I have applied:
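# Read the data; decimals use a comma in the file
data <- read.table("zcinco.txt", dec = ",", header = TRUE)
head(data)
# Response: column 2 without its first row; predictors: column 2 lagged by one
# row (via embed), day type (column 3) as a factor, and capacity (column 4)
model <- lm(data[-1, 2] ~ embed(data[, 2], 2)[, 2] + as.factor(data[-1, 3]) + data[-1, 4])
# Fitted values next to the observed values
head(cbind(matrix(predict(model)), data[-1, 2]))
summary(model)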
Call:
lm(formula = data[-1, 2] ~ embed(data[, 2], 2)[, 2] + as.factor(data[-1,
3]) + data[-1, 4])
Residuals:
      Min        1Q    Median        3Q       Max
-0.054966 -0.009518 -0.000855  0.010966  0.039100
Coefficients:
                          Estimate Std. Error t value Pr(>|t|)
(Intercept)              2.528e-06  1.456e-03   0.002   0.9986
embed(data[, 2], 2)[, 2] 3.870e-01  2.969e-02  13.036  < 2e-16 ***
as.factor(data[-1, 3])2 -2.630e-03  1.407e-03  -1.869   0.0621 .
as.factor(data[-1, 3])3  1.690e-03  2.371e-03   0.713   0.4762
as.factor(data[-1, 3])4 -1.855e-02  2.251e-03  -8.241 1.07e-15 ***
as.factor(data[-1, 3])5 -1.790e-02  2.660e-03  -6.727 4.06e-11 ***
data[-1, 4]              8.930e-01  4.823e-02  18.517  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.01482 on 600 degrees of freedom
Multiple R-squared: 0.795, Adjusted R-squared: 0.793
F-statistic: 387.9 on 6 and 600 DF, p-value: < 2.2e-16
Description of the data set (columns, in order):
tiempo (time) / real (actual production) / tipo (day type) / capacidad (capacity)
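The "multiply the capacity by a factor per day type" idea described above can also be written directly as a regression with no intercept and one slope on capacity per day type. The following is only a minimal sketch: the data are synthetic (the real file zcinco.txt is not reproduced here), `true_factor` holds made-up values, and only the column names `real`, `tipo` and `capacidad` are taken from the description above.

```r
# Synthetic stand-in for zcinco.txt; every number here is made up for illustration.
set.seed(1)
n <- 200
dat <- data.frame(
  tipo      = sample(1:5, n, replace = TRUE),  # day type: 1 = sunny ... 5 = cloudy
  capacidad = runif(n, 0.1, 1)                 # capacity
)
true_factor <- c(0.9, 0.8, 0.6, 0.4, 0.2)      # hypothetical factor per day type
dat$real <- true_factor[dat$tipo] * dat$capacidad + rnorm(n, sd = 0.01)

# No intercept, one slope on capacidad for each day type:
#   real = factor[tipo] * capacidad + error
fit <- lm(real ~ 0 + capacidad:factor(tipo), data = dat)

# The five fitted slopes play the role of the coefficients found with Solver
factors <- unname(coef(fit))
print(round(factors, 3))

# Estimate for a new day of type 2 with capacity 0.5
pred <- predict(fit, newdata = data.frame(tipo = 2, capacidad = 0.5))
```

Unlike a one-off Solver fit, `summary(fit)` also reports a standard error for each factor, which gives some indication of how far the estimates can be trusted on new data.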
Upvotes: 0
Views: 285
Reputation: 508
I would suggest something like this:
m2 <- lm(data[-1, 2] ~ embed(data[, 2], 2)[, 2]:as.factor(data[-1, 3]) + data[-1, 4])
though I'm not sure why you are dropping the first row, or what the embed call is meant to do.
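On the embed question: `embed(x, 2)` returns a two-column matrix whose first column is the series from its second value onward and whose second column is the same series shifted back by one step. Taking column 2 therefore gives the previous observation, which is presumably why the first row is dropped everywhere else, so that the lengths match. A small sketch on a toy vector, followed by the same interaction model written with named columns on synthetic data (the values and the helper column `real_lag` are assumptions for illustration, not the original data):

```r
# What embed() does on a toy series
x <- c(10, 20, 30, 40, 50)
e <- embed(x, 2)       # row t holds (x[t], x[t-1]) for t = 2..5
lagged  <- e[, 2]      # previous value: 10 20 30 40
current <- x[-1]       # current value:  20 30 40 50

# Same model as m2, with an explicit lag column (synthetic data, made-up values)
set.seed(42)
n <- 100
dat <- data.frame(
  tipo      = sample(1:5, n, replace = TRUE),
  capacidad = runif(n, 0.1, 1)
)
dat$real <- 0.5 * dat$capacidad + rnorm(n, sd = 0.02)
dat$real_lag <- c(NA, head(dat$real, -1))   # hypothetical helper column

# lm() drops the first row itself because real_lag is NA there
m2 <- lm(real ~ real_lag:factor(tipo) + capacidad, data = dat)
```

Writing the lag as a column and passing `data = dat` keeps the coefficient names readable and lets `predict()` accept a `newdata` frame, which the `data[-1, 2]` style of formula does not handle well.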
Upvotes: 1