sharon paul

Reputation: 93

Prediction applied to whole data instead of a single input

Hi, I am doing prediction with my data. If I use data.frame, it throws the following error:

input(bedrooms="2",bathrooms="2",area="1000") were specified with different   types from the fit

Here is my program:

input <- function(bedrooms, bathrooms, area)
{
  delhi <- read.delim("delhi.tsv", na.strings = "")
  delhi$lnprice <- log(delhi$price)
  heddel <- lm(lnprice ~ bedrooms + area + bathrooms, data = delhi)
  valuepred = predict(heddel, data.frame(bedrooms = bedrooms, area = area, bathrooms = bathrooms), na.rm = TRUE)
  final_prediction = exp(valuepred)
  final_prediction
}

If I remove the data.frame, it predicts the value for the whole data set. I got the following output:

       1          2          3          4          5          6          7 
  15480952   11657414   10956873    6011639    6531880    9801468   16157549 
         9         10         11         14         15         16         17 
  10698786    5596803   14688143   20339651   22012831   16157618   26644246 

But it needs to display one value only.

Any idea how to resolve this? Any help will be appreciated.

Upvotes: 0

Views: 242

Answers (2)

jlhoward

Reputation: 59395

Too long for a comment.

The other answer should solve your problem, but if you really believe that log(price) is linear in bedrooms + bathrooms + area, then you are better off with a generalized linear model (glm) in the Poisson family, which uses a log link by default. So something like:

fit <- glm(price ~ bedrooms + bathrooms + area, data = delhi, family = poisson)

Then predict using type="response", which returns the prediction on the price scale, so no exp() is needed:

pred <- predict(fit, data.frame(bedrooms, bathrooms, area), type="response")
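To make the glm suggestion above concrete, here is a minimal sketch that runs without delhi.tsv. The data are synthetic and every number in them is made up for illustration; only the column names come from the question. It assumes integer-valued prices so that the Poisson family fits without warnings.

```r
# Synthetic stand-in for delhi.tsv; all values here are invented.
set.seed(42)
n <- 100
delhi <- data.frame(
  bedrooms  = sample(1:4, n, replace = TRUE),
  bathrooms = sample(1:3, n, replace = TRUE),
  area      = round(runif(n, 400, 2500))
)
mu <- exp(10 + 0.2 * delhi$bedrooms + 0.1 * delhi$bathrooms + 0.0005 * delhi$area)
delhi$price <- rpois(n, mu)

# Poisson glm: models log(E[price]) as linear in the predictors.
fit <- glm(price ~ bedrooms + bathrooms + area, data = delhi, family = poisson)

# One row of numeric inputs gives one prediction.
newdata <- data.frame(bedrooms = 2, bathrooms = 2, area = 1000)
pred      <- predict(fit, newdata = newdata, type = "response")  # price scale
pred_link <- predict(fit, newdata = newdata, type = "link")      # log scale
```

Here exp(pred_link) and pred agree, which is why no manual back-transformation is needed with type="response". Note that this models the log of the expected price, which is not quite the same thing as the expected log(price) fitted by lm.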

Upvotes: 0

Kieran Martin

Reputation: 127

Sharon, you want to make a prediction for specific values of bedrooms, bathrooms and area, but you are passing them in as character rather than numeric values. This is causing the error you are seeing. When you remove the data.frame argument from predict, it produces predictions for the data set used to build the model, i.e. one fitted value per row of delhi.

Try

input(bedrooms=2,bathrooms=2,area=1000)
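If the values might arrive as strings (from a form, say), one option is to coerce them inside the function. The sketch below is only an illustration: the data are synthetic with made-up values because delhi.tsv is not available here, and predict_price is a hypothetical name, not something from the question.

```r
# Synthetic stand-in for delhi.tsv; all values here are invented.
set.seed(1)
delhi <- data.frame(
  bedrooms  = sample(1:4, 50, replace = TRUE),
  bathrooms = sample(1:3, 50, replace = TRUE),
  area      = round(runif(50, 400, 2500))
)
delhi$price <- exp(13 + 0.2 * delhi$bedrooms + 0.1 * delhi$bathrooms +
                   0.0005 * delhi$area + rnorm(50, sd = 0.1))
delhi$lnprice <- log(delhi$price)

# Fit the model once, outside the prediction function.
heddel <- lm(lnprice ~ bedrooms + area + bathrooms, data = delhi)

# Hypothetical helper: coerce inputs so "2" and 2 are treated the same.
predict_price <- function(bedrooms, bathrooms, area) {
  newdata <- data.frame(
    bedrooms  = as.numeric(bedrooms),
    bathrooms = as.numeric(bathrooms),
    area      = as.numeric(area)
  )
  # Back-transform from log(price) to price.
  unname(exp(predict(heddel, newdata = newdata)))
}

p_num <- predict_price(2, 2, 1000)        # numeric inputs
p_chr <- predict_price("2", "2", "1000")  # character inputs, coerced

# Without newdata, predict() returns one fitted value per training row.
n_fitted <- length(predict(heddel))
```

Both calls return a single value, while n_fitted equals nrow(delhi), which matches the long output shown in the question.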

Upvotes: 1
