Reputation: 59
I am new to R and I need help getting some values from my data set. The data are dollar amounts per year for a list of cities. I'm trying to set up my values so that I can run a linear regression model on the dataset, which is named estimate.
estimate <- read.csv("estimate.csv", check.names = FALSE) #Import
estimate
location 2010 2011 2012 2013 2014
city1 200 250 300 500 600
city2 300 300 400 650 780
city3 500 600 700 800 900
I am only interested in the data for city3 for the years shown.
I know I can just use the code years <- c(2010,2011,2012,2013,2014)
to create my years variable, but I know that is only practical for small tables.
For my linear model I would like to first plot(years, values)
where the years are columns 2:6 and the corresponding values come from row 3 only. When I run values <- estimate[3, c(0,2:6)]
I get the data for the values but when I try to do the same thing for years <- estimate[0, c(0,2:6)]
I get an object with 0 observations of 5 variables. Trying to plot that gives me
Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
Ideally I would like the data set up like this:
years values
2010 500
2011 600
2012 700
2013 800
2014 900
And I can then run an lm function. Thanks ahead of time. I'm real new at this stuff in R and on Stack so please forgive my newbishness.
Upvotes: 2
Views: 80
Reputation: 270055
1) extraction Assuming the data shown reproducibly in the Note at the end, we can extract the years from the column names and the city3 values from row 3, and then perform the regression like this:
year <- as.numeric(names(estimate)[-1])
city3 <- unlist((estimate[3, -1]))
lm(city3 ~ year)
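To get the two-column years/values layout the question asks for, the two vectors can be combined into a data frame before plotting and fitting. This is only a minimal sketch, assuming the sample data from the Note and that city3 sits in row 3; the names city3_df and fit are illustrative, not part of the original answer:

```r
# Sample data as in the Note (check.names = FALSE keeps "2010" etc. unchanged)
Lines <- "
Location,2010,2011,2012,2013,2014
city1,200,250,300,500,600
city2,300,300,400,650,780
city3,500,600,700,800,900"
estimate <- read.csv(text = Lines, check.names = FALSE)

# Years come from the column names, values from row 3 (both without column 1)
city3_df <- data.frame(
  years  = as.numeric(names(estimate)[-1]),
  values = unlist(estimate[3, -1], use.names = FALSE)
)

plot(values ~ years, data = city3_df)       # scatter plot of values against years
fit <- lm(values ~ years, data = city3_df)  # simple linear regression
abline(fit)                                 # overlay the fitted line on the plot
coef(fit)                                   # intercept and slope
```

Keeping both columns in one data frame means the same formula, values ~ years, can be passed to both plot and lm.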
2) melt Alternatively, we can convert estimate to long form (here 15x3), fix up the names, make Year numeric and then perform the regression:
library(reshape2)
long <- melt(estimate, id = "Location")
names(long) <- c("Location", "Year", "Estimate")
long$Year <- as.numeric(as.character(long$Year))
lm(Estimate ~ Year, long, subset = Location == "city3")
2a) reshape Converting from wide to long form could also be done without any packages like this:
yrs <- names(estimate)[-1]
long <- reshape(estimate, dir = "long", idvar = "Location",
varying = list(yrs), times = as.numeric(yrs), timevar = "Year", v.names = "Estimate")
lm(Estimate ~ Year, long, subset = Location == "city3")
Note:
Lines <- "
Location,2010,2011,2012,2013,2014
city1,200,250,300,500,600
city2,300,300,400,650,780
city3,500,600,700,800,900"
estimate <- read.csv(text = Lines, check.names = FALSE)
Upvotes: 1
Reputation: 5068
When you read csv files with read.csv, the first row becomes the names in your data frame. Try
names = colnames(estimate)
You'll see that names is a character vector c("location", "2010", "2011", ...). You can translate this to years by dropping the first item and converting to numeric:
years = as.numeric(names[-1])
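As a side note on why the attempt in the question came back empty: row index 0 selects no rows, and the header is stored as column names rather than as a row of data. A small sketch, assuming a cut-down version of the sample data (the variable name empty is just for illustration):

```r
# One-city version of the sample data; check.names = FALSE keeps "2010" etc.
estimate <- read.csv(text = "Location,2010,2011,2012,2013,2014
city3,500,600,700,800,900", check.names = FALSE)

empty <- estimate[0, 2:6]   # row index 0 selects no rows
dim(empty)                  # zero rows, five columns
names(empty)                # the years survive only as column names

# So the years have to be taken from the names, not from a row
years <- as.numeric(names(estimate)[-1])
years
```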
Upvotes: 0