jackw19

Reputation: 375

R X-axis Date Labels using plot()

Using the plot() function in R, I'm trying to produce a scatterplot of points of the form (SaleDate, SalePrice) = (saldt, salpr) from a time-series, cross-section real estate sales dataset stored in a data frame. My problem concerns the labels for the X-axis. Just about any series of annual labels would be adequate, e.g. 1999, 2000, ..., 2013 or 1999-01-01, ..., 2013-01-01. What I'm getting now is a single label, 2000, at what appears to be the proper location, which won't work.

The following is my call to plot():

plot(r12rgr0$saldt, r12rgr0$salpr/1000, type="p", pch=20, col="blue", cex.axis=.75, 
     xlim=c(as.Date("1999-01-01"),as.Date("2014-01-01")),
     ylim=c(100,650), 
     main="Heritage Square Sales Prices $000s 1990-2014",xlab="Sale Date",ylab="$000s")

The xlim and ylim are called out to bound the date and price ranges of the data to be plotted; note prices are plotted as $000s. r12rgr0$saldt really is a date; str(r12rgr0$saldt) returns:

Date[1:4190], format: "1999-10-26" "2013-07-06" "2003-08-25" NA NA "2000-05-24"  xx 

I have reviewed several threads here concerning similar questions, and see that the solution probably lies in turning off the default X-axis behavior and using axis.Date, but i) at my current level of R skill, I'm not sure I'd be able to solve the problem, and ii) I wonder why the plotting defaults are producing these rather puzzling (to me, at least) results.

Additional observations: The Y-axis labels are just fine (100, 200, ..., 600). The general appearance of the scatterplot indicates that the requested date range is being observed and that the relative positions of the plotted points are correct. Replacing xlim=... as above with

xlim=c("1999-01-01","2014-01-01")

or

xlim=c(as.numeric(as.character("1999-01-01")),as.numeric(as.character("2014-01-01")))

or

xlim=c(as.POSIXct("1999-01-01", format="%Y-%m-%d"),as.POSIXct("2014-01-01", format="%Y-%m-%d"))

all result in error messages.

Upvotes: 5

Views: 42869

Answers (2)

Geoffrey Poole

Reputation: 1268

If you are plotting interactively and don't mind some warnings, you can just pass, e.g., format = "%Y-%m-%d" to the plot function. For instance:

plot(seq((Sys.Date()-9),Sys.Date(), 1), runif(10), xlab = "Date", ylab = "Random")

yields one plot (image not shown), while:

plot(seq((Sys.Date()-9), Sys.Date(), 1), runif(10), format = "%Y-%m-%d", xlab = "Date", ylab = "Random")

yields a second plot (image not shown), along with lots of warnings about format not being a graphical parameter.

Upvotes: 0

MrFlick

Reputation: 206167

With plots it's very hard to reproduce results without sample data. Here's a sample I'll use:

dd<-data.frame(
  saldt=seq(as.Date("1999-01-01"), as.Date("2014-01-10"), by="6 mon"),
  salpr = cumsum(rnorm(31))
)

A simple plot with

with(dd, plot(saldt, salpr))

produces a few year marks


If I wanted more control, I could use axis.Date, as you alluded to:

with(dd, plot(saldt, salpr, xaxt="n"))
axis.Date(1, at=seq(min(dd$saldt), max(dd$saldt), by="30 mon"), format="%m-%Y")

which gives an X-axis labelled every 30 months in month-year format (image not shown).
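For the annual labels asked about in the question, the same approach could be sketched roughly as follows. The data frame `sales` and its values are made up for illustration (they stand in for the asker's r12rgr0, which is not available), and `year_ticks` is just a name chosen here; treat it as a starting point rather than a definitive solution.

```r
# Made-up data standing in for the asker's data frame (an assumption)
set.seed(1)
sales <- data.frame(
  saldt = as.Date("1999-01-01") + sort(sample(0:5400, 200)),
  salpr = runif(200, 100000, 650000)
)

# One tick position per year, spanning the xlim used in the question
year_ticks <- seq(as.Date("1999-01-01"), as.Date("2014-01-01"), by = "year")

# Draw the points with the default X-axis switched off ...
plot(sales$saldt, sales$salpr / 1000, pch = 20, col = "blue",
     xaxt = "n",
     xlim = range(year_ticks), ylim = c(100, 650),
     xlab = "Sale Date", ylab = "$000s")

# ... then add a Date axis with a four-digit year at every tick
axis.Date(1, at = year_ticks, format = "%Y", cex.axis = 0.75, las = 2)
```

Here las = 2 turns the labels perpendicular to the axis so that all of the years have room to print; changing format to "%Y-%m-%d" should give the full-date variant instead.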

Note that xlim only zooms the plot to part of the data range. It is not directly connected to the axis labels, but the axis labels will adjust to provide a "pretty" range covering the data that is plotted. Using just

xlim=c(as.Date("1999-01-01"),as.Date("2014-01-01"))

is the correct way to zoom the plot. No need for conversion to numeric or POSIXct.
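As a small illustration of why no conversion is needed, a Date is stored internally as a number of days since 1970-01-01, so plot() can use Date limits directly, whereas a bare date string does not convert to a number. The variable name `lims` below is only for this sketch.

```r
lims <- as.Date(c("1999-01-01", "2014-01-01"))

# Days since 1970-01-01 for each limit
as.numeric(lims)
# [1] 10592 16071

# A date string is not a number, so this coercion gives NA
# (R would also emit a coercion warning, suppressed here)
suppressWarnings(as.numeric("1999-01-01"))
# [1] NA
```

That NA is one likely reason the as.numeric(as.character(...)) variant in the question cannot work as an xlim.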

Upvotes: 17
