moof

Reputation: 189

Plotting the same data two different ways gives different results (lattice xyplot)

I am trying to produce a scatter plot of some data. I do so in two different ways, as shown in the code below (most of the code just arranges the data; the only plotting part is at the bottom). The first method refers directly to the variables in the workspace. The second merges the data into a single xts object first and then refers to its columns by index.

The resulting scatter plots are different, even though I have checked that the source data is the same in both cases.

Why are these plots different? Thanks in advance.

# Get data
# =============
library('quantmod')

# Set monthly time interval
StartPeriod = "1980-01"
EndPeriod = "2014-07"
DateString = paste0(StartPeriod, "/", EndPeriod)

# CPI (monthly)
getSymbols("CPIAUCSL", src="FRED")

 # QoQ growth, Annualized
 CPIAUCSL = ((CPIAUCSL/lag(CPIAUCSL))^4-1)*100
 CPIAUCSL = CPIAUCSL[DateString]

# Oil prices (monthly)
getSymbols(c("MCOILWTICO"), src="FRED")

 # QoQ growth, annualized
 MCOILWTICO = ((MCOILWTICO/lag(MCOILWTICO))^4-1)*100
 MCOILWTICO = MCOILWTICO[DateString]


# Produce plots
# ===============
library('lattice')
# Method 1, direct reference
xyplot(CPIAUCSL~lag(MCOILWTICO,1), ylim=c(-5,6), 
   ylab="CPI", 
   xlab="Oil Price, 1 month lag",
   main="Method 1: Inflation vs. Lagged Oil Price",
   grid=TRUE)


# Method 2, refer to column indices of xts object
basket = merge(CPIAUCSL, MCOILWTICO)
xyplot(basket[ ,1] ~ lag(basket[ ,2],1), ylim=c(-5, 6), 
   ylab="CPI", 
   xlab="Oil Price, 1 month lag",
   main="Method 2: Inflation vs. Lagged Oil Price",
   grid=TRUE)


# Double check data fed into plots is the same
View(merge(CPIAUCSL, lag(MCOILWTICO,1)))
View(merge(basket[ ,1], lag(basket[ ,2],1))) # yes, matches

Upvotes: 0

Views: 68

Answers (1)

Tamas Ferenci

Reputation: 658

Method 1 is incorrect because it pairs points that are 6 years apart. The two series do not start on the same date, and Method 1 matches them by position rather than by date. For instance, CPIAUCSL[3] is the observation for 1980-03-01, while lag(MCOILWTICO,1)[3] corresponds to 1986-03-01, yet on the scatter plot they are paired. In contrast, basket[ ,1][3] and basket[ ,2][3] both belong to 1980-03-01.

(Your double check didn't reveal the problem because it used merge, which aligns the series by date. Method 1 does not.)
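
To make the mismatch concrete, here is a minimal sketch on synthetic data (not the FRED series). The names cpi, oil, oil_lag, aligned and df are made up for illustration, and the start dates are chosen only to mimic two series that begin 6 years apart. It compares the dates found at the same position with the dates paired by merge.

```r
library(xts)

# Synthetic monthly series (assumed values, NOT the FRED data):
# cpi starts in 1980-01, oil starts in 1986-01, both end in 1987-12.
cpi <- xts(as.numeric(1:96),
           order.by = seq(as.Date("1980-01-01"), by = "month", length.out = 96))
oil <- xts(as.numeric(1:24),
           order.by = seq(as.Date("1986-01-01"), by = "month", length.out = 24))
oil_lag <- lag(oil, 1)  # value of the previous month at each date

# Pairing by position: the third elements carry different dates.
index(cpi)[3]      # 1980-03-01
index(oil_lag)[3]  # 1986-03-01

# Pairing by date: merge aligns on the time index, na.omit drops
# rows where either series has no observation.
aligned <- na.omit(merge(cpi, oil_lag))
colnames(aligned) <- c("cpi", "oil_lag")
df <- as.data.frame(aligned)  # one row per date, safe to plot by position
```

With the data in this shape, something like xyplot(cpi ~ oil_lag, data = df) should plot only pairs that share a date, which is effectively what Method 2 does.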

Upvotes: 1
