amit
amit

Reputation: 3462

concatenating/ merging time series (in R)

I have to xts/zoo objects. each has measures of different variables over a different time span. I want to create a single time series comprising all measures at all times, with NAs for missing dates/variable combinations. how do I do that? artificial example:

library(xts)
x<-cbind(a=1:3,b=3:1)
rownames(x) = as.character(Sys.Date()-1:3)

y<-cbind(a=5:7,c=3:1)
rownames(y) = as.character(Sys.Date()-5:7)

xs=as.xts(x)
ys=as.xts(y)

#now what?

#desired outcome looks like:
            a  b   c
2013-03-10  7 NA   1
2013-03-11  6 NA   2
2013-03-12  5 NA   3
2013-03-14  3  1  NA
2013-03-15  2  2  NA
2013-03-16  1  3  NA

# regular merge looks like that (adding an a.1 variable)
merge(xs,ys)
            a  b a.1  c
2013-03-10 NA NA   7  1
2013-03-11 NA NA   6  2
2013-03-12 NA NA   5  3
2013-03-14  3  1  NA NA
2013-03-15  2  2  NA NA
2013-03-16  1  3  NA NA

# simple concatenation ignores variable names and looks like that
c(xs,ys)
           a b
2013-03-10 7 1
2013-03-11 6 2
2013-03-12 5 3
2013-03-14 3 1
2013-03-15 2 2
2013-03-16 1 3

# so what should I do?

Upvotes: 7

Views: 14484

Answers (4)

H&#233;ctor Andrade
H&#233;ctor Andrade

Reputation: 31

I would say to convert it to an numerical array (as.numeric(ts)), concatenate it with cbind(ts1,ts2) and then go back to the time series, ts(c(as.numeric(ts1),as.numeric(ts2))

Upvotes: 0

amit
amit

Reputation: 3462

ok. spent some time thinking on this. because eventually I need to "merge" many such dataframes/xts into one, and not just merging two of them, I thought that it makes sense to do all this in one step: create a big matrix of all date/var combinations. then plugging into this big matrix all the observed data, object by object. the code looks as follows (will be happy to get comments on it, and feel free to use, without any kind of warranty, of course):

alltogether = function(dlist) {
    all.vars = unique(unlist(lapply(dlist,colnames)))
    all.obs = unique(unlist(lapply(dlist,rownames)))    
    res = array(NA,dim=c(length(all.obs),length(all.vars)),
          dimnames=list(all.obs,all.vars))
    for(d in dlist) {
            res[rownames(d),colnames(d)]=d
    }
    return(res)
}

alltogether.xts = function(xlist) {
    dlist = lapply(xlist,as.matrix)
    res = alltogether(dlist)
    xres = as.xts(res)
    return(xres)
}

Upvotes: 1

statquant
statquant

Reputation: 14400

What you want => merge(data.frame(x,d),data.frame(y,d),by=c("d","a"),all=T)

You should use data.frame not name vectors/matrix, here is a general solution, what you want is just a one liner with a full outer join (look at ?merge)

x<-cbind(a=1:3,b=3:1)
d= as.character(Sys.Date()-1:3)
DT1 = data.frame(x,d)
#DT1
#   a b          d
#1: 1 3 2013-03-16
#2: 2 2 2013-03-15
#3: 3 1 2013-03-14

y<-cbind(a=5:7,c=3:1)
d = as.character(Sys.Date()-5:7)
DT2 = data.frame(y,d)
#DT2
#   a b          d
#1: 1 3 2013-03-12
#2: 2 2 2013-03-11
#3: 3 1 2013-03-10
merge(DT1,DT2,by=c("d","a"),all=T)
#           d a  b  c
#1 2013-03-10 7 NA  1
#2 2013-03-11 6 NA  2
#3 2013-03-12 5 NA  3
#4 2013-03-14 3  1 NA
#5 2013-03-15 2  2 NA
#6 2013-03-16 1  3 NA

Upvotes: 2

agstudy
agstudy

Reputation: 121626

This is not a general solution. But works for this example:

cbind(rbind(xs[,1],ys[,1]), cbind(xs[,-1],ys[,-1]))
           a  b  c
2013-03-10 7 NA  1
2013-03-11 6 NA  2
2013-03-12 5 NA  3
2013-03-14 3  1 NA
2013-03-15 2  2 NA
2013-03-16 1  3 NA

Just remind that cbind.xts is just merge.xts. S you can get the same result using merge

merge(rbind(xs[,1],ys[,1]), merge(xs[,-1],ys[,-1]))
           a  b  c
2013-03-10 7 NA  1
2013-03-11 6 NA  2
2013-03-12 5 NA  3
2013-03-14 3  1 NA
2013-03-15 2  2 NA
2013-03-16 1  3 NA

The problem with this solution is that if ys and xs have some dates incommon, you will have duplicated index in your final xts object.For example, if we replace y :

rownames(y) = as.character(Sys.Date()-3:5)

You get , a duplicated index for 2013-03-14, So im anot sure that it s a valid xts object.

merge(rbind(xs[,1],ys[,1]), merge(xs[,-1],ys[,-1]))
           a  b  c
2013-03-12 7 NA  1
2013-03-13 6 NA  2
2013-03-14 3  1  3
2013-03-14 5 NA NA
2013-03-15 2  2 NA
2013-03-16 1  3 NA

EDIT a generalization of the solution:

inter <- intersect(names(ys), names(xs))
diffx <- setdiff(names(xs),inter)
diffy <- setdiff(names(ys),inter)

merge(rbind(xs[,inter],ys[,inter]), merge(xs[,diffx],ys[,diffy]))


           a  b  c
2013-03-10 7 NA  1
2013-03-11 6 NA  2
2013-03-12 5 NA  3
2013-03-14 3  1 NA
2013-03-15 2  2 NA
2013-03-16 1  3 NA

Upvotes: 4

Related Questions