Reputation: 95
I'm trying to figure out how to merge two time series with different frequencies.The first time series are on 1 minute interval
time1<-seq(from=as.POSIXct("2010-03-01 13:02"),to=as.POSIXct("2010-03-01 13:10"),by="1 min")
value<-round(matrix(runif(90,1,10),9,10),2)
data<-data.frame(time1,value)
data
time1 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1 2010-03-01 13:02:00 2.03 9.06 6.93 8.78 1.62 6.79 7.30 8.58 7.17 5.71
2 2010-03-01 13:03:00 6.68 6.36 1.32 1.32 7.40 6.46 5.86 9.22 7.48 2.56
3 2010-03-01 13:04:00 2.68 1.26 7.54 9.32 2.20 5.83 2.58 2.33 9.13 6.77
4 2010-03-01 13:05:00 9.82 9.78 2.73 6.68 8.59 8.18 4.59 3.84 6.65 9.28
5 2010-03-01 13:06:00 9.47 5.97 5.18 7.32 2.66 2.62 9.14 2.81 1.60 3.29
6 2010-03-01 13:07:00 9.11 4.92 2.03 3.10 3.77 9.96 7.30 8.19 6.06 6.32
7 2010-03-01 13:08:00 4.46 1.42 9.97 8.96 7.16 2.21 1.72 9.08 3.45 6.27
8 2010-03-01 13:09:00 7.43 8.99 9.47 5.66 1.43 4.34 9.94 7.62 1.34 8.55
9 2010-03-01 13:10:00 6.10 2.09 6.47 3.83 7.05 9.65 4.84 1.34 6.14 6.63
The second time series are on 5 min interval
time2<-seq(from=as.POSIXct("2010-03-01 13:00"),to=as.POSIXct("2010-03-01 13:10"),by="5 min")
value2<-round(matrix(runif(6,1,10),3,2),2)
data2<-data.frame(time2,value2)
data2
time2 X1 X2
1 2010-03-01 13:00:00 4.55 3.93
2 2010-03-01 13:05:00 4.05 8.04
3 2010-03-01 13:10:00 6.87 7.93
The preferred output would combine the two time series on the 5 minute interval so I would have a new data frame:
1 2010-03-01 13:05:00 4.05 8.04 9.82 9.78 2.73 6.68 8.59 8.18 4.59 3.84 6.65 9.28
2 2010-03-01 13:10:00 6.87 7.93 6.10 2.09 6.47 3.83 7.05 9.65 4.84 1.34 6.14 6.63
For sure there is simple function for this but I'm unable to find it.
Upvotes: 1
Views: 2957
Reputation: 20362
# MERGE DATA SETS IF ELEMENTS ARE THE SAME IN COLUMN A
require(reshape2)
# Set WD
setwd("C:\\your_path_here\\")
# Load data from all CSV files
NumOne <- c("Book1.csv")
NumOneData <- as.matrix((read.csv(NumOne,header=T,sep=",",check.names = F)))
NumTwo = c("Book2.csv")
NumTwoData <- as.matrix((read.csv(NumTwo,header=T,sep=",",check.names = F)))
myfulldata1 = merge(NumOneData, NumTwoData)
write.table(myfulldata1, file = "C:\\your_path_here\\foo1.csv", sep = ",", col.names = NA, qmethod = "double")
NumOneData$CONTRIBUTING_GROUP <- match(NumOneData$CONTRIBUTING_GROUP, NumTwoData$CONTRIBUTING_GROUP, nomatch=0)
write.table(nomatch, file = "C:\\your_path_here\\foo2.csv", sep = ",", col.names = NA, qmethod = "double")
merged.data.all3 <- merge(NumOneData, NumTwoData, by="CONTRIBUTING_GROUP", all=TRUE)
write.table(merged.data.all3, file = "C:\\your_path_here\\foo3.csv", sep = ",", col.names = NA, qmethod = "double")
# mydata <- merge(df1, df2, by=c("country","year"))
# When common ids have different names use by.x and by.y to match them. R will keep the name of the first dataset (by.x)
# When different variables from two different dataset have the same name, R will assign a suffix .x or .y to make them unique and to identify which dataset they are coming from.
# mydata <- merge(mydata1, mydata2, by.x=c("country","year"), by.y=c("nations","time"))
# Note: The default setting of the merge() function drops all unmatched cases. If you want to keep all cases in the new data set, include the option all=TRUE in the merge() function:
# merged.data.all <- merge(dataset1, dataset2, by="countryID", all=TRUE)
You should probably keep these in a safe place as well.
Inner join: merge(df1, df2) will work for these examples because R automatically joins the frames by common variable names, but you would most likely want to specify merge(df1, df2, by = "CustomerId") to make sure that you were matching on only the fields you desired. You can also use the by.x and by.y parameters if the matching variables have different names in the different data frames.
Outer join: merge(x = df1, y = df2, by = "CustomerId", all = TRUE)
Left outer: merge(x = df1, y = df2, by = "CustomerId", all.x = TRUE)
Right outer: merge(x = df1, y = df2, by = "CustomerId", all.y = TRUE)
Cross join: merge(x = df1, y = df2, by = NULL)
All those concepts come from here.
How to join (merge) data frames (inner, outer, left, right)?
Upvotes: 1
Reputation: 13580
If the columns used for merging have different names, you need to specify them using by.x
(first table) and by.y
(second table). Check ?merge
for more details.
merge(data2, data, by.x = "time2", by.y = "time1")
Output
time2 X1.x X2.x X1.y X2.y X3 X4 X5 X6 X7 X8 X9 X10
1 2010-03-01 13:05:00 1.53 8.01 9.17 7.18 2.91 5.34 4.70 7.59 5.67 5.31 9.03 7.81
2 2010-03-01 13:10:00 6.78 8.18 6.66 9.93 1.12 7.02 5.77 3.20 5.13 8.55 4.91 2.29
Data
set.seed(1)
time1<-seq(from=as.POSIXct("2010-03-01 13:02"),to=as.POSIXct("2010-03-01 13:10"),by="1 min")
value<-round(matrix(runif(90,1,10),9,10),2)
data<-data.frame(time1,value)
data
time1 X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1 2010-03-01 13:02:00 3.39 1.56 4.42 4.44 8.15 8.10 1.64 3.99 4.12 7.41
2 2010-03-01 13:03:00 4.35 2.85 8.00 8.83 1.97 1.21 1.90 6.86 4.00 4.60
3 2010-03-01 13:04:00 6.16 2.59 9.41 4.06 7.51 5.30 3.85 3.32 5.29 3.93
4 2010-03-01 13:05:00 9.17 7.18 2.91 5.34 4.70 7.59 5.67 5.31 9.03 7.81
5 2010-03-01 13:06:00 2.82 4.46 6.87 6.40 8.39 7.23 6.96 7.90 8.78 2.82
6 2010-03-01 13:07:00 9.09 7.93 2.13 5.44 6.82 5.30 4.66 1.76 4.51 7.40
7 2010-03-01 13:08:00 9.50 5.48 3.40 2.68 8.05 8.75 9.22 8.88 8.00 2.10
8 2010-03-01 13:09:00 6.95 7.46 4.48 8.45 5.98 4.94 3.64 4.05 9.65 3.21
9 2010-03-01 13:10:00 6.66 9.93 1.12 7.02 5.77 3.20 5.13 8.55 4.91 2.29
time2<-seq(from=as.POSIXct("2010-03-01 13:00"),to=as.POSIXct("2010-03-01 13:10"),by="5 min")
value2<-round(matrix(runif(6,1,10),3,2),2)
data2<-data.frame(time2,value2)
data2
time2 X1 X2
1 2010-03-01 13:00:00 3.16 8.89
2 2010-03-01 13:05:00 1.53 8.01
3 2010-03-01 13:10:00 6.78 8.18
Upvotes: 2
Reputation: 1937
a simple merge will do
time1<-seq(from=as.POSIXct("2010-03-01 13:02"),to=as.POSIXct("2010-03-01 13:10"),by="1 min")
value<-round(matrix(runif(90,1,10),9,10),2)
data<-data.frame(time=time1,value)
time2<-seq(from=as.POSIXct("2010-03-01 13:00"),to=as.POSIXct("2010-03-01 13:10"),by="5 min")
value2<-round(matrix(runif(6,1,10),3,2),2)
data2<-data.frame(time=time2,value2)
merge(data2,data,by="time")
Upvotes: 0