Reputation: 3189
I have two data frames with two columns each. The first column is timestamps and the second contains some values. One of the data frames is much bigger than the other one but both of them contains data in the same timestamp range.
If I plot these two on top of each other, I will get a nice plot showing how they differ in time.
Now I would like to get the absolute difference by time of these two dataframes to make a another plot showing how much they differ (or to create a boxplot with this information) even though they do not have the same length and exact matching timestamps.
Check this example:
df1:
timestamp | data
1334103075| 1.2
1334103085| 1.5
1334103095| 0.9
1334103105| 0.7
1334103115| 1.1
1334103125| 0.8
df2:
timestamp | data
1334103078| 1.2
1334103099| 1.5
1334103123| 0.8
1334103125| 0.9
How would I achieve something like this:
df3 <- abs(df1-df2)
As you see df2 might not have the same timestamps as df1, but they both have timestamps in the same time range.
Of course the subtraction should try to match timestamps or subtract values from timestamp averages that they are near.
Upvotes: 4
Views: 1673
Reputation: 89057
I would suggest using two linear interpolators and evaluate both of them on the union of your two sets of timestamps:
df1 <- data.frame(timestamp = c(1334103075, 1334103085, 1334103095,
1334103105, 1334103115, 1334103125),
data = c(1.2, 1.5, 0.9, 0.7, 1.1, 0.8))
df2 <- data.frame(timestamp = c(1334103078, 1334103099, 1334103123,
1334103125),
data = c(1.2, 1.5, 0.8, 0.9))
library(Hmisc)
all.timestamps <- sort(unique(c(df1$timestamp, df2$timestamp)))
data1 <- approxExtrap(df1$timestamp, df1$data, all.timestamps)$y
data2 <- approxExtrap(df2$timestamp, df2$data, all.timestamps)$y
df3 <- data.frame(timestamp = all.timestamps,
data1 = data1,
data2 = data2,
abs.diff = abs(data1 - data2))
df3
# timestamp data1 data2 abs.diff
# 1 1334103075 1.20 1.157143 0.04285714
# 2 1334103078 1.29 1.200000 0.09000000
# 3 1334103085 1.50 1.300000 0.20000000
# 4 1334103095 0.90 1.442857 0.54285714
# 5 1334103099 0.82 1.500000 0.68000000
# 6 1334103105 0.70 1.325000 0.62500000
# 7 1334103115 1.10 1.033333 0.06666667
# 8 1334103123 0.86 0.800000 0.06000000
# 9 1334103125 0.80 0.900000 0.10000000
Then you could consider fitting splines if you are not quite happy with linear approximations.
Upvotes: 3