Wasabi
Wasabi

Reputation: 3061

Wrong column on data.table merge

Let's say I have these two tables:

library(data.table)

x <- data.table(Date = as.Date(c("1990-01-29", "1990-02-30",
                                 "1990-01-31", "1990-02-01",
                                 "1990-02-02", "1990-02-05",
                                 "1990-02-06", "1990-02-07",
                                 "1990-02-08", "1990-02-09")),
                a = c(1, 1, 2, 3, 5, 8, 13, 21, 34, 55))

y <- data.table(Date1 = as.Date(c("1990-01-31", "1990-02-06", "1990-02-07")),
                Date2 = as.Date(c("1990-02-06", "1990-02-07", "1990-02-09")),
                b = c(5, 2, 4))

Table y is really a descriptor of different "periods" starting at Date1 and ending at Date2 (such that one row's Date2 is the next row's Date1), with a (non-unique) descriptor of that period.

I'd now like to merge these tables, such that for each date of x have both a and the respective y$b (dates outside of the period should be dropped). I tried the following, but it's not right:

x[y, on = .(Date > Date1, Date <= Date2)]
#          Date  x     Date.1 y
# 1: 1990-01-31  3 1990-02-06 5
# 2: 1990-01-31  5 1990-02-06 5
# 3: 1990-01-31  8 1990-02-06 5
# 4: 1990-01-31 13 1990-02-06 5
# 5: 1990-02-06 21 1990-02-07 2
# 6: 1990-02-07 34 1990-02-09 4
# 7: 1990-02-07 55 1990-02-09 4

Specifically, the Date column isn't x$Date, but actually y$Date1, repeated as necessary, while the Date.1 column is Date2.

Meanwhile, the expected output would be

#          Date  x y
# 1: 1990-02-01  3 5
# 2: 1990-02-02  5 5
# 3: 1990-02-05  8 5
# 4: 1990-01-06 13 5
# 5: 1990-02-07 21 2
# 6: 1990-02-08 34 4
# 7: 1990-02-09 55 4

Upvotes: 2

Views: 99

Answers (2)

s_baldur
s_baldur

Reputation: 33488

You can refer to the columns of each table using x. and i.

x[y, 
  on = .(Date > Date1, Date <= Date2), 
  .(Date = x.Date, x = x.a, y = i.b)]

         Date  x y
1: 1990-02-01  3 5
2: 1990-02-02  5 5
3: 1990-02-05  8 5
4: 1990-02-06 13 5
5: 1990-02-07 21 2
6: 1990-02-08 34 4
7: 1990-02-09 55 4

Upvotes: 0

akrun
akrun

Reputation: 887118

It may be better to create a duplicate column

x[,.(Daten = Date, Date, a)][y,
      on = .(Date > Date1, Date <= Date2)][, .(Date = Daten, a, b)]
#         Date  a b
#1: 1990-02-01  3 5
#2: 1990-02-02  5 5
#3: 1990-02-05  8 5
#4: 1990-02-06 13 5
#5: 1990-02-07 21 2
#6: 1990-02-08 34 4
#7: 1990-02-09 55 4

Upvotes: 2

Related Questions