Reputation: 131
I have a Summary_Table that has Store, Product, and Date. And I need to create a 4th column that counts the number of observations in the Records_table that meets multiple criteria per row of the Summary_Table.
This is my summary table.
> Summary_Table
Store Product Date
Chain1 ABC 1/1/2007
Chain1 ABC 1/1/2008
Chain1 ABC 1/1/2009
Chain1 ABC 1/1/2010
Chain1 DEF 1/1/2007
Chain1 DEF 1/1/2008
Chain1 DEF 1/1/2009
Chain1 DEF 1/1/2010
Chain2 ABC 1/1/2007
Chain2 ABC 1/1/2008
Chain2 ABC 1/1/2009
Chain2 ABC 1/1/2010
Chain2 DEF 1/1/2007
Chain2 DEF 1/1/2008
Chain2 DEF 1/1/2009
Chain2 DEF 1/1/2010
And this is my Records Table
> Records_Table
Store Product Eval_Date
Chain1 DEF 2/13/2008
Chain2 ABC 1/15/2009
Chain1 DEF 5/13/2009
Chain2 ABC 3/15/2007
Chain1 DEF 5/14/2010
Chain2 DEF 10/10/2010
Chain1 ABC 11/1/2007
Chain2 ABC 10/1/2008
Chain2 DEF 6/1/2009
Chain2 DEF 7/1/2010
Chain2 ABC 8/1/2009
Chain1 ABC 2/1/2009
So I need to create a 4th column in Summary table that counts the number of observations in Records table. That meets the following criteria per row of the summary table.
Store should be equal, Product should be equal, And for the dates; if the (Records_Table$Eval_Date) is greater than (Summary_Table$Date).
So my desired output in the Summary_Table is
> Summary_Table
Store Product Date Count
Chain1 ABC 1/1/2007 2
Chain1 ABC 1/1/2008 1
Chain1 ABC 1/1/2009 1
Chain1 ABC 1/1/2010 0
Chain1 DEF 1/1/2007 3
Chain1 DEF 1/1/2008 3
Chain1 DEF 1/1/2009 2
Chain1 DEF 1/1/2010 1
Chain2 ABC 1/1/2007 4
Chain2 ABC 1/1/2008 3
Chain2 ABC 1/1/2009 2
Chain2 ABC 1/1/2010 0
Chain2 DEF 1/1/2007 3
Chain2 DEF 1/1/2008 3
Chain2 DEF 1/1/2009 3
Chain2 DEF 1/1/2010 2
Upvotes: 1
Views: 305
Reputation: 2764
You can do left join
using data.table
package-
library(data.table)
setDT(recdt)[,Eval_Date:=as.Date(Eval_Date,format="%m/%d/%Y")]
setDT(sumdt)[,Date:=as.Date(Date,format="%m/%d/%Y")]
recdt[sumdt,.N, on=c("Store","Product","Eval_Date>Date"), by=.EACHI]
Output-
Store Product Eval_Date N
1: Chain1 ABC 2007-01-01 2
2: Chain1 ABC 2008-01-01 1
3: Chain1 ABC 2009-01-01 1
4: Chain1 ABC 2010-01-01 0
5: Chain1 DEF 2007-01-01 3
6: Chain1 DEF 2008-01-01 3
7: Chain1 DEF 2009-01-01 2
8: Chain1 DEF 2010-01-01 1
9: Chain2 ABC 2007-01-01 4
10: Chain2 ABC 2008-01-01 3
11: Chain2 ABC 2009-01-01 2
12: Chain2 ABC 2010-01-01 0
13: Chain2 DEF 2007-01-01 3
14: Chain2 DEF 2008-01-01 3
15: Chain2 DEF 2009-01-01 3
16: Chain2 DEF 2010-01-01 2
Upvotes: 1
Reputation: 1248
I'm assuming that the date column in your summary table is a grouping by year. If so, using lubridate and dplyr:
library(dplyr)
library(lubridate)
Records <- Records %>%
mutate(Eval_Date = mdy(Eval_Date), Year = year(Eval_Date))
summary <- Records %>%
group_by(Store, Product, Year) %>%
summarise(Count = n()) %>%
Upvotes: 0