mabe
mabe

Reputation: 13

Merging two datasets conditional on 3 variables in SAS

I am quite new to using SAS and would like to ask for your help on merging two datasets that do not contain one key identifier but require a combination of three conditions to be merged successfully.

I have two datasets:

The goal is to keep the first dataset and add information from the second dataset in case it is available - based on a match of date, employee number and customer number.

It is important that the date variable in the activity dataset is just on day while in the visits dataset it is a range between start and end date. So, the data sets look like this: Base datasets

Logically, it should be a merge based on three conditions:

  1. IF EmployeeNumber = EmployeeNumber
  2. IF CustomerNumber = CustomerNumber
  3. IF Date is between StartDate and EndDate
    Then: Update visits dataset with information from activity dataset

So, the ideal new dataset would look like this: Intended result of merge
If there is no entry in the activity dataset, the new data set should simply contain a blank for the respective variables. For all matches, it should add the respective variable.

When trying (rather simple) merge statements in SAS, I usually get an error message stating that "BY variables are not properly sorted on data set". Furthermore, I am not sure how to implement the conditions (in particular the match of date to a range) properly.

Can this sort of merge be done in SAS or SQL?
Any help is greatly appreciated!!!

Upvotes: 0

Views: 1731

Answers (1)

Juha K
Juha K

Reputation: 324

I tend to use proc sql for this kind of stuff, it's just a relatively simple join in which SQL excels. Try something like this:

proc sql;

create table merged_data as
  select a.*, b.other_variable, b.other_variable2
  from dataset1 as a
  left join dataset2 as b 
    on a.EmployeeNumber = b.EmployeeNumber 
    and a.CustomerNumber = b.CustomerNumber
    and b.Date between a.StartDate and a.EndDate
;

quit;

Upvotes: 0

Related Questions