SimoneL
SimoneL

Reputation: 13

Replace values from one existing dataset with values from another existing dataset

I have dataset1 that I want to replace the current values for variableX with the values for that same variable from dataset2. I want the rest of dataset1 to remain the same.

How do I do this?

Also, dataset1 has 89 observations and dataset2 has 9098 observations. I only need the first 89 observations from dataset2.

thank you

I tried the following

libname data '\\filepath\dataset1folder';
libname data2 '\\filepath\dataset2folder';

data data.dataset1;
    set data2.dataset2;
    CWEIGHT = CWEIGHT;
RUN;

Upvotes: 0

Views: 480

Answers (2)

whymath
whymath

Reputation: 1394

You can use the merge statement and in= dataset option.

data want;
  merge data.dataset1(in=_in1_) data2.dataset2(keep=variableX);
  if _in1_=1;
run;

Image the datasets are boxes, the merge statement just places these boxes next to each other horizontally and aligns their tops. If there are same name variables in different datasets, the former one will be rewrite by the following one.
Following a dataset, the in= option create a 0-1 variable, _in1_ in this example, to indicate if one observation comes from this dataset or not. Since you only want to observations from dataset1, you can use if _in1_=1 to subset the result of merging.

Upvotes: 0

mjsqu
mjsqu

Reputation: 5452

Your requirements:

  • Replace the current values in dataset1 for variableX with the values for that same variable from dataset2. I want the rest of dataset1 to remain the same.
  • I only need the first 89 observations from dataset2.

You need to obtain the first 89 observations from dataset2 and only variableX. This can be achieved using KEEP= and OBS=:

set data2.dataset2 (KEEP=variableX OBS=89);

And you want to update dataset1 using this data, so you need to use a datastep that will read and write dataset1, while also incorporating the above:

libname data '\\filepath\dataset1folder';
libname data2 '\\filepath\dataset2folder';

data data.dataset1;
    /* The data step is a loop - this statement loads the first row of dataset1 */
    set data.dataset1;
    /* This loads the first row of dataset2, but only variableX - if variableX is present in dataset1 then it is overwritten */
    set data2.dataset2 (KEEP=variableX OBS=89);
RUN;

Upvotes: 0

Related Questions