Reputation: 39
I am working on a SAS problem where I need to append data. The program runs successfully, but it creates duplicates every time I run it.
Please check my code and screenshot of the table:
Question: Create a new file "Total_Sales" by appending data file "Hyundai" with the file first created in problem 3.
/* Problem 3 */
data avik1.var1;
length uniqueid $50 Manufacturer $ 50 Model $20 Sales_in_thousands 8 _4_year_resale_value 8 Price_in_thousands 8;
retain uniqueid Manufacturer Model Latest_Launch Sales_in_thousands _4_year_resale_value Price_in_thousands;
set avik1.conc(drop= Vehicle_type Engine_size Horsepower Wheelbase Width Length Curb_weight Fuel_capacity Fuel_efficiency );
informat Latest_Launch date9.;
format Latest_Launch ddmmyy10.;
run;
proc print data = avik1.var1;
run;
/* Data To be Appended */
data avik1.hyundai;
length uniqueid $ 50 Manufacturer $ 50 Model $20 Sales_in_thousands 8 _4_year_resale_value 8;
informat Latest_Launch date7. ;
format Latest_Launch ddmmyy10.;
input Manufacturer $ Model $ Sales_in_thousands _4_year_resale_value Latest_Launch;
uniqueid=(Model||Manufacturer);
cards;
Hyundai Tuscon 16.919 16.36 2Feb12
Hyundai i45 39.384 19.875 3Jun11
Hyundai Verna 14.114 18.225 4Jan12
Hyundai Terracan 8.558 29.775 10Mar11
;
run;
Proc Print data = avik1.hyundai;
run;
Now I used the following code to append:
data avik1.total_sales;
set avik1.var1 avik1.hyundai;
proc append base=avik1.var1 new=avik1.hyundai force;
run;
proc print data= avik1.total_sales;
run;
The program runs but gives me duplicates, which you can see in the screenshot (the rows marked in yellow are the duplicates).
I am new to SAS and would really appreciate a response and a solution to this problem. Also, please tell me why this is happening.
Thanks!
Upvotes: 1
Views: 713
Reputation: 475
Did you run it twice? I'm guessing, but that could be the reason you see duplicates. I'll try to explain.
In your append code here, you are creating the new dataset total_sales by combining var1 and hyundai:
data avik1.total_sales;
set avik1.var1 avik1.hyundai;
In the code below, you are not creating a new dataset; you are expanding var1 itself by adding the records from hyundai to it.
proc append base=avik1.var1 new=avik1.hyundai force;
run;
If you run this proc append and then run the first data step again, you will get duplicates of all the hyundai records, because you are taking the EXPANDED var1 and adding the hyundai records to it a second time. Each further run adds one more copy.
So, to answer the original question: the proc append step is totally unnecessary. You already achieved the result with just the data step.
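As a rough sketch of what that could look like, here are two ways to build total_sales without ever modifying var1. Both reuse the dataset names from the question. The second option is only an assumption about how you might use proc append if the exercise requires it; it is not something the question itself asks for. If var1 already contains extra hyundai rows from earlier runs, rerun the Problem 3 data step first so that var1 is rebuilt from avik1.conc.

```sas
/* Option 1: DATA step only.                                   */
/* total_sales is rebuilt from scratch on every run, and var1  */
/* is only read, never changed, so reruns cannot pile up rows. */
data avik1.total_sales;
    set avik1.var1 avik1.hyundai;
run;

/* Option 2 (assumed alternative): PROC APPEND on a fresh copy. */
/* First recreate total_sales as a copy of var1 ...             */
data avik1.total_sales;
    set avik1.var1;
run;

/* ... then append hyundai to that copy, not to var1.           */
/* NEW= (as used in the question) is an alias for DATA=.        */
proc append base=avik1.total_sales data=avik1.hyundai force;
run;

proc print data=avik1.total_sales;
run;
```

In both cases the base of the append is recreated at the start of the run, which is why running the program repeatedly gives the same result each time.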
Upvotes: 2