Jeff M
Jeff M

Reputation: 23

SAS Data Conversion

Can someone help translate this PROC SQL code to a Data Step? I am hoping that the Data Step will be much faster. I generated this code from SAS-EG so I am not proficient enough yet.

Thanks Jeff

PROC SQL;
  CREATE TABLE WORK.'APPAREL NEWNESS 01'n AS 
  SELECT DISTINCT t2.outletfamily AS outletfamily, 
      t2.itemid, 
      t2.itemnumberunsuppressed AS itemnumber_unsuppressed, 
      t2.ppmonth, 
      /* Dollars */
        (SUM(t2.totalvalue)) FORMAT=DOLLAR20. AS Dollars, 
      /* Units */
        (SUM(t2.unitssold)) FORMAT=COMMA20. AS Units, 
      /* Proj Dollars */
        (SUM(t2.proj_totalvalue)) FORMAT=DOLLAR20. AS 'Proj Dollars'n, 
      /* Proj Units */
        (SUM(t2.proj_unitssold)) FORMAT=COMMA20. AS 'Proj Units'n
  FROM APLSWP.vw_aplmlwk_fact_nat_nc_uns t2
  GROUP BY t2.outletfamily,
           t2.itemid,
           t2.itemnumberunsuppressed,
           t2.ppmonth;
  QUIT;

Upvotes: 0

Views: 54

Answers (2)

Reeza
Reeza

Reputation: 21294

I would suggest using PROC MEANS instead. Without sample data I can't be sure this is exactly what you're looking for but should get you close:

  proc means data=APLSWP.vw_aplmlwk_fact_nat_nc_uns noprint nway;
  *CLASS = grouping variables;
  class outletfamily itemid itemnumberunsuppressed ppmonth;
  *VAR = list of variables to analyze (optional);
  var totalValue UnitSold Proj_Total_value proj_unitssold;
  *create output data set;
  output out=want 
    /*list of summary statistics to save*/
    sum(totalValue) = dollars 
    sum(UnitSold) = units 
    sum(Proj_Total_value) = 'Proj Dollars'n 
    sum(proj_unitssold) = 'Proj Units'n;
  run;

Upvotes: 0

Vojta F
Vojta F

Reputation: 564

  • The functionality of GROUP BY and SUM are available in proc summary or in a simlar, yet much richer proc means. Unfortunately I do not have a SAS licence available so can't provide you with a code. Look for an example of your problem solved in proc means here.
  • When using GROUP BY var1, var2, var3, var4, the procedure produces distinct combinations of values of var1, var2, var3 and var4. The keyword DISTINCT is unnecessary though.
  • The performance could be affected by where the data resides vs. where you run the query. If the query is in an external database (e.g. Oracle) you might benefit from looking at so-called "pass-through facility". This allow SAS code run directly in the DB and utilize its indexes, statistics and all the DB features that make databases powerful.

Upvotes: 2

Related Questions