Rods2292
Rods2292

Reputation: 675

Transposing rows to columns in SAS or SQL

I'm a new SAS/SQL user and I have a dataset which I need to transpose some rows to columns. I think that there is a faster or easier way to do that and I'd like to some advice to you all. My example will better explain my issue:

Here is the dataset I have:

Month   ID     Car      Claim_Type   Cost_of_claim
  1    1243   Ferrari    Collision      12,000
  2    6437   Peugeot    Fire           50,000
  5    0184   Citroen    Stole           3,000
  9    1930   Fiat       Medical         1,000
  3    2934   GM         Liability      20,000

And I need to create a dataset like that:

Month   ID     Car    Collision   Fire    Stole   Medical Liability
1    1243   Ferrari    12,000       0       0       0         0 
2    6437   Peugeot       0      50,000     0       0         0         
5    0184   Citroen       0         0      3,000    0         0
9    1930   Fiat          0         0       0     1,000       0
3    2934   GM            0         0       0       0      20,000

I just transposed some rows to columns...

I was thinking in do something like that to create my new dataset:

proc sql;
select Month, ID, CAR
  case when Claim_Type = 'Collision' then Cost_of_claim end Collision,
  case when Claim_Type = 'Fire'      then Cost_of_claim end Fire,
  case when Claim_Type = 'Stole'     then Cost_of_claim end Stole,
  case when Claim_Type = 'Medical'   then Cost_of_claim end Medical,
  case when Claim_Type = 'Liability' then Cost_of_claim end Liability
from my_table;

The problem is that a have a huge amount of data and I think that this way might be not too efficient. Also, in my dataset I have much more columns and rows and don't want to type all the possibilities in case when statements as it does not seems to be to easy (or user-friendly) to maintain the code.

Can someone help me in how to solve that?

Upvotes: 1

Views: 15238

Answers (3)

dj_paige
dj_paige

Reputation: 353

PROC TRANSPOSE ought to do what you want.

data test;
  input Month   ID     Car $     Claim_Type : $12. Cost_of_claim;
  cards;
  1    1243   Ferrari    Collision      12000
  2    6437   Peugeot    Fire           50000
  5    0184   Citroen    Stole           3000
  9    1930   Fiat       Medical         1000
  3    2934   GM         Liability      20000
run;

proc transpose data=test out=transposed;
  by notsorted month notsorted id notsorted car;
  var cost_of_claim;
  id claim_type;
run;

The output data set doesn't have the zeros off-diagonal but you can add those in a data step if you really want them.

Upvotes: 4

Sean
Sean

Reputation: 1120

This method populates a macro variable with all possible claim_types and loops through them, generating variables in the same way that your example code does, so you won't need to enter all possible cases. The "backstop" variable is used because of the comma in the loop (SAS will error without one more variable after the last comma in a proc sql step).

data have;
   input Month ID Car $12. Claim_Type $12. Cost_of_claim;
   datalines;
  1    1243   Ferrari    Collision      12000
  2    6437   Peugeot    Fire           50000
  5    0184   Citroen    Stole           3000
  9    1930   Fiat       Medical         1000
  3    2934   GM         Liability      20000
    ;
run;


%macro your_macro;

    proc sql noprint;
        select distinct claim_type into: list_of_claims separated by " " from have;

        create table want (drop = backstop) as select
            month, id, car,
                %do i = 1 %to %sysfunc(countw(&list_of_claims.));
                %let this_claim = %scan(&list_of_claims., &i.);
                    case when claim_type = "&this_claim." then cost_of_claim else 0 end as &this_claim.,
                %end;
            1 as backstop
        from have;
    quit;

%mend your_macro;

%your_macro;

Upvotes: 0

artm
artm

Reputation: 8584

You can try dynamic sql and pivot but performance will depend on how many different claim types you have.

create table #mytable (Month int, ID int, Car varchar(20), Claim_Type varchar(20),  Cost_of_claim int)

insert into #mytable values 
(1, 1243, 'Ferrari', 'Collision', 12000)
, (2, 6437, 'Peugeot', 'Fire', 50000)
, (5, 184, 'Citroen', 'Stole', 3000)
, (9, 1930, 'Fiat', 'Medical', 1000)
, (3, 2934, 'GM', 'Liability', 20000)
, (12, 4455, 'Ford', 'Theft', 20)


DECLARE @cols AS NVARCHAR(MAX),
    @query  AS NVARCHAR(MAX)

select @cols = STUFF((SELECT ',' + QUOTENAME(Claim_Type) 
                    from #mytable
                    group by Claim_Type
                    order by Claim_Type
            FOR XML PATH(''), TYPE
            ).value('.', 'NVARCHAR(MAX)') 
        ,1,1,'')

set @query = N'SELECT ' + 'month,id,car,' + @cols + N' from 
             (
                select month,id, car, Cost_of_claim, Claim_Type
                from #mytable               
            ) x
            pivot 
            (
                max(Cost_of_claim)
                for Claim_Type in (' + @cols + N')
            ) p 
            '

exec sp_executesql @query;

drop table #mytable

Upvotes: 0

Related Questions