Aman kashyap
Aman kashyap

Reputation: 143

Splitting a Column into two based on condtions in Proc Sql ,SAS

I want to Split the airlines column into two groups and then Add each group 's amount for all clients... : -

Group 1 = Air India & jet airways | Group 2 = Others.

Loc  Client_Name     Airlines           Amout
BBI   A_1ABC2        Air India          41302
BBI   A  1ABC2       Air India          41302
MAA   Th 1ABC2       Spice Jet Airlines 288713
HYD   Ma 1ABC2       Jet Airways        365667
BOM   Vi 1ABC2       Air India          552506

Something like this: -

Rank    Client_name Group1  Group2  Total   
1       Ca 1ABC2    5266269 7040320 1230658 
2       Ve 1ABC2    2815593 2675886 5491479 
3       Ma 1ABC2    1286686 437843  1724529 
4       Th 1ABC2    723268  701712  1424980 
5       Ec 1ABC2    113517  627734  741251  
6       A  1ABC2    152804  439381  592185  

I grouped it first ..but i am confused regarding how to split: -

Data assign6.Airlines_grouping1;
Set assign6.Airlines_grouping;

if Scan(Airlines,1) IN ('Air','Jet') then Group = "Group1";
else
if Scan(Airlines,1) Not in('Air','Jet') then Group = "Group2";

Run;

Upvotes: 2

Views: 1143

Answers (1)

Richard
Richard

Reputation: 27508

You are categorizing a row based on the first word of the airline.

Proc TRANSPOSE with an ID statement is one common way to reshape data so that a categorical value becomes a column. A second way is to bypass the categorization and use a data step to produce the new shape of data directly.

Here is an example of the second way -- create new columns group1 and group2 and set value based on airline criteria.

data airlines_group_amounts;
  set airlines;

  if scan (airlines,1) in ('Air', 'Jet') then
    group1 = amount;
  else
    group2 = amount;
run;

summarize over client

proc sql;
  create table want as
  select
    client_name
  , sum(group1) as group1
  , sum(group2) as group2
  , sum(amount) as total
  from airlines_group_amounts
  group by client_name
;

You can avoid the two steps and do all of the processing in a single query, or you can do the summarization with Proc MEANS

Here is a single query way.

proc sql;
  create table want as
  select
    client_name
  , sum(case when scan (airlines,1) in ('Air', 'Jet') then amount else 0 end) as group1
  , sum(case when scan (airlines,1) in ('Air', 'Jet') then 0 else amount end) as group2
  , sum(amount) as total
  from airlines
  group by client_name
;

Upvotes: 3

Related Questions