eee333
eee333

Reputation: 51

Proc sql subtotals and subqueries

I am very new to Proc sql and feel like I'm hitting my head against a wall. Basically, I want to make some different types of subtotals, and I've been struggling. I can figure out how to do it in two sql statements, but I can't figure out how to combine them into one statement using a subquery, which seems like it should be possible.

Sample data

*job, gender
*florist, female
*florist, female
*florist, male
*florist, male
*Manager, female
*Manager, female
*Manager, female
*Manager, male
*nurse, female
*nurse, female
*nurse, male

What I want to wind up getting from my code is:

Job Title|  Gender  | All answers to gender | Job by gender
Florist  |  female  | 7                     | 2
Florist  |  male    | 4                     | 2
Manager  |  female  | 7                     | 3
Manager  |  male    | 4                     | 1
nurse    |  female  | 7                     | 2
nurse    |  male    | 4                     | 1

My current code (which does it in two steps) is

proc sql;
    create table newdata as
          select gender, job,
            count(gender) as GenderCount
          from datasource where gender in ('f','m')
          group by gender;

     select * from newdata;
quit

proc sql;
     create table newdata2 as 
            select *,
             count(gender) as JobsByGender
            from newdata
            group by q1, q21;

      select distinct * from newdata2 order by q21, q1;
      quit;

Upvotes: 1

Views: 2090

Answers (1)

J Petersen
J Petersen

Reputation: 162

You can join a subquery to get your expected result:

proc sql;
    create table newdata as
        select bd.job,
            bd.gender,
            sq.count_gender,
            count(bd.job) as JobsByGender
        from basedata bd
        inner join
            (select gender,
                 count(gender) as count_gender
             from base_data
             group by gender) sq
        on sq.gender = bd.gender
        group by bd.job, bd.gender;
quit;

The subquery creates a dataset that looks like

Gender | count_gender
---------------------
Male   | 4
Female | 7

The inner join matches up the correct number to each gender, as specified by on sq.gender = bd.gender

Honestly, most of this is just generic SQL. If you are going to be doing more proc sql statements in the future, I would highly suggest going over the tutorials that can be found at http://www.sqlcourse.com/ and http://www.sqlcourse2.com/

These will hopefully let you understand the query you are writing rather than using sample code you don't understand.

Upvotes: 1

Related Questions