Reputation: 59
I have asked this question before but haven't found an answer yet. I am trying to create a bar graph in SAS that shows the percentage of patients who received a test, by category, and within each bar shows where the tests were received (location). My dataset looks like this:
Category           Test  Test_location
High Risk          1     Site 1
Intermediate Risk  1     Site 2
Low Risk           0     .
Intermediate Risk  0     .
High Risk          1     Site 3
Each patient is listed with the risk classification they have been assigned to (variable 'Category'), an indicator variable that shows whether or not they received a test (variable 'Test', where 1 = received test and 0 = did not receive test) and, if they received a test, where that test took place (variable 'Test_location').
I want to create a bar graph with the categories on the x axis and the y axis showing the percentage of patients who got a test (test=1), with each bar shaded to show the composition by location of the patients who got a test in that category (i.e., how many tests occurred at Site 1, 2 and 3).
I have the code below, but it is not giving me the percentages that I want. It gives me the equivalent of a pct_col output of test*category, and I want pct_row. In other words, I want the y axis to measure the percentage of patients with testing out of the total number of patients in each category, not out of all patients who received testing in any category, which is what it is giving me.
Example of what I want: in the dummy dataset below, for high risk patients, I want a bar that shows 75% received tests (12 patients with tests out of the total 16 high risk patients), and then have the bar shaded to show that 41.67% of those tests were at Site 1 (5 of 12), 33.33% at Site 2 (4 of 12) and 25% at Site 3 (3 of 12). And so on for the intermediate and low risk categories. If there is a way to label the subsections with the exact percentages, that would be great too.
Dummy data set:
data test;
infile datalines missover;
/* :$12. leaves room for "Intermediate"; a plain $ would truncate it to 8 characters */
input ID Category :$12. Test Test_location $;
datalines;
1 High 1 Site_1
2 High 1 Site_1
3 High 1 Site_1
4 High 1 Site_1
5 High 1 Site_1
6 High 1 Site_2
7 High 1 Site_2
8 High 1 Site_2
9 High 1 Site_2
10 High 1 Site_3
11 High 1 Site_3
12 High 1 Site_3
13 High 0
14 High 0
15 High 0
16 High 0
17 Intermediate 1 Site_1
18 Intermediate 1 Site_1
19 Intermediate 1 Site_2
20 Intermediate 0
21 Intermediate 0
22 Intermediate 0
23 Intermediate 0
24 Intermediate 0
25 Intermediate 0
26 Low 1 Site_1
27 Low 1 Site_1
28 Low 1 Site_1
29 Low 1 Site_2
30 Low 1 Site_2
31 Low 1 Site_2
32 Low 1 Site_3
33 Low 0
34 Low 0
35 Low 0
36 Low 0
37 Low 0
38 Low 0
;
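As a cross-check on the target numbers described above, a PROC FREQ sketch along these lines should reproduce the row percentages. It assumes the dummy dataset test has been created as shown; the WHERE filter in the second step is my own way of restricting to tested patients and is not part of the plotting code.

```sas
/* Row percentages: share of each category with test=1 vs. test=0. */
/* For High this should show 75% for test=1 (12 of 16).            */
proc freq data=test;
    tables category*test / nocol nopercent;
run;

/* Among tested patients only: share of tests at each site, per category. */
/* For High: Site_1 5 of 12, Site_2 4 of 12, Site_3 3 of 12.              */
proc freq data=test;
    where test = 1;
    tables category*test_location / nocol nopercent;
run;
```

The first table gives the bar heights I am after, and the second gives the composition of each bar.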
Thank you!
EDIT:
Here is a sample graph of what I am looking to output in SAS (using the dummy data above):
Using this code:
proc sgplot data=test pctlevel=graph;
vbar category / response=test stat=percent
group=test_location groupdisplay=stack datalabel;
keylegend /title="Testing Location" position=bottom;
quit;
I get this output:
So what I have is not giving me the correct denominators for my percentages. I also couldn't figure out a way to label the individual subsections of the graph like I have in my sample figure.
Thank you!
Upvotes: 3
Views: 4363
Reputation: 3136
You can get exactly what you want with a bit of data step work and some formatting, although that would differ somewhat from your working code. As others have pointed out, there are many useful examples at Robert Allison's site.
I'd go with the simple solution below, which is almost exactly what you asked for, and very close to your working code. The main difference is that the missing values are their own category.
The key lines are:
pctlevel=group
missing
Here is the code:
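proc sgplot data = test
    pctlevel = group              /* key option: changes the level at which percentages are computed */
    ;
    vbar category /
        stat = percent
        group = test_location
        grouporder = data
        missing                   /* key option: untested patients (missing location) get their own segment */
        seglabel                  /* label each bar segment */
    ;
    keylegend /
        title = "Testing Location"
        position = bottom
    ;
run;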
I get:
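If the segment for missing locations is unwanted, the pre-summarizing route mentioned at the top could look roughly like the sketch below. This is an assumption-laden illustration rather than tested production code: the table names cat_n and site_pct and the variables n_total and pct are hypothetical names introduced here, and seglabel needs a reasonably recent SAS 9.4 maintenance release.

```sas
proc sql;
    /* Hypothetical helper table: total patients per category (the denominator) */
    create table cat_n as
    select category, count(*) as n_total
    from test
    group by category;

    /* Hypothetical summary table: tested patients at each site as a share */
    /* of ALL patients in the category                                     */
    create table site_pct as
    select t.category,
           t.test_location,
           count(*) / c.n_total as pct format=percent8.1
    from test as t
         inner join cat_n as c
         on t.category = c.category
    where t.test = 1
    group by t.category, t.test_location, c.n_total;
quit;

proc sgplot data=site_pct;
    /* One row per category and site, so the default sum of pct is just pct */
    vbar category / response=pct
                    group=test_location
                    groupdisplay=stack
                    seglabel
                    datalabel;
    yaxis label="Patients tested";
    keylegend / title="Testing Location" position=bottom;
run;
```

With the dummy data, the High bar should stack to 75% (segments of 5/16, 4/16 and 3/16). Note that the segment labels here are shares of all patients in the category, not shares of the tests; labelling segments with the within-test shares (5/12 and so on) would need a separate label variable.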
Upvotes: 4