user2117897

Reputation: 95

Using Hmisc package (R) to produce proc report (SAS) like output?

I have been trying to use the Hmisc package to produce output similar to below.

                                            Group
Step      Method                   G1           G2         G3 .......   

s1          m1          N          45            26         17
                       Min          2             2         3
                       Max          7             6         4
                       Mean         3.5          4.5        2.5
                       Sdev         2.6          3.6         1

            m2          N          
                       Min          
                       Max          
                       Mean        
                       Sdev  

s2          m1          N          
                       Min          
                       Max        
                       Mean         
                       Sdev        

            m2          N          
                       Min          
                       Max          
                       Mean        
                       Sdev    

My raw data looks like below.

       Site    Step  Method   Group   Outcome
        a1      s1     m1      g1      3.6
        a1      s1     m4      g1      2.3
        a2      s2     m1      g2      14
        a3      s1     m3      g1      7
        a3      s3     m6      g1      1
        a4      s1     m1      g3      6.2

I am trying to compute the n, min, mean, sdev, and max of the site outcomes in each group, by step and method. I am using the sites as my unique identifiers. Not every site has every step, and not every step has every method, so there are missing values.

I have been playing with the Hmisc package and have been able to compute the n, mean, min, and max using fun=summary, but only for each method individually, and the result is displayed in a not-so-pretty matrix.

I know that the package uses LaTeX (I am a total novice with this), and I have used the file option in summary(....,file="data.tex"), which I think saves a .dvi file. I right-click on that file and tell it to convert to PDF, but the PDF looks broken, with data in the wrong place. I really don't know what I am doing wrong, so any feedback/input is greatly appreciated. Cheers.

Upvotes: 4

Views: 1899

Answers (2)

adibender

Reputation: 7578

The tabular function in the tables package was meant to create SAS-like tables. You can try something like this (dat being your example data):

library(tables)
(tab1 <- tabular(Step*Method*Heading()*Outcome*((n = 1) + min + max + mean + sd) ~ Group, 
        data = dat))

                  Group          
 Step Method      g1    g2   g3  
 s1   m1     n     1.0     0  1.0
             min   3.6   Inf  6.2
             max   3.6  -Inf  6.2
             mean  3.6   NaN  6.2
             sd     NA    NA   NA
      m3     n     1.0     0  0.0
             min   7.0   Inf  Inf
             max   7.0  -Inf -Inf
             mean  7.0   NaN  NaN
             sd     NA    NA   NA
             ...   ...   ...  ...

To process the table further, with LaTeX for example, latex(tab1) creates a nicely formatted LaTeX tabular.

NOTE: You can easily improve the formatting of the table like this:

tabular(Step*RowFactor(Method, levelnames = c("M1", "M2", "M3", "M4"), spacing = 1)*
                Heading()*Outcome*
                (Format()*(N= 1) + (Min = min) + (Max = max) + (Mean = mean) + 
                    (Sdev = sd)) ~ 
                Factor(Group, levelnames = c("G1", "G2", "G3")), 
        data = dat)

Applying this to all sites is also straightforward, using tabular(Site*Step*...)

Upvotes: 4

Anthony Damico

Reputation: 6124

if you don't care about the formatting (an assumption that might be incorrect), you could just use the aggregate function :)

# run any function, grouped by whatever variables you want..
aggregate( Outcome ~ Step + Method + Group , data = x , summary )

# the summary function doesn't include standard deviations,
# so run that separately
aggregate( Outcome ~ Step + Method + Group , data = x , sd )

assuming your data looks like this..

# read in your data
x <- read.table( header = TRUE , text = "Site    Step  Method   Group   Outcome
        a1      s1     m1      g1      3.6
        a1      s1     m4      g1      2.3
        a2      s2     m1      g2      14
        a3      s1     m3      g1      7
        a3      s3     m6      g1      1
        a4      s1     m1      g3      6.2")

if it's just performing a task by group, look at ?aggregate and ?tapply and in the future include groupwise in your search terms.
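as a rough sketch of the ?tapply route (base R only, re-reading the example data so it runs on its own), grouping by two variables at once gives a matrix with the groups across the columns, which is closer to the layout in the question. combinations that never occur in the data come back as NA. the name `step_by_group` is just an illustrative choice:

```r
# example data from the question
x <- read.table(header = TRUE, text = "Site Step Method Group Outcome
a1 s1 m1 g1 3.6
a1 s1 m4 g1 2.3
a2 s2 m1 g2 14
a3 s1 m3 g1 7
a3 s3 m6 g1 1
a4 s1 m1 g3 6.2")

# mean Outcome for every Step (rows) by Group (columns) combination;
# cells with no observations are filled with NA
step_by_group <- tapply(x$Outcome, list(Step = x$Step, Group = x$Group), mean)
print(step_by_group)
```

swapping `mean` for `min`, `max`, `sd` or `length` gives the other statistics one matrix at a time.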

if you want to run it all in one line, you can create a quick custom function that just lumps the output of summary together with the output of sd..

# alternatively, you can tack a standard deviation onto the summary function if you like..
# (naming the extra element gives its output column a label)
swsd <- function( x ) c( summary( x ) , Sdev = sd( x ) )

# ..and then run that through `aggregate` instead :)
aggregate( Outcome ~ Step + Method + Group , data = x , swsd )
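if only the five statistics from the question are wanted (N, Min, Max, Mean, Sdev), the same idea can be sketched with a small helper. `five_stats` and `res` are hypothetical names, not part of any package, and the data is re-read here so the block runs on its own. note that aggregate stores the statistics as a single matrix column, so the last step flattens it into ordinary columns:

```r
# example data from the question
x <- read.table(header = TRUE, text = "Site Step Method Group Outcome
a1 s1 m1 g1 3.6
a1 s1 m4 g1 2.3
a2 s2 m1 g2 14
a3 s1 m3 g1 7
a3 s3 m6 g1 1
a4 s1 m1 g3 6.2")

# hypothetical helper: the five statistics asked for, as a named vector
five_stats <- function(v) {
  c(N = length(v), Min = min(v), Max = max(v), Mean = mean(v), Sdev = sd(v))
}

# one row per Step/Method/Group combination that occurs in the data
res <- aggregate(Outcome ~ Step + Method + Group, data = x, FUN = five_stats)

# flatten the matrix column into separate columns (Outcome.N, Outcome.Min, ...)
res <- do.call(data.frame, res)
print(res)
```

with the six example rows every combination has a single observation, so the standard deviation is NA throughout; it only becomes meaningful on the full data.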

Upvotes: 3
