Reputation: 95
I have been trying to use the Hmisc package to produce output similar to the table below.
                      Group
Step  Method          G1     G2     G3    .......
s1    m1      N       45     26     17
              Min     2      2      3
              Max     7      6      4
              Mean    3.5    4.5    2.5
              Sdev    2.6    3.6    1
      m2      N
              Min
              Max
              Mean
              Sdev
s2    m1      N
              Min
              Max
              Mean
              Sdev
      m2      N
              Min
              Max
              Mean
              Sdev
My raw data looks like this:
Site Step Method Group Outcome
a1 s1 m1 g1 3.6
a1 s1 m4 g1 2.3
a2 s2 m1 g2 14
a3 s1 m3 g1 7
a3 s3 m6 g1 1
a4 s1 m1 g3 6.2
I am trying to compute the n, min, mean, sdev, and max for all the site outcomes in each group, by step and method. I am using the sites as my unique identifiers. Not every site has every step, and not every step has every method, so there are missing values. I have been playing with the Hmisc package, and have been able to compute the n, mean, min, and max using fun=summary, but I have only been able to do it for each method individually, and it is displayed in a not-so-pretty matrix. I know that the package uses LaTeX (I am a total novice with this), and I have used the file option, as in summary(....,file="data.tex"). I think it is meant to save a .dvi file, which I right-click on and tell it to convert to PDF, but the PDF is all broken-looking, with data in the wrong place. I really don't know what I am doing wrong, so any feedback/input is greatly appreciated.
Cheers.
Upvotes: 4
Views: 1899
Reputation: 7578
The tabular function in the tables package was meant to create SAS-like tables. You can try something like this (dat being your example data):
library(tables)
(tab1 <- tabular(Step*Method*Heading()*Outcome*((n = 1) + min + max + mean + sd) ~ Group,
data = dat))
                    Group
Step  Method        g1     g2     g3
s1    m1      n     1.0    0      1.0
              min   3.6    Inf    6.2
              max   3.6    -Inf   6.2
              mean  3.6    NaN    6.2
              sd    NA     NA     NA
      m3      n     1.0    0      0.0
              min   7.0    Inf    Inf
              max   7.0    -Inf   -Inf
              mean  7.0    NaN    NaN
              sd    NA     NA     NA
... ... ... ...
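The Inf, -Inf and NaN entries in the output above are what min, max and mean return in base R when a Step/Method/Group cell contains no observations (sd gives NA for fewer than two values). A minimal sketch of wrappers that return NA for empty cells instead; the names safe_min, safe_max and safe_mean are hypothetical helpers, not part of the tables package:

```r
# Hypothetical helpers (not from any package): return NA for an empty cell
# instead of Inf / -Inf / NaN.
safe_min  <- function(x) if (length(x) > 0) min(x)  else NA_real_
safe_max  <- function(x) if (length(x) > 0) max(x)  else NA_real_
safe_mean <- function(x) if (length(x) > 0) mean(x) else NA_real_

empty <- numeric(0)      # what an empty cell looks like to the summary function
safe_min(empty)          # NA rather than Inf
safe_max(empty)          # NA rather than -Inf
safe_mean(empty)         # NA rather than NaN
safe_mean(c(3.6, 6.2))   # non-empty input behaves like mean()
```

Assuming tabular() accepts user-defined functions in the same position as min, max and mean, these could be dropped into the formula, for example as (Min = safe_min); that usage is an assumption and has not been checked here.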
To process the table further, with LaTeX for example, latex(tab1) creates a nicely formatted LaTeX tabular.
NOTE: You can easily improve the formatting of the table like this:
tabular(Step*RowFactor(Method, levelnames = c("M1", "M2", "M3", "M4"), spacing = 1)*
Heading()*Outcome*
(Format()*(N= 1) + (Min = min) + (Max = max) + (Mean = mean) +
(Sdev = sd)) ~
Factor(Group, levelnames = c("G1", "G2", "G3")),
data = dat)
Applying this to all sites is also straightforward, using tabular(Site*Step*...)
Upvotes: 4
Reputation: 6124
i'm assuming you don't care about the formatting (which might be an incorrect assumption), so you could just use the aggregate function :)
# run any function, grouped by whatever variables you want..
aggregate( Outcome ~ Step + Method + Group , data = x , summary )
# the summary function doesn't include standard deviations,
# so run that separately
aggregate( Outcome ~ Step + Method + Group , data = x , sd )
assuming your data looks like this..
# read in your data
x <- read.table( h = T , text = "Site Step Method Group Outcome
a1 s1 m1 g1 3.6
a1 s1 m4 g1 2.3
a2 s2 m1 g2 14
a3 s1 m3 g1 7
a3 s3 m6 g1 1
a4 s1 m1 g3 6.2")
if it's just performing a task by group, look at ?aggregate and ?tapply, and in the future include groupwise in your search terms.
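for comparison, here is a minimal tapply sketch on the same six rows. it computes the mean Outcome for every Step x Group cell and puts the groups in columns, which is closer to the layout asked for; the data frame is rebuilt by hand here only so the snippet runs on its own, and step_by_group is just an illustrative name.

```r
# same six rows as the example data, typed in directly
x <- data.frame(
  Site    = c("a1", "a1", "a2", "a3", "a3", "a4"),
  Step    = c("s1", "s1", "s2", "s1", "s3", "s1"),
  Method  = c("m1", "m4", "m1", "m3", "m6", "m1"),
  Group   = c("g1", "g1", "g2", "g1", "g1", "g3"),
  Outcome = c(3.6, 2.3, 14, 7, 1, 6.2)
)

# rows are steps, columns are groups;
# Step/Group combinations with no data come back as NA
step_by_group <- tapply( x$Outcome , list( x$Step , x$Group ) , mean )
step_by_group
```

swap mean for sd, min, max or length to get the other statistics one at a time.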
if you want to run it all in one line, you can create a quick custom function that just lumps the output of summary together with the output of sd..
# alternatively, you can tack a standard deviation onto the summary function if you like..
# (naming the extra element SD gives the new column a label)
swsd <- function( x ) c( summary( x ) , SD = sd( x ) )
# ..and then run that through `aggregate` instead :)
aggregate( Outcome ~ Step + Method + Group , data = x , swsd )
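one thing to be aware of: when the function passed to aggregate returns several values, the result stores them as a single matrix column called Outcome rather than as separate columns. a minimal sketch of flattening it, assuming the same six rows as above (rebuilt by hand so the snippet is self-contained); this swsd variant labels the extra element SD, and res and flat are illustrative names.

```r
# same six rows as the example data, typed in directly
x <- data.frame(
  Site    = c("a1", "a1", "a2", "a3", "a3", "a4"),
  Step    = c("s1", "s1", "s2", "s1", "s3", "s1"),
  Method  = c("m1", "m4", "m1", "m3", "m6", "m1"),
  Group   = c("g1", "g1", "g2", "g1", "g1", "g3"),
  Outcome = c(3.6, 2.3, 14, 7, 1, 6.2)
)

# summary plus a labelled standard deviation
swsd <- function( v ) c( summary( v ) , SD = sd( v ) )

res <- aggregate( Outcome ~ Step + Method + Group , data = x , FUN = swsd )
ncol( res )    # Step, Method, Group and one matrix column

# split the matrix column into ordinary columns (Outcome.Mean, Outcome.SD, ...)
flat <- do.call( data.frame , res )
names( flat )
```

with this tiny sample every Step/Method/Group combination has a single observation, so the SD column is all NA; with the full data it fills in wherever a cell has two or more sites.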
Upvotes: 3