jon

Reputation: 11366

Construct an ANOVA-like object

I am writing functions that should return an ANOVA table as their output.

I do not understand how to build an anova object from the following information:

# degrees of freedom
repdf = 1
trtdf = 22
totaldf = 23
# sums of squares
ssrep = 10.3
sstrt = 14567.2
sstotal = 14577.2

Is an anova object a data frame, a list, or some other special kind of object?

Edit: based on Ben's suggestion below, I tried:

Source <- c("replication", "Treatments", "Total") 
Df <- c(repdf, trtdf, totaldf)
"Sum Sq" <- c(ssrep, sstrt, sstotal)
anovadf <- data.frame(Source, Df, "Sum Sq")
class(anovadf) <- c("anova","data.frame")

This does not give me the structure that str() shows for a real anova object. Any further help would be appreciated.

> str(anovadf)
Classes ‘anova’ and 'data.frame':       3 obs. of  3 variables:
 $ Source   : Factor w/ 3 levels "Error","replication",..: 2 3 1
 $ Df       : num  1 22 23
 $ X.Sum.Sq.: Factor w/ 1 level "Sum Sq": 1 1 1

Upvotes: 2

Views: 293

Answers (1)

Ben Bolker

Reputation: 226172

Create an anova object, save it, then use str() on the results. From the lm.D9 object created by example("lm"):

> str(anova(lm.D9))
Classes ‘anova’ and 'data.frame':   2 obs. of  5 variables:
 $ Df     : int  1 18
 $ Sum Sq : num  0.688 8.729
 $ Mean Sq: num  0.688 0.485
 $ F value: num  1.42 NA
 $ Pr(>F) : num  0.249 NA
 - attr(*, "heading")= chr  "Analysis of Variance Table\n" "Response: weight"

So it's a special case of a data frame. Construct your data frame (call it a) to match the example, then try assigning the class: class(a) <- c("anova","data.frame").
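To see this for yourself, here is a minimal sketch that inspects a real anova table. It assumes the built-in PlantGrowth dataset (the lm.D9 example uses its own data); the names fit and tab are only for illustration.

```r
# Fit a one-way model on a built-in dataset and take its ANOVA table
fit <- lm(weight ~ group, data = PlantGrowth)
tab <- anova(fit)

class(tab)                   # both "anova" and "data.frame"
inherits(tab, "data.frame")  # TRUE: it can be used like any data frame
is.list(tab)                 # TRUE: every data frame is a list of columns
names(tab)                   # column names contain spaces, e.g. "Sum Sq"
tab[["Sum Sq"]]              # extract a column by its exact name
```

So the answer to "data frame or list?" is both: the table is a data frame (and therefore a list) with an extra "anova" class that controls how it is printed.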

In particular:

Df <- c(repdf, trtdf, totaldf)
ssq <- c(ssrep, sstrt, sstotal)

anovadf <- data.frame(Df, `Sum Sq`=ssq, `Mean Sq`=ssq/Df, check.names=FALSE)
rownames(anovadf) <- c("replication","treatments","total")
class(anovadf) <- c("anova","data.frame")

anovadf
            Df  Sum Sq Mean Sq
replication  1    10.3   10.30
treatments  22 14567.2  662.15
total       23 14577.2  633.79

You have to be a little bit careful with the column names: they have to be protected by backticks, and you have to use check.names=FALSE, because they are not legal variable names (they contain spaces). You could add the F statistic and P value to this; I didn't because I wasn't sure what the appropriate error term was.
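The effect of the backticks and check.names=FALSE can be checked directly. This is a small sketch using only two of the rows from the question; d1, d2 and d3 are illustrative names.

```r
# Backticks alone are not enough: data.frame() still repairs the name
d1 <- data.frame(Df = c(1, 22), `Sum Sq` = c(10.3, 14567.2))
names(d1)   # "Df" "Sum.Sq"

# With check.names = FALSE the name is kept exactly as written
d2 <- data.frame(Df = c(1, 22), `Sum Sq` = c(10.3, 14567.2),
                 check.names = FALSE)
names(d2)   # "Df" "Sum Sq"

# Quoting with "..." passes a character value, not a column name,
# which is what produced the X.Sum.Sq. column in the question
d3 <- data.frame(Df = c(1, 22), "Sum Sq")
names(d3)   # "Df" "X.Sum.Sq."
```

The third case explains the str() output in the question: the string "Sum Sq" was recycled as the column's contents, and the column name was generated from the expression itself.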

Upvotes: 4
