Calen

Reputation: 304

Power analysis for multiple regression using pwr and R

I want to determine the sample size necessary to detect an effect of an interaction term of two continuous variables (scaled) in a multiple regression with other covariates.

We have found an effect where previous smaller studies have failed. These effects are small, but a reviewer is asking us to say that previous studies were probably underpowered, and to provide some measure to support that.

I am using the pwr.f2.test() function in the pwr package, as follows:

pwr.f2.test(u = numerator df, v = denominator df, f2 = effect size, sig.level = 0.05, power = 0.8), where I set v to NULL so the function solves for the denominator degrees of freedom, from which I can get the sample size.
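A minimal sketch of that call (u = 1 for the single interaction term; the f2 value is a placeholder, since which estimate to use is exactly the question below):

library(pwr)

# Solve for v (denominator df): leave v = NULL and supply the other quantities.
# f2 = 0.0087 is a placeholder; choosing the right estimate is the question below.
res <- pwr.f2.test(u = 1, v = NULL, f2 = 0.0087, sig.level = 0.05, power = 0.8)

# With 8 predictor terms plus an intercept in the full model, n = v + 8 + 1
ceiling(res$v) + 8 + 1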

Here is my model output from summary():

                   Estimate Std. Error t value Pr(>|t|)    
(Intercept)        -21.2333    20.8127   -1.02  0.30800    
age                  0.0740     0.0776    0.95  0.34094    
wkdemand             1.6333     0.5903    2.77  0.00582 ** 
hoops                0.8662     0.6014    1.44  0.15028    
wtlift               5.2417     1.3912    3.77  0.00018 ***
height05             0.2205     0.0467    4.72  2.9e-06 ***
amtRS                0.1041     0.2776    0.37  0.70779    
allele1_numS        -0.0731     0.2779   -0.26  0.79262    
amtRS:allele1_numS   0.6267     0.2612    2.40  0.01670 *  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 7.17 on 666 degrees of freedom
Multiple R-squared:  0.0769,    Adjusted R-squared:  0.0658 
F-statistic: 6.94 on 8 and 666 DF,  p-value: 8.44e-09

And the model effect size estimates from the modelEffectSizes() function in the lmSupport package:

Coefficients
                         SSR df pEta-sqr dR-sqr
(Intercept)          53.5593  1   0.0016     NA
age                  46.7344  1   0.0014 0.0013
wkdemand            393.9119  1   0.0114 0.0106
hoops               106.7318  1   0.0031 0.0029
wtlift              730.5385  1   0.0209 0.0197
height05           1145.0394  1   0.0323 0.0308
amtRS                 7.2358  1   0.0002 0.0002
allele1_numS          3.5599  1   0.0001 0.0001
amtRS:allele1_numS  296.2219  1   0.0086 0.0080

Sum of squared errors (SSE): 34271.3
Sum of squared total  (SST): 37127.3

The question:

What value do I put in the f2 slot of pwr.f2.test()? I take it the numerator is going to be 1, and I should use the pEta-sqr from modelEffectSizes(), so in this case 0.0086?

Also, the estimated sample sizes I get are often much larger than our sample size of 675 - does this mean we were 'lucky' to have picked up a significant effect (i.e. we'd only detect it some fraction of the time, given the effect size)? Note that we have multiple measures of different things all pointing to the same finding, so I'm relatively satisfied there.

Upvotes: 4

Views: 8054

Answers (1)

Oka

Reputation: 1328

What value do I put in the f2 slot of pwr.f2.test()?

For each of the pwr functions, you enter three of the four quantities (effect size, sample size, significance level, power) and the fourth is calculated [1]. In pwr.f2.test(), u and v are the numerator and denominator degrees of freedom, and f2 is the effect size measure. In other words, the f2 slot takes an effect size estimate.
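To map those onto the model in the question (a small illustration, assuming 8 predictor terms plus an intercept, as in the summary output):

# u: df of the term being tested; 1 for the single interaction coefficient
# v: residual df of the full model; with 8 predictors and an intercept, v = n - 8 - 1
n <- 675
v <- n - 8 - 1   # 666, matching "on 666 degrees of freedom" in the summary output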

Is pEta-sqr the correct 'effect size' to use?

Now, there are many different effect size measures. pwr specifically uses Cohen's f², which is not the same as pEta-sqr, so I wouldn't recommend plugging pEta-sqr in there.

Which effect size measure could I use, then?

As @42- mentioned, you could use the delta-R² effect size, labeled “dR-sqr” in your output. You can obtain it via the variation of Cohen's f² for measuring local effect size described by Selya et al. (2012) [2], which uses the following equation:

f² = (R²_AB - R²_A) / (1 - R²_AB)

In the equation, B is the variable of interest, A is the set of all other variables, R²_AB is the proportion of variance accounted for by A and B together (relative to a model with no regressors), and R²_A is the proportion of variance accounted for by A alone (again relative to a model with no regressors). I would do as @42- suggested: build two models, one with the interaction and one without, and compute f² from their R² values as above.
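A sketch of that comparison in R, using the variable names from the output above; the outcome name y and data frame name dat are placeholders:

library(pwr)

# Full model: all covariates plus the interaction of interest
m_full    <- lm(y ~ age + wkdemand + hoops + wtlift + height05 +
                    amtRS * allele1_numS, data = dat)
# Reduced model: the same covariates, but without the interaction term
m_reduced <- lm(y ~ age + wkdemand + hoops + wtlift + height05 +
                    amtRS + allele1_numS, data = dat)

R2_AB <- summary(m_full)$r.squared      # variance explained with the interaction (A and B)
R2_A  <- summary(m_reduced)$r.squared   # variance explained without it (A only)

# Local effect size for the interaction, following Selya et al. (2012)
f2 <- (R2_AB - R2_A) / (1 - R2_AB)

# Required denominator df (and hence sample size) to detect it with 80% power
pwr.f2.test(u = 1, v = NULL, f2 = f2, sig.level = 0.05, power = 0.8)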

Importantly, as @42- correctly pointed out, if the reviewers are asking whether prior studies were underpowered, you need to use the sample sizes of those studies in the power calculation. If you use the parameters of your own study, then first of all you already know the answer (you did have sufficient power to detect the effect), and second, you are doing a post hoc power analysis, which is also not really sound.
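A sketch of that direction: fix v at the value implied by a previous study's sample size and solve for power instead. The sample size (n_prev = 150) and the f2 value are placeholders; the number of predictors is assumed to match the question's model.

library(pwr)

n_prev <- 150                # placeholder: sample size of the earlier study
v_prev <- n_prev - 8 - 1     # its denominator df, assuming the same 8-predictor model

# Power that study would have had for an effect of the assumed size
pwr.f2.test(u = 1, v = v_prev, f2 = 0.0087, sig.level = 0.05, power = NULL)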

  1. https://www.statmethods.net/stats/power.html
  2. Selya et al. (2012). A Practical Guide to Calculating Cohen's f², a Measure of Local Effect Size, from PROC MIXED. Front Psychol 3:111.

Upvotes: 2
