user3288051

Reputation: 614

How can I get the p values for each group comparison when applying Tukey's Honestly Significant Difference test?

By using the following code:

import statsmodels.stats.multicomp as multi
test = multi.MultiComparison(self.my_data[factor_var], self.my_data[grp_var])
res = test.tukeyhsd()
summary = res.summary()

I can get the result below:

[Image: summary table returned by res.summary()]

What I need are the p values for each comparison. How can I get them? I would appreciate any help.

Upvotes: 1

Views: 2829

Answers (2)

user3288051

Reputation: 614

It worked. Many thanks for your help.

Let me share my source code (adapted from this site):

import numpy as np
from statsmodels.stats.multicomp import (pairwise_tukeyhsd, MultiComparison)
from statsmodels.stats.libqsturng import psturng

dta2 = np.rec.array([
(  1,   'mental',  2 ),
(  2,   'mental',  2 ),
(  3,   'mental',  3 ),
(  4,   'mental',  4 ),
(  5,   'mental',  4 ),
(  6,   'mental',  5 ),
(  7,   'mental',  3 ),
(  8,   'mental',  4 ),
(  9,   'mental',  4 ),
( 10,   'mental',  4 ),
( 11, 'physical',  4 ),
( 12, 'physical',  4 ),
( 13, 'physical',  3 ),
( 14, 'physical',  5 ),
( 15, 'physical',  4 ),
( 16, 'physical',  1 ),
( 17, 'physical',  1 ),
( 18, 'physical',  2 ),
( 19, 'physical',  3 ),
( 20, 'physical',  3 ),
( 21,  'medical',  1 ),
( 22,  'medical',  2 ),
( 23,  'medical',  2 ),
( 24,  'medical',  2 ),
( 25,  'medical',  3 ),
( 26,  'medical',  2 ),
( 27,  'medical',  3 ),
( 28,  'medical',  1 ),
( 29,  'medical',  3 ),
( 30,  'medical',  1 )], dtype=[('idx', '<i4'),
                                ('Treatment', '<U8'),  # unicode so group names print as str, not bytes, on Python 3
                                ('StressReduction', '<i4')])

print("Using the pairwise_tukeyhsd Method")
print("----------------------------------------------------------")
res2 = pairwise_tukeyhsd(dta2['StressReduction'], dta2['Treatment'])
print("summary:", res2.summary())
print("mean diffs:", res2.meandiffs)
print("std pairs:", res2.std_pairs)
print("groups unique: ", res2.groupsunique)
print("df total:", res2.df_total)
p_values = psturng(np.abs(res2.meandiffs / res2.std_pairs), len(res2.groupsunique), res2.df_total)
print()
print("p values:", p_values)

print()

print("Using the MultiComparison Method")
print("----------------------------------------------------------")
test = MultiComparison(dta2['StressReduction'], dta2['Treatment'])
tukey_res = test.tukeyhsd()
summary = tukey_res.summary()
print("summary:", summary)
print("mean diffs:", tukey_res.meandiffs)
print("std pairs:", tukey_res.std_pairs)
print("groups unique: ", tukey_res.groupsunique)
print("df total:", tukey_res.df_total)
p_values = psturng(np.abs(tukey_res.meandiffs / tukey_res.std_pairs), len(tukey_res.groupsunique), tukey_res.df_total)
print()
print("p values:", p_values)

Upvotes: 0

user8513380

Reputation:

Possible duplicate: Click here for details

There is no direct function that returns the p values, but you can compute them like this:

psturng(np.abs(res.meandiffs / res.std_pairs), len(res.groupsunique), res.df_total)

where res is the result returned by pairwise_tukeyhsd, and psturng is a function from statsmodels.stats.libqsturng.
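
To see which p value belongs to which comparison, a small wrapper can help. This is only a sketch, not part of statsmodels: label_pairs and tukey_pvalues are hypothetical helper names, and it assumes the values come back in the same order as itertools.combinations over the sorted group names (that matches the row order of the summary table as far as I can tell). Newer statsmodels releases also seem to expose a pvalues attribute on the result, so the helper tries that first and falls back to the psturng formula.

```python
from itertools import combinations


def label_pairs(groups, values):
    """Pair each per-comparison value with the two group names it refers to.

    Assumes `values` is ordered like combinations(groups, 2),
    i.e. (g0, g1), (g0, g2), ..., (g1, g2), ...
    """
    pairs = list(combinations(groups, 2))
    values = list(values)
    if len(pairs) != len(values):
        raise ValueError("expected one value per pair of groups")
    return {pair: float(v) for pair, v in zip(pairs, values)}


def tukey_pvalues(res):
    """Hypothetical helper: p values keyed by group pair.

    `res` is the object returned by pairwise_tukeyhsd(...) or
    MultiComparison(...).tukeyhsd().
    """
    # Imported here so label_pairs stays usable without statsmodels.
    import numpy as np
    from statsmodels.stats.libqsturng import psturng

    # Assumption: newer statsmodels versions provide res.pvalues directly.
    p = getattr(res, "pvalues", None)
    if p is None:
        # Studentized range statistic for every pair of groups.
        q = np.abs(res.meandiffs / res.std_pairs)
        p = psturng(q, len(res.groupsunique), res.df_total)
    return label_pairs(res.groupsunique, np.atleast_1d(p))
```

Usage would be something like tukey_pvalues(pairwise_tukeyhsd(values, groups)), which returns a dict keyed by tuples of group names. As far as I know, psturng clips its result to roughly the range 0.001 to 0.9, so a value sitting exactly on one of those bounds should be read as "at most" or "at least" that number.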

Upvotes: 3
