Scipy Stats ttest_1samp Hypothesis Testing For Comparing Previous Performance To Sample

Question

The Problem I'm Trying To Solve

I have 11 months' worth of performance data:

        Month  Branded  Non-Branded  Shopping  Grand Total
0    2/1/2015     1330          334       161         1825
1    3/1/2015     1344          293       197         1834
2    4/1/2015      899          181       190         1270
3    5/1/2015      939          208       154         1301
4    6/1/2015     1119          238       179         1536
5    7/1/2015      859          238       170         1267
6    8/1/2015      996          340       183         1519
7    9/1/2015     1138          381       172         1691
8   10/1/2015     1093          395       176         1664
9   11/1/2015     1491          426       199         2116
10  12/1/2015     1539          530       156         2225

Let's say it's February 1, 2016 and I'm asking, "Are the results in January statistically different from the past 11 months?"

       Month  Branded  Non-Branded  Shopping  Grand Total
11  1/1/2016     1064          408       106         1578

I came across a blog...

I came across iaingallagher's blog. I will reproduce the relevant part here (in case the blog goes down).

1-sample t-test

The 1-sample t-test is used when we want to compare a sample mean to a population mean (which we already know). The average British man is 175.3 cm tall. A survey recorded the heights of 10 UK men and we want to know whether the mean of the sample is different from the population mean.

# 1-sample t-test
from scipy import stats
one_sample_data = [177.3, 182.7, 169.6, 176.3, 180.3, 179.4, 178.5, 177.2, 181.8, 176.5]

one_sample = stats.ttest_1samp(one_sample_data, 175.3)

# print() call so the snippet also runs on Python 3; tuple() unpacks (statistic, p-value)
print("The t-statistic is %.3f and the p-value is %.3f." % tuple(one_sample))

Result:

The t-statistic is 2.296 and the p-value is 0.047.
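
To make clear what ttest_1samp is measuring, here is a minimal sketch that recomputes the t-statistic by hand using only the standard library. It assumes the usual one-sample formula, t = (sample mean - population mean) / (s / sqrt(n)), with s the sample standard deviation; the variable names are mine, not the blog's.

```python
import math
import statistics

one_sample_data = [177.3, 182.7, 169.6, 176.3, 180.3, 179.4, 178.5, 177.2, 181.8, 176.5]
popmean = 175.3  # known population mean from the blog example

n = len(one_sample_data)
sample_mean = statistics.mean(one_sample_data)
sample_sd = statistics.stdev(one_sample_data)  # sample standard deviation (n - 1 denominator)
standard_error = sample_sd / math.sqrt(n)      # standard error of the sample mean

# One-sample t-statistic: distance between the sample mean and popmean, in standard errors
t_manual = (sample_mean - popmean) / standard_error

print("t computed by hand: %.3f (degrees of freedom: %d)" % (t_manual, n - 1))
```

This reproduces the 2.296 reported above. Note that popmean enters only as a fixed constant; the test attaches no uncertainty to it.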

Finally, to my question...

In iaingallagher's example, he knows the population mean and compares a sample (one_sample_data) against it. In MY example, I want to see whether 1/1/2016 is statistically different from the previous 11 months. So in my case the previous 11 months are an array (instead of a single population mean value) and my sample is one data point (instead of an array)... so it's kind of backwards.

QUESTION

If I focus on the Shopping column data:

Will scipy.stats.ttest_1samp([161,197,190,154,179,170,183,172,176,199,156], 106) produce a valid result, even though my sample (the first parameter) is a list of previous results and I'm comparing it to a popmean that is not a population mean but a single new observation?

If this is not the correct stats function, any recommendation on what to use for this hypothesis test situation?
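
For concreteness, here is a rough sketch of the two quantities in play for the Shopping column, using only the standard library. The first is the statistic behind the ttest_1samp call above, which treats 106 as a fixed constant. The second is an assumption on my part and not something from the blog: a prediction-style statistic that treats January as one more draw from the same distribution as the 11 earlier months, assuming roughly normal, independent monthly values. All variable names are hypothetical.

```python
import math
import statistics

shopping_history = [161, 197, 190, 154, 179, 170, 183, 172, 176, 199, 156]
january = 106  # the single new observation (1/1/2016)

n = len(shopping_history)
hist_mean = statistics.mean(shopping_history)
hist_sd = statistics.stdev(shopping_history)  # sample standard deviation (n - 1 denominator)

# Statistic behind ttest_1samp(shopping_history, january):
# asks whether the MEAN of the 11 months differs from the constant 106.
t_mean_vs_constant = (hist_mean - january) / (hist_sd / math.sqrt(n))

# Alternative (assumption, not from the post): treat January as a new observation.
# Its distance from the historical mean has standard error s * sqrt(1 + 1/n),
# which accounts for the variability of a single month as well as of the mean.
t_single_observation = (january - hist_mean) / (hist_sd * math.sqrt(1 + 1 / n))

print("mean-vs-constant t: %.2f" % t_mean_vs_constant)
print("single-observation t: %.2f" % t_single_observation)
print("degrees of freedom for both: %d" % (n - 1))
```

The first statistic is much larger in magnitude because it only accounts for uncertainty in the 11-month mean, not for the month-to-month spread of a single value. If SciPy is available, a two-sided p-value for either statistic can be obtained with scipy.stats.t.sf(abs(t), n - 1) * 2.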
