Reputation: 9
I currently have this do-file in Stata, which is a simple test for significance in a matched-pairs regression. I understand some basic Python, but given my limited knowledge I do not know whether something like this is possible in Python. I am doing this for my uncle, who uses Python at his company. If anyone can point me to some resources or explain how I would do this, please let me know.
*import delimited "data"
drop if missing(v1,v2,v3)
regress v3 v2
test v2
generate pvalue = r(p)
if pvalue > .01 {
    display "notsig"
    display pvalue
}
if pvalue <= .01 {
    display "sig"
    display pvalue
}
drop pvalue
Upvotes: 0
Views: 935
Reputation: 328
I would look into pandas (http://pandas.pydata.org/pandas-docs/stable/) and statsmodels (http://www.statsmodels.org/dev/index.html). pandas is good for reading data into DataFrames in Python, and you can then fit statistical models with statsmodels. I am not well-versed in statsmodels, so you may have to consult the documentation yourself.
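Here is an example that follows what you showed in your question:
import pandas as pd
import statsmodels.formula.api as smf

# Read the CSV and drop rows with a missing value in v1, v2, or v3
# (the equivalent of Stata's "drop if missing(v1,v2,v3)").
# dropna returns a new DataFrame, so the result has to be assigned.
df = pd.read_csv("data.csv")
df = df.dropna(subset=["v1", "v2", "v3"])

# Fit v3 on v2 by ordinary least squares.
results = smf.ols(formula="v3 ~ v2", data=df).fit()

# Two-sided p-value for the null hypothesis that the v2 coefficient is zero.
pvalue = results.pvalues["v2"]

if pvalue > 0.01:
    print("notsig")
else:
    print("sig")
print(pvalue)
I read the p-value from results.pvalues, which holds the two-sided p-value for each coefficient, so it does not need to be doubled. results.t_test('v2=0').pvalue reports the same two-sided value, and for a single coefficient this matches what Stata's test v2 stores in r(p). It is still worth checking the statsmodels documentation for your version.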
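If you want to try the approach without your real file, the steps can be wrapped in a small function and run on made-up data. This is only a rough sketch: the function name slope_significance, the alpha parameter, and the synthetic DataFrame are invented for illustration, and it assumes the columns are called v1, v2 and v3 as in the question.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf


def slope_significance(df, alpha=0.01):
    """Regress v3 on v2 and return (label, p-value) for the v2 slope.

    Hypothetical helper for illustration; not part of pandas or statsmodels.
    """
    # Keep only rows where v1, v2 and v3 are all present.
    clean = df.dropna(subset=["v1", "v2", "v3"])
    results = smf.ols("v3 ~ v2", data=clean).fit()
    # results.pvalues holds the two-sided p-value of each coefficient.
    pvalue = float(results.pvalues["v2"])
    label = "sig" if pvalue <= alpha else "notsig"
    return label, pvalue


# Made-up data: v3 depends strongly on v2, and one row has a missing v2.
rng = np.random.default_rng(0)
v2 = rng.normal(size=200)
demo = pd.DataFrame({
    "v1": rng.normal(size=200),
    "v2": v2,
    "v3": 2.0 * v2 + rng.normal(size=200),
})
demo.loc[0, "v2"] = np.nan

label, pvalue = slope_significance(demo)
print(label, pvalue)
```

To use it on your own data, replace demo with pd.read_csv("data.csv") and adjust alpha if you want a threshold other than 0.01.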
Upvotes: 1