Sawyer Gilbert

Reputation: 9

Trying to convert a regression program from Stata to Python

I currently have this do-file in Stata, which is a simple test for significance in a matched-pairs regression. I understand some basic Python, but given my limited knowledge I don't know whether something like this is possible in Python. This is for my uncle, who uses Python at his company. If anyone can point me to some resources or explain how to do this, please let me know.

*import delimited "data"

drop if missing(v1,v2,v3)

regress v3 v2

test v2

generate pvalue = r(p)

if pvalue > .01 {
display "notsig"
display pvalue
}

if pvalue <= .01 {
display "sig"
display pvalue
}

drop pvalue

Upvotes: 0

Views: 935

Answers (1)

Colton T

Reputation: 328

I would look into pandas (http://pandas.pydata.org/pandas-docs/stable/) and statsmodels (http://www.statsmodels.org/dev/index.html). pandas is good for reading data into DataFrames in Python, and you can then run statistical models on them with statsmodels. I am not well-versed in statsmodels, so you may have to look into the documentation yourself.

Here is an example that follows what you showed in your question:

import pandas as pd
import statsmodels.formula.api as smf

# Read the CSV and drop rows where v1, v2 or v3 is missing.
# dropna returns a new DataFrame, so assign the result back.
df = pd.read_csv("data.csv", sep=",")
df = df.dropna(subset=["v1", "v2", "v3"])

# Regress v3 on v2 (the formula API adds an intercept by default).
results = smf.ols(formula="v3 ~ v2", data=df).fit()

# Two-sided p-value for H0: the coefficient on v2 is 0.
pvalue = results.pvalues["v2"]

if pvalue > 0.01:
    print("notsig")
else:
    print("sig")
print(pvalue)

The p-value here is already two-sided, so it does not need to be doubled. results.t_test('v2=0') reports the same two-sided value, and it should match what Stata's test v2 shows after regress.
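
A quick way to confirm which p-value t_test reports is to compare it against the coefficient table and against a two-sided value computed by hand from the t statistic. The sketch below is only an illustration: the data is synthetic (a stand-in for data.csv, with column names taken from the question), and it assumes numpy and scipy are installed alongside pandas and statsmodels.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

# Synthetic stand-in for data.csv (made-up values, for illustration only).
rng = np.random.default_rng(0)
n = 50
v2 = rng.normal(size=n)
df = pd.DataFrame({
    "v1": rng.normal(size=n),
    "v2": v2,
    "v3": 0.5 * v2 + rng.normal(size=n),
})
df.loc[[3, 7], "v3"] = np.nan  # a couple of missing values to drop

# dropna returns a new DataFrame, so the result has to be assigned.
df = df.dropna(subset=["v1", "v2", "v3"])

results = smf.ols("v3 ~ v2", data=df).fit()

# p-value as reported by t_test, flattened to a plain float.
p_ttest = float(np.ravel(results.t_test("v2 = 0").pvalue)[0])

# p-value from the coefficient table.
p_table = float(results.pvalues["v2"])

# Two-sided p-value computed by hand from the t statistic.
t_stat = results.tvalues["v2"]
p_manual = float(2 * stats.t.sf(abs(t_stat), results.df_resid))

print(p_ttest, p_table, p_manual)
```

If the three printed numbers agree, the value from t_test is the two-sided p-value and can be compared with the 0.01 threshold directly.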

Upvotes: 1
