Reputation: 180
My goal is to find the best predictor variable for winning a match. I have some knowledge of basic statistics, so I decided to use logistic regression, because the result of a match is a binary variable.
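import statsmodels.api as sm
# y: binary win indicator; X: predictor columns. sm.Logit does not add an intercept automatically.
logit_model = sm.Logit(y, X)
result = logit_model.fit()
print(result.summary())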
This produces the following result:
========================================================================
Model:              Logit            Pseudo R-squared: 0.515
Dependent Variable: win              AIC:              92784.8133
Date:               2022-11-25 20:30 BIC:              92932.3349
No. Observations:   137967           Log-Likelihood:   -46377.
Df Model:           14               LL-Null:          -95631.
Df Residuals:       137952           LLR p-value:      0.0000
Converged:          1.0000           Scale:            1.0000
No. Iterations:     7.0000
------------------------------------------------------------------------
                         Coef. Std.Err.         z  P>|z|  [0.025  0.975]
------------------------------------------------------------------------
kills                   0.2116   0.0050   42.7261 0.0000  0.2019  0.2213
assists                 0.2439   0.0025   98.3537 0.0000  0.2391  0.2488
deaths                 -0.4083   0.0039 -103.6498 0.0000 -0.4160 -0.4005
baronKills              0.7598   0.0338   22.4612 0.0000  0.6935  0.8261
dragonKills             0.3566   0.0157   22.6557 0.0000  0.3257  0.3874
timeCCingOthers        -0.0096   0.0006  -17.2654 0.0000 -0.0107 -0.0085
wardsPlaced             0.0051   0.0012    4.1346 0.0000  0.0027  0.0076
goldEarned             -0.0003   0.0000  -45.5422 0.0000 -0.0003 -0.0003
inhibitorTakedowns      2.1111   0.0212   99.5492 0.0000  2.0696  2.1527
largestKillingSpree    -0.0504   0.0070   -7.2300 0.0000 -0.0641 -0.0367
largestMultiKill        0.4043   0.0159   25.5014 0.0000  0.3732  0.4354
totalMinionsKilled      0.0043   0.0002   21.6630 0.0000  0.0039  0.0047
consumablesPurchased   -0.0395   0.0032  -12.3773 0.0000 -0.0458 -0.0333
damageDealtToBuildings  0.0002   0.0000   25.9200 0.0000  0.0001  0.0002
turretKills             0.3140   0.0131   23.8875 0.0000  0.2882  0.3397
========================================================================
What would be the best predictor of a match win given these results? My initial thinking was that I can't use the coefficients, because all the variables come from different distributions. Is it valid to use the z-score instead, since it standardizes values to the same distribution? Can the variables assists and inhibitorTakedowns be considered the best predictors of winning a match, since they have the highest z-scores, or is this thinking flawed?
Upvotes: 0
Views: 480
Reputation: 531
First, check whether each variable is significant at your chosen threshold. In your results, all of them are. The z-score is the coefficient divided by its standard error, so it tells you how confidently the coefficient differs from zero, not how large the effect is; with about 138,000 observations, even small effects get large z-scores. Second, look at the coefficients to see which variable has the most impact on the label: the larger the absolute value of the coefficient, the larger the effect, which may be positive or negative. The caveat is that you need to standardize all predictor variables before fitting the model. As you mentioned, they all come from different distributions, and standardizing puts them on a common scale so that their coefficients can be compared.
Upvotes: 1