Reputation: 35
I want to estimate Cox models, but when I run the code I get an error. The problem seems to come from CoxPHFitter(). Has anyone here solved this problem? I think the lifelines library cannot compute the coefficients with the ML method. Below are the sample code and the error. Note that I wrote the code just as an example and the inputs are not real.
code
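from lifelines import CoxPHFitter

# One frame per competing event: duration, covariate, event indicator
df_l = df[['Observed', 'HighLTV', 'Liquidation']]
df_c = df[['Observed', 'HighLTV', 'Cure']]
cph_l = CoxPHFitter()
cph_c = CoxPHFitter()
cph_l.fit(df_l, 'Observed', event_col='Liquidation')
cph_c.fit(df_c, 'Observed', event_col='Cure')
# Coefficient of the single covariate (HighLTV), rounded to 3 decimals
beta_cure = float('{:.3f}'.format(cph_c.params_[0]))
beta_liquidation = float('{:.3f}'.format(cph_l.params_[0]))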
error
LinAlgError Traceback (most recent call last)
~\anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py in _newton_rhapson_for_efron_model(self, X, T, E, weights, entries, initial_point, step_size, precision, show_progress, max_steps)
1497 try:
-> 1498 inv_h_dot_g_T = spsolve(-h, g, assume_a="pos", check_finite=False)
1499 except (ValueError, LinAlgError) as e:
~\anaconda3\lib\site-packages\scipy\linalg\basic.py in solve(a, b, sym_pos, lower, overwrite_a, overwrite_b, debug, check_finite, assume_a, transposed)
247 overwrite_b=overwrite_b)
--> 248 _solve_check(n, info)
249 rcond, info = pocon(lu, anorm)
~\anaconda3\lib\site-packages\scipy\linalg\basic.py in _solve_check(n, info, lamch, rcond)
28 elif 0 < info:
---> 29 raise LinAlgError('Matrix is singular.')
30
LinAlgError: Matrix is singular.
During handling of the above exception, another exception occurred:
ConvergenceError Traceback (most recent call last)
<ipython-input-145-7cb92b8db8fe> in <module>
8 k.append(list(map(lambda x: random.choice(o),range(10))))
9 s=pd.DataFrame(k[i],columns=df.columns)
---> 10 c.append(CCR(s))
<ipython-input-144-da506c585def> in CCR(data)
30 cph_c=CoxPHFitter()
31 cph_l.fit(df_l,'Observed',event_col='Liquidation')
---> 32 cph_c.fit(df_c,'Observed',event_col='Cure')
33 beta_cure=float('{:.3f}'.format((cph_c.params_[0])))
34 beta_liquidation=float('{:.3f}'.format((cph_l.params_[0])))
~\anaconda3\lib\site-packages\lifelines\utils\__init__.py in f(model, *args, **kwargs)
52 def f(model, *args, **kwargs):
53 cls.set_censoring_type(model, cls.RIGHT)
---> 54 return function(model, *args, **kwargs)
55
56 return f
~\anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py in fit(self, df, duration_col, event_col, show_progress, initial_point, strata, step_size, weights_col, cluster_col, robust, batch_mode, timeline, formula, entry_col)
274 """
275 self.strata = utils.coalesce(strata, self.strata)
--> 276 self._model = self._fit_model(
277 df,
278 duration_col,
~\anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py in _fit_model(self, *args, **kwargs)
595 def _fit_model(self, *args, **kwargs):
596 if self.baseline_estimation_method == "breslow":
--> 597 return self._fit_model_breslow(*args, **kwargs)
598 elif self.baseline_estimation_method == "spline":
599 return self._fit_model_spline(*args, **kwargs)
~\anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py in _fit_model_breslow(self, *args, **kwargs)
608 )
609 if utils.CensoringType.is_right_censoring(self):
--> 610 model.fit(*args, **kwargs)
611 return model
612 else:
~\anaconda3\lib\site-packages\lifelines\utils\__init__.py in f(model, *args, **kwargs)
52 def f(model, *args, **kwargs):
53 cls.set_censoring_type(model, cls.RIGHT)
---> 54 return function(model, *args, **kwargs)
55
56 return f
~\anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py in fit(self, df, duration_col, event_col, show_progress, initial_point, strata, step_size, weights_col, cluster_col, robust, batch_mode, timeline, formula, entry_col)
1225 )
1226
-> 1227 params_, ll_, variance_matrix_, baseline_hazard_, baseline_cumulative_hazard_, model = self._fit_model(
1228 X_norm,
1229 T,
~\anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py in _fit_model(self, X, T, E, weights, entries, initial_point, step_size, show_progress)
1353 show_progress: bool = True,
1354 ):
-> 1355 beta_, ll_, hessian_ = self._newton_rhapson_for_efron_model(
1356 X, T, E, weights, entries, initial_point=initial_point, step_size=step_size, show_progress=show_progress
1357 )
~\anaconda3\lib\site-packages\lifelines\fitters\coxph_fitter.py in _newton_rhapson_for_efron_model(self, X, T, E, weights, entries, initial_point, step_size, precision, show_progress, max_steps)
1505 )
1506 elif isinstance(e, LinAlgError):
-> 1507 raise exceptions.ConvergenceError(
1508 """Convergence halted due to matrix inversion problems. Suspicion is high collinearity. {0}""".format(
1509 CONVERGENCE_DOCS
ConvergenceError: Convergence halted due to matrix inversion problems. Suspicion is high collinearity. Please see the following tips in the lifelines documentation: https://lifelines.readthedocs.io/en/latest/Examples.html#problems-with-convergence-in-the-cox-proportional-hazard-modelMatrix is singular.
Upvotes: 0
Views: 639
Reputation: 78
I am not familiar with the package you are using here, but I notice assume_a="pos" is included in the spsolve call, hinting to the solver that the matrix is positive definite. In my own work, interacting with the scipy solvers directly, I've noticed that the matrix soundness checks can be stricter when this option is set, and that omitting this option (implicitly setting assume_a="gen") may yield a usable solution, even for positive definite matrices with undesirably high collinearity.
Upvotes: 0
Reputation: 71
The error message clearly states the problem:
ConvergenceError: Convergence halted due to matrix inversion problems. Suspicion is high collinearity. Please see the following tips in the lifelines documentation: https://lifelines.readthedocs.io/en/latest/Examples.html#problems-with-convergence-in-the-cox-proportional-hazard-modelMatrix is singular.
Without the real data I can't give any further advice. But the lifelines documentation gives a lot of advice on this issue:
Convergence halted due to matrix inversion problems: This means that there is high collinearity in your dataset. That is, a column is equal to the linear combination of 1 or more other columns. A common cause of this error is dummying categorical variables but not dropping a column, or some hierarchical structure in your dataset. Try to find the relationship by:
- adding a penalizer to the model, ex: CoxPHFitter(penalizer=0.1).fit(…) until the model converges. In the print_summary(), the coefficients that have high collinearity will have large (absolute) magnitude in the coefs column.
- using the variance inflation factor (VIF) to find redundant variables.
- looking at the correlation matrix of your dataset, or
This is very likely not an error caused by lifelines. Instead, the cause is your data or how you apply the model to your data.
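As a rough illustration of the checks described above, here is a minimal sketch that inspects a frame before it is passed to CoxPHFitter. The helper name diagnose_cox_input, the corr_threshold parameter, and the toy data are all made up for this example and are not part of lifelines. It assumes the failure comes from a degenerate sample: the traceback shows the model being fitted on resamples of only 10 rows, where a covariate such as HighLTV can easily end up constant, and that alone would make the Hessian singular. Whether that is what happens with the real data is an assumption.

```python
import pandas as pd


def diagnose_cox_input(df, duration_col, event_col, corr_threshold=0.95):
    """Return a list of problems that commonly make the Cox Hessian singular.

    Hypothetical helper for illustration; not part of lifelines.
    """
    problems = []
    covariates = df.drop(columns=[duration_col, event_col])

    # No observed events: the partial likelihood carries no information.
    if df[event_col].sum() == 0:
        problems.append(f"no events in '{event_col}'")

    # A covariate with a single unique value has zero variance.
    for col in covariates.columns:
        if covariates[col].nunique() <= 1:
            problems.append(f"'{col}' is constant")

    # Pairs of covariates that are (almost) perfectly correlated.
    corr = covariates.corr().abs()
    cols = list(corr.columns)
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            if corr.loc[a, b] >= corr_threshold:
                problems.append(f"'{a}' and '{b}' are highly correlated")
    return problems


# Toy resample of 10 rows (invented data) where HighLTV is all ones.
sample = pd.DataFrame({
    "Observed": [3, 5, 2, 8, 4, 6, 7, 1, 9, 5],
    "HighLTV": [1] * 10,
    "Cure": [0, 1, 0, 0, 1, 0, 0, 0, 1, 0],
})
print(diagnose_cox_input(sample, "Observed", "Cure"))
```

If the returned list is non-empty for a given resample, one option is to skip or redraw that resample rather than fit on it. A penalizer, as the documentation suggests, is another option when the columns are only nearly collinear.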
Upvotes: 1