Rachel Cyr

Reputation: 449

AttributeError: 'str' object has no attribute 'decode' in Binary Logistic Regression

I am working on binary logistic regression with entirely categorical data. I have one-hot encoded the data and am trying to fit the model, but I get the error below and I have no idea how to deal with it. I understand that the last line of the traceback carries the key information, but I don't see where there could possibly be str values here.

[IN]: train_set, test_set = train_test_split(allyrs, test_size=0.2, random_state=42)

# Set up predictors; X is used for both the binary and the multiclass model
[IN]: X = train_set.iloc[:, 31:175]

# Set up binary y value
[IN]: y = train_set.iloc[:, 29]

# Set up multiclass y value
[IN]: ym = train_set.iloc[:, 30]

# First attempt to fit the model; it says:
[IN]:
from sklearn.linear_model import LogisticRegression
from sklearn.linear_model import LogisticRegressionCV

[IN]: BiLog_cv = LogisticRegressionCV(cv=3, random_state=0).fit(X, y)

[OUT]:

AttributeError                            Traceback (most recent call last)
<ipython-input-19-2468362218dc> in <module>
      1 # Binary Logistic Regresiion with Cross Validation - training set
      2     # Fit Model
----> 3 BiLog_cv = LogisticRegressionCV(cv=3, random_state=0).fit(X, y)

~\anaconda3\lib\site-packages\sklearn\linear_model\_logistic.py in fit(self, X, y, sample_weight)
   1883             prefer = 'processes'
   1884 
-> 1885         fold_coefs_ = Parallel(n_jobs=self.n_jobs, verbose=self.verbose,
   1886                                **_joblib_parallel_args(prefer=prefer))(
   1887             path_func(X, y, train, test, pos_class=label, Cs=self.Cs,

~\anaconda3\lib\site-packages\joblib\parallel.py in __call__(self, iterable)
   1039             # remaining jobs.
   1040             self._iterating = False
-> 1041             if self.dispatch_one_batch(iterator):
   1042                 self._iterating = self._original_iterator is not None
   1043 

~\anaconda3\lib\site-packages\joblib\parallel.py in dispatch_one_batch(self, iterator)
    857                 return False
    858             else:
--> 859                 self._dispatch(tasks)
    860                 return True
    861 

~\anaconda3\lib\site-packages\joblib\parallel.py in _dispatch(self, batch)
    775         with self._lock:
    776             job_idx = len(self._jobs)
--> 777             job = self._backend.apply_async(batch, callback=cb)
    778             # A job can complete so quickly than its callback is
    779             # called before we get here, causing self._jobs to

~\anaconda3\lib\site-packages\joblib\_parallel_backends.py in apply_async(self, func, callback)
    206     def apply_async(self, func, callback=None):
    207         """Schedule a func to be run"""
--> 208         result = ImmediateResult(func)
    209         if callback:
    210             callback(result)

~\anaconda3\lib\site-packages\joblib\_parallel_backends.py in __init__(self, batch)
    570         # Don't delay the application, to avoid keeping the input
    571         # arguments in memory
--> 572         self.results = batch()
    573 
    574     def get(self):

~\anaconda3\lib\site-packages\joblib\parallel.py in __call__(self)
    260         # change the default number of processes to -1
    261         with parallel_backend(self._backend, n_jobs=self._n_jobs):
--> 262             return [func(*args, **kwargs)
    263                     for func, args, kwargs in self.items]
    264 

~\anaconda3\lib\site-packages\joblib\parallel.py in <listcomp>(.0)
    260         # change the default number of processes to -1
    261         with parallel_backend(self._backend, n_jobs=self._n_jobs):
--> 262             return [func(*args, **kwargs)
    263                     for func, args, kwargs in self.items]
    264 

~\anaconda3\lib\site-packages\sklearn\linear_model\_logistic.py in _log_reg_scoring_path(X, y, train, test, pos_class, Cs, scoring, fit_intercept, max_iter, tol, class_weight, verbose, solver, penalty, dual, intercept_scaling, multi_class, random_state, max_squared_sum, sample_weight, l1_ratio)
    963         sample_weight = sample_weight[train]
    964 
--> 965     coefs, Cs, n_iter = _logistic_regression_path(
    966         X_train, y_train, Cs=Cs, l1_ratio=l1_ratio,
    967         fit_intercept=fit_intercept, solver=solver, max_iter=max_iter,

~\anaconda3\lib\site-packages\sklearn\linear_model\_logistic.py in _logistic_regression_path(X, y, pos_class, Cs, fit_intercept, max_iter, tol, verbose, solver, coef, class_weight, dual, penalty, intercept_scaling, multi_class, random_state, check_input, max_squared_sum, sample_weight, l1_ratio)
    760                 options={"iprint": iprint, "gtol": tol, "maxiter": max_iter}
    761             )
--> 762             n_iter_i = _check_optimize_result(
    763                 solver, opt_res, max_iter,
    764                 extra_warning_msg=_LOGISTIC_SOLVER_CONVERGENCE_MSG)

~\anaconda3\lib\site-packages\sklearn\utils\optimize.py in _check_optimize_result(solver, result, max_iter, extra_warning_msg)
    241                 "    https://scikit-learn.org/stable/modules/"
    242                 "preprocessing.html"
--> 243             ).format(solver, result.status, result.message.decode("latin1"))
    244             if extra_warning_msg is not None:
    245                 warning_msg += "\n" + extra_warning_msg

AttributeError: 'str' object has no attribute 'decode'

Can someone give me some insight, please? I just ran this dataset through Categorical NB and got it to work.

Thank you

Upvotes: 1

Views: 2912

Answers (2)

Alex Serra Marrugat

Reputation: 2042

In the most recent version of scikit-learn (0.24.1 at the time of writing), the problem has been fixed by wrapping the offending code in a try/except block. Gigioz explains this in more detail in this Stack Overflow question.

To upgrade scikit-learn, run:

pip install -U scikit-learn

Then restart the kernel.
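
To confirm which versions the kernel actually picks up after the restart, something like the following should work on Python 3.8+. It is only a sketch: installed_version is a hypothetical helper name, not part of any library, and scipy is listed on the assumption that the optimizer result whose message fails to decode comes from it.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Print the versions visible to the interpreter that runs this cell.
for name in ("scikit-learn", "scipy"):
    print(name, installed_version(name))
```

If the scikit-learn version printed here is still the old one, the upgrade probably went into a different environment than the one the notebook kernel uses.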

Upvotes: 5

Frank Yellin

Reputation: 11330

I know nothing about this package, but the error is raised on line 243 of sklearn\utils\optimize.py. The decode method is called on a bytes object to convert it into a string, but result.message appears to be a string already. Try just deleting .decode("latin1").
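
If you do patch that line, a somewhat safer variant than deleting the call outright is to decode only when the value really is bytes. A minimal sketch, assuming the message is either bytes or str; message_to_text is a hypothetical helper name, not something scikit-learn provides, and this is not necessarily how the library itself fixed it:

```python
def message_to_text(message, encoding="latin1"):
    """Return message as str, decoding only when it is a bytes object."""
    if isinstance(message, bytes):
        return message.decode(encoding)
    # Already a str (the case that raised the AttributeError), so pass it through.
    return message


# Both forms of the same message end up as an identical str.
as_bytes = message_to_text(b"optimizer message")
as_text = message_to_text("optimizer message")
print(as_bytes == as_text)
```

This keeps the code working whichever type the optimizer result carries, instead of trading one failure for the opposite one.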

Upvotes: 0
