Larsq

Reputation: 365

How to log cross validation results as a table in azureml?

I recently started using azureml for automated experiments and logging in Azure Machine Learning Studio.

In an experiment, I'd like to store the results from a GridSearchCV in a table.

run.log_table(name='Gridsearch results', value=search.cv_results_)

According to the documentation, value should be a dictionary. In my case, it looks like the dictionary at the bottom of this question. However, I get the following error:

Value of type <class 'list'> is not supported, supported types include [[<class 'float'>, <class 'str'>, <class 'bool'>, <class 'NoneType'>, <class 'int'>]]

Even transforming it to a format similar to the one given in the documentation using

run.log_table(name='Gridsearch results', value=pd.DataFrame(search.cv_results_).to_dict(orient="list"))

yields the same error. Any ideas?

{'mean_fit_time': array([ 4.44100904,  0.01762947, 12.24124289,  0.01914111]),
 'std_fit_time': array([1.66466241e+00, 8.54067066e-04, 2.84891905e+00, 1.26775086e-03]),
 'mean_score_time': array([0.00462735, 0.00775236, 0.00512046, 0.00737476]),
 'std_score_time': array([0.00048182, 0.0004347 , 0.00092224, 0.00069597]),
 'param_C': masked_array(data=[1, 1, 10, 10], mask=[False, False, False, False], fill_value='?', dtype=object),
 'param_kernel': masked_array(data=['linear', 'rbf', 'linear', 'rbf'], mask=[False, False, False, False], fill_value='?', dtype=object),
 'params': [{'C': 1, 'kernel': 'linear'}, {'C': 1, 'kernel': 'rbf'}, {'C': 10, 'kernel': 'linear'}, {'C': 10, 'kernel': 'rbf'}],
 'split0_test_score': array([0.81111111, 0.54444444, 0.81111111, 0.62222222]),
 'split1_test_score': array([0.75555556, 0.61111111, 0.75555556, 0.64444444]),
 'split2_test_score': array([0.80898876, 0.75280899, 0.80898876, 0.7752809 ]),
 'split3_test_score': array([0.7752809 , 0.65168539, 0.7752809 , 0.69662921]),
 'split4_test_score': array([0.78651685, 0.69662921, 0.78651685, 0.76404494]),
 'split5_test_score': array([0.71910112, 0.68539326, 0.71910112, 0.70786517]),
 'split6_test_score': array([0.79775281, 0.74157303, 0.79775281, 0.7752809 ]),
 'split7_test_score': array([0.78651685, 0.6741573 , 0.78651685, 0.75280899]),
 'mean_test_score': array([0.780103  , 0.66972534, 0.780103  , 0.7173221 ]),
 'std_test_score': array([0.02858483, 0.06374784, 0.02858483, 0.0559392 ]),
 'rank_test_score': array([1, 4, 1, 3])}

Upvotes: 0

Views: 336

Answers (2)

Matt Najarian

Reputation: 181

You can log the results in JSON format using a simple log function. As far as I can tell, log_table expects a plain dictionary, so to log a pandas DataFrame you should first convert it to a dict with the 'list' orient:

run.log_table("calendar", df.to_dict('list'))

which yields a dictionary like below:

{
  'Name': ['My BD', 'Dr. Appointment', 'Cake'],
  'Time': ['10AM', '11AM', '3PM'],
  'Location': ['Home', 'Clinic', 'Bakery']
}

Upvotes: 0

nils.hahn

Reputation: 31

I assume the issue lies with the masked arrays 'param_C' and 'param_kernel'. I don't think azureml supports those, since log_table stores the data as a two-dimensional array so that it can be displayed properly.
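If that assumption holds, one possible workaround is to flatten every column into a list of plain Python scalars before logging. The following is only a minimal sketch: to_loggable_table is a hypothetical helper (not part of azureml), it has not been verified against azureml itself, and the set of accepted types is taken from the error message in the question. Values that are not scalars, such as the dicts in 'params', are simply stored as strings.

```python
# Scalar types that the error message in the question lists as supported.
SUPPORTED_SCALARS = (float, str, bool, int, type(None))


def to_loggable_table(cv_results):
    """Hypothetical helper: map each column to a list of plain Python scalars."""
    table = {}
    for name, column in cv_results.items():
        # NumPy arrays and masked arrays expose tolist(); plain lists are copied.
        values = column.tolist() if hasattr(column, "tolist") else list(column)
        cleaned = []
        for value in values:
            # NumPy scalars expose item(), which returns the built-in equivalent.
            if hasattr(value, "item"):
                value = value.item()
            # Anything still unsupported (e.g. the dicts in 'params') becomes text.
            if not isinstance(value, SUPPORTED_SCALARS):
                value = str(value)
            cleaned.append(value)
        table[name] = cleaned
    return table


# Small stand-in for search.cv_results_, using plain lists for illustration.
sample = {
    "mean_test_score": [0.78, 0.67],
    "param_C": [1, 10],
    "params": [{"C": 1}, {"C": 10}],
}
print(to_loggable_table(sample))
```

The result could then be passed on as run.log_table(name='Gridsearch results', value=to_loggable_table(search.cv_results_)). Note that masked entries in a masked array come out of tolist() as None, which the error message lists among the supported types.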

Upvotes: 2
