Reputation: 516
As the title says, I am attempting to use Turi Create on an AWS SageMaker notebook instance with Python 3.6 (the conda_amazonei_mxnet_p36 environment). Even though CUDA 10.0 is installed by default, CUDA 8.0 also comes pre-installed and can be selected with the following commands in the notebook:
!sudo rm /usr/local/cuda
!sudo ln -s /usr/local/cuda-8.0 /usr/local/cuda
I have verified this installation with nvcc --version and also by building and running the deviceQuery sample:
$ cd /usr/local/cuda/samples/1_Utilities/deviceQuery
$ sudo make
$ ./deviceQuery
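If the symlink took effect, nvcc should report release 8.0 and the deviceQuery run should end with Result = PASS; a quick sanity check (exact strings may vary by toolkit build):
$ nvcc --version | grep release
$ ./deviceQuery | grep Result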
Next, in my notebook I install Turi Create and the correct version of mxnet for CUDA 8.0:
!pip install turicreate==5.4
!pip uninstall -y mxnet
!pip install mxnet-cu80==1.1.0
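For example, a minimal check that the right build is active (assuming the kernel has been restarted so the freshly installed wheel is loaded):
import mxnet as mx
print(mx.__version__)             # expect 1.1.0 for the mxnet-cu80 wheel
mx.nd.zeros((1,), ctx=mx.gpu(0))  # raises MXNetError if the CUDA 8.0 build can't reach a GPU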
Then, I prepare my images and attempt to create a model:
import turicreate as tc

tc.config.set_num_gpus(-1)  # use all available GPUs

# load the images and join them with their bounding-box annotations
# (annotations_ is an SFrame prepared earlier, not shown here)
images = tc.image_analysis.load_images('images', ignore_failure=True)
data = images.join(annotations_)

# 80/20 train/test split, then train the object detector
train_data, test_data = data.random_split(0.8)
model = tc.object_detector.create(train_data, max_iterations=50)
Running tc.object_detector.create then produces the following output:
Using 'image' as feature column
Using 'annotaion' as annotations column
Downloading https://docs-assets.developer.apple.com/turicreate/models/darknet.params
Download completed: /var/tmp/model_cache/darknet.params
Setting 'batch_size' to 32
Using GPUs to create model (Tesla K80, Tesla K80, Tesla K80, Tesla K80, Tesla K80, Tesla K80, Tesla K80, Tesla K80)
Using default 16 lambda workers.
To maximize the degree of parallelism, add the following code to the beginning of the program:
"turicreate.config.set_runtime_config('TURI_DEFAULT_NUM_PYLAMBDA_WORKERS', 32)"
Note that increasing the degree of parallelism also increases the memory footprint.
---------------------------------------------------------------------------
MXNetError Traceback (most recent call last)
_ctypes/callbacks.c in 'calling callback function'()
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/kvstore.py in updater_handle(key, lhs_handle, rhs_handle, _)
81 lhs = _ndarray_cls(NDArrayHandle(lhs_handle))
82 rhs = _ndarray_cls(NDArrayHandle(rhs_handle))
---> 83 updater(key, lhs, rhs)
84 return updater_handle
85
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/optimizer/optimizer.py in __call__(self, index, grad, weight)
1528 self.sync_state_context(self.states[index], weight.context)
1529 self.states_synced[index] = True
-> 1530 self.optimizer.update_multi_precision(index, weight, grad, self.states[index])
1531
1532 def sync_state_context(self, state, context):
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/optimizer/optimizer.py in update_multi_precision(self, index, weight, grad, state)
553 use_multi_precision = self.multi_precision and weight.dtype == numpy.float16
554 self._update_impl(index, weight, grad, state,
--> 555 multi_precision=use_multi_precision)
556
557 @register
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/optimizer/optimizer.py in _update_impl(self, index, weight, grad, state, multi_precision)
535 if state is not None:
536 sgd_mom_update(weight, grad, state, out=weight,
--> 537 lazy_update=self.lazy_update, lr=lr, wd=wd, **kwargs)
538 else:
539 sgd_update(weight, grad, out=weight, lazy_update=self.lazy_update,
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/ndarray/register.py in sgd_mom_update(weight, grad, mom, lr, momentum, wd, rescale_grad, clip_gradient, out, name, **kwargs)
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/_ctypes/ndarray.py in _imperative_invoke(handle, ndargs, keys, vals, out)
90 c_str_array(keys),
91 c_str_array([str(s) for s in vals]),
---> 92 ctypes.byref(out_stypes)))
93
94 if original_output is not None:
~/anaconda3/envs/amazonei_mxnet_p36/lib/python3.6/site-packages/mxnet/base.py in check_call(ret)
144 """
145 if ret != 0:
--> 146 raise MXNetError(py_str(_LIB.MXGetLastError()))
147
148
MXNetError: Cannot find argument 'lazy_update', Possible Arguments:
----------------
lr : float, required
Learning rate
momentum : float, optional, default=0
The decay rate of momentum estimates at each epoch.
wd : float, optional, default=0
Weight decay augments the objective function with a regularization term that penalizes large weights. The penalty scales with the square of the magnitude of each weight.
rescale_grad : float, optional, default=1
Rescale gradient to grad = rescale_grad*grad.
clip_gradient : float, optional, default=-1
Clip gradient to the range of [-clip_gradient, clip_gradient] If clip_gradient <= 0, gradient clipping is turned off. grad = max(min(grad, clip_gradient), -clip_gradient).
, in operator sgd_mom_update(name="", wd="0.0005", momentum="0.9", clip_gradient="0.025", rescale_grad="1.0", lr="0.001", lazy_update="True")
Interestingly, if I use CUDA 10.0 instead with Turi Create 5.6:
!pip install turicreate==5.6
!pip uninstall -y mxnet
!pip install mxnet-cu100==1.4.0.post0
the notebook still fails, but if I immediately uninstall turicreate and mxnet-cu100 and repeat the CUDA 8.0 steps above, it works without any issues (roughly the cell sequence sketched below).
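Roughly, the sequence that ends up working is (a sketch of the steps described above, not copied verbatim from my notebook):
!pip uninstall -y turicreate mxnet-cu100
!sudo rm /usr/local/cuda
!sudo ln -s /usr/local/cuda-8.0 /usr/local/cuda
!pip install turicreate==5.4
!pip install mxnet-cu80==1.1.0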
The last time it worked, I captured the environment with pip freeze > requirements.txt and then, after restarting the instance, ran pip install -r requirements.txt, but I still get the same error as above (unless I go through the CUDA 10.0 attempt first). What's going on here? Any advice is appreciated.
Upvotes: 0
Views: 418
Reputation: 1
Updating from mxnet 1.1.0 to 1.4.0 is the right fix; the error isn't related to the CUDA version but to the MXNet version itself.
The source code at https://github.com/apache/incubator-mxnet shows that mxnet 1.1.0 does not have the lazy_update parameter for the sgd_mom_update function. You can see this by comparing the sgd_mom_update call in the optimizer code at release tag 1.4.0 with the same code at release tag 1.1.0. The parameter was added in mxnet>=1.3.0, which is why your test succeeded on mxnet-cu100==1.4.0.post0.
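If you want to check which behavior a given install has without reading the source, the generated operator's docstring lists the arguments the backend accepts, so a quick (illustrative) probe is:
import mxnet as mx

print(mx.__version__)
# True on mxnet >= 1.3.0, where sgd_mom_update accepts lazy_update;
# False on 1.1.0, which is exactly the mismatch behind the MXNetError above
print('lazy_update' in mx.nd.sgd_mom_update.__doc__)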
Upvotes: 0