Reputation: 136339
I am trying to use the GPU with Theano. I've read this tutorial.
However, I can't get Theano to use the GPU and I don't know how to continue.
$ cat /etc/issue
Welcome to openSUSE 12.1 "Asparagus" - Kernel \r (\l).
$ nvidia-smi -L
GPU 0: Tesla C2075 (S/N: 0324111084577)
$ echo $LD_LIBRARY_PATH
/usr/local/cuda-5.0/lib64:[other]:/usr/local/lib:/usr/lib:/usr/local/X11/lib:[other]
$ find /usr/local/ -name cuda_runtime.h
/usr/local/cuda-5.0/include/cuda_runtime.h
$ echo $C_INCLUDE_PATH
/usr/local/cuda-5.0/include/
$ echo $CXX_INCLUDE_PATH
/usr/local/cuda-5.0/include/
$ nvidia-smi -a
NVIDIA: could not open the device file /dev/nvidiactl (Permission denied).
Failed to initialize NVML: Insufficient Permissions
$ echo $PATH
/usr/lib64/mpi/gcc/openmpi/bin:/home/mthoma/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/usr/lib/mit/bin:.:/home/mthoma/bin
$ ls -l /dev/nv*
crw-rw---- 1 root video 195, 0 1. Jul 09:47 /dev/nvidia0
crw-rw---- 1 root video 195, 255 1. Jul 09:47 /dev/nvidiactl
crw-r----- 1 root kmem 10, 144 1. Jul 09:46 /dev/nvram
# nvidia-smi -a
==============NVSMI LOG==============
Timestamp : Wed Jul 30 05:13:52 2014
Driver Version : 304.33
Attached GPUs : 1
GPU 0000:04:00.0
Product Name : Tesla C2075
Display Mode : Enabled
Persistence Mode : Disabled
Driver Model
Current : N/A
Pending : N/A
Serial Number : 0324111084577
GPU UUID : GPU-7ea505ef-ad46-bb24-c440-69da9b300040
VBIOS Version : 70.10.46.00.05
Inforom Version
Image Version : N/A
OEM Object : 1.1
ECC Object : 2.0
Power Management Object : 4.0
PCI
Bus : 0x04
Device : 0x00
Domain : 0x0000
Device Id : 0x109610DE
Bus Id : 0000:04:00.0
Sub System Id : 0x091010DE
GPU Link Info
PCIe Generation
Max : 2
Current : 1
Link Width
Max : 16x
Current : 16x
Fan Speed : 30 %
Performance State : P12
Clocks Throttle Reasons : N/A
Memory Usage
Total : 5375 MB
Used : 39 MB
Free : 5336 MB
Compute Mode : Default
Utilization
Gpu : 0 %
Memory : 5 %
Ecc Mode
Current : Enabled
Pending : Enabled
ECC Errors
Volatile
Single Bit
Device Memory : 0
Register File : 0
L1 Cache : 0
L2 Cache : 0
Texture Memory : N/A
Total : 0
Double Bit
Device Memory : 0
Register File : 0
L1 Cache : 0
L2 Cache : 0
Texture Memory : N/A
Total : 0
Aggregate
Single Bit
Device Memory : 133276
Register File : 0
L1 Cache : 0
L2 Cache : 0
Texture Memory : N/A
Total : 133276
Double Bit
Device Memory : 203730
Register File : 0
L1 Cache : 0
L2 Cache : 0
Texture Memory : N/A
Total : 203730
Temperature
Gpu : 58 C
Power Readings
Power Management : Supported
Power Draw : 33.83 W
Power Limit : 225.00 W
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 50 MHz
SM : 101 MHz
Memory : 135 MHz
Applications Clocks
Graphics : N/A
Memory : N/A
Max Clocks
Graphics : 573 MHz
SM : 1147 MHz
Memory : 1566 MHz
Compute Processes : None
Compiling and executing worked as a superuser (tested with cuda/C/0_Simple/simpleMultiGPU):
# ldconfig /usr/local/cuda-5.0/lib64/
# ./simpleMultiGPU
[simpleMultiGPU] starting...
CUDA-capable device count: 1
Generating input data...
Computing with 1 GPUs...
GPU Processing time: 27.814000 (ms)
Computing with Host CPU...
Comparing GPU and Host CPU results...
GPU sum: 16777296.000000
CPU sum: 16777294.395033
Relative difference: 9.566307E-08
[simpleMultiGPU] test results...
PASSED
> exiting in 3 seconds: 3...2...1...done!
When I try this as a normal user, I get:
$ ./simpleMultiGPU
[simpleMultiGPU] starting...
CUDA error at simpleMultiGPU.cu:87 code=38(cudaErrorNoDevice) "cudaGetDeviceCount(&GPU_N)"
CUDA-capable device count: 0
Generating input data...
Floating point exception
How can I get CUDA to work for non-superusers?
The following code is from "Testing Theano with GPU":
#!/usr/bin/env python
from theano import function, config, shared, sandbox
import theano.tensor as T
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print f.maker.fgraph.toposort()
t0 = time.time()
for i in xrange(iters):
    r = f()
t1 = time.time()
print 'Looping %d times took' % iters, t1 - t0, 'seconds'
print 'Result is', r
if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
    print 'Used the cpu'
else:
    print 'Used the gpu'
The complete error message is much too long to post here. A longer version is at http://pastebin.com/eT9vbk7M, but I think the relevant part is:
cc1plus: fatal error: cuda_runtime.h: No such file or directory
compilation terminated.
ERROR (theano.sandbox.cuda): Failed to compile cuda_ndarray.cu: ('nvcc return status', 1, 'for cmd', 'nvcc -shared -g -O3 -m64 -Xcompiler -DCUDA_NDARRAY_CUH=bcb411d72e41f81f3deabfc6926d9728,-D NPY_ARRAY_ENSURECOPY=NPY_ENSURECOPY,-D NPY_ARRAY_ALIGNED=NPY_ALIGNED,-D NPY_ARRAY_WRITEABLE=NPY_WRITEABLE,-D NPY_ARRAY_UPDATE_ALL=NPY_UPDATE_ALL,-D NPY_ARRAY_C_CONTIGUOUS=NPY_C_CONTIGUOUS,-D NPY_ARRAY_F_CONTIGUOUS=NPY_F_CONTIGUOUS,-fPIC -Xlinker -rpath,/home/mthoma/.theano/compiledir_Linux-3.1.10-1.16-desktop-x86_64-with-SuSE-12.1-x86_64-x86_64-2.7.2/cuda_ndarray -Xlinker -rpath,/usr/local/cuda-5.0/lib -Xlinker -rpath,/usr/local/cuda-5.0/lib64 -I/usr/local/lib/python2.7/site-packages/Theano-0.6.0rc1-py2.7.egg/theano/sandbox/cuda -I/usr/local/lib/python2.7/site-packages/numpy-1.6.2-py2.7-linux-x86_64.egg/numpy/core/include -I/usr/include/python2.7 -o /home/mthoma/.theano/compiledir_Linux-3.1.10-1.16-desktop-x86_64-with-SuSE-12.1-x86_64-x86_64-2.7.2/cuda_ndarray/cuda_ndarray.so mod.cu -L/usr/local/cuda-5.0/lib -L/usr/local/cuda-5.0/lib64 -L/usr/lib64 -lpython2.7 -lcublas -lcudart')
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not available
The standard output gives:
['nvcc', '-shared', '-g', '-O3', '-m64', '-Xcompiler', '-DCUDA_NDARRAY_CUH=bcb411d72e41f81f3deabfc6926d9728,-D NPY_ARRAY_ENSURECOPY=NPY_ENSURECOPY,-D NPY_ARRAY_ALIGNED=NPY_ALIGNED,-D NPY_ARRAY_WRITEABLE=NPY_WRITEABLE,-D NPY_ARRAY_UPDATE_ALL=NPY_UPDATE_ALL,-D NPY_ARRAY_C_CONTIGUOUS=NPY_C_CONTIGUOUS,-D NPY_ARRAY_F_CONTIGUOUS=NPY_F_CONTIGUOUS,-fPIC', '-Xlinker', '-rpath,/home/mthoma/.theano/compiledir_Linux-3.1.10-1.16-desktop-x86_64-with-SuSE-12.1-x86_64-x86_64-2.7.2/cuda_ndarray', '-Xlinker', '-rpath,/usr/local/cuda-5.0/lib', '-Xlinker', '-rpath,/usr/local/cuda-5.0/lib64', '-I/usr/local/lib/python2.7/site-packages/Theano-0.6.0rc1-py2.7.egg/theano/sandbox/cuda', '-I/usr/local/lib/python2.7/site-packages/numpy-1.6.2-py2.7-linux-x86_64.egg/numpy/core/include', '-I/usr/include/python2.7', '-o', '/home/mthoma/.theano/compiledir_Linux-3.1.10-1.16-desktop-x86_64-with-SuSE-12.1-x86_64-x86_64-2.7.2/cuda_ndarray/cuda_ndarray.so', 'mod.cu', '-L/usr/local/cuda-5.0/lib', '-L/usr/local/cuda-5.0/lib64', '-L/usr/lib64', '-lpython2.7', '-lcublas', '-lcudart']
[Elemwise{exp,no_inplace}(<TensorType(float32, vector)>)]
Looping 1000 times took 3.25972604752 seconds
Result is [ 1.23178029 1.61879337 1.52278066 ..., 2.20771813 2.29967761
1.62323284]
Used the cpu
$ cat .theanorc
[global]
device = gpu
floatX = float32
[cuda]
root = /usr/local/cuda-5.0
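For reference, the same settings can also be passed through the THEANO_FLAGS environment variable, which overrides .theanorc, to rule out a config-file problem (a minimal sketch; check_gpu.py is just a placeholder name for the test script above):
# force_device=True makes Theano raise an error instead of silently falling back to the CPU
THEANO_FLAGS='device=gpu,floatX=float32,force_device=True' python check_gpu.py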
Upvotes: 3
Views: 8603
Reputation: 5071
As some comments pointed out, the problem is the permissions of /dev/nvidia*. This means the devices are not initialized correctly during startup. Normally this is done when the GUI starts, so my guess is that you didn't install or enable one and are running a headless server.
To fix this, just run nvidia-smi as root. It will detect that the devices aren't initialized correctly and fix them. root has the permission to do this; a normal user does not. That is why it works as root (the problem gets fixed automatically) but not as a normal user.
This fix needs to be done each time the computer boots. To automate it, you can create, as root, the file /etc/init.d/nvidia-gpu-config with this content:
#!/bin/sh
#
# nvidia-gpu-config   Start the correct initialization of the nvidia GPU driver.
#
# chkconfig: - 90 90
# description: Init the GPU to the wanted state
# sudo /sbin/chkconfig --add nvidia-gpu-config
#

case $1 in
'start')
    nvidia-smi
    ;;
esac
Then, as root, run this command: /sbin/chkconfig --add nvidia-gpu-config
UPDATE: This works for operating systems that use the SysV init system. If your system uses systemd, I don't know whether it works.
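If it does not, a rough sketch of an equivalent approach with systemd (untested here; the unit name and the /usr/bin/nvidia-smi path are assumptions) would be a oneshot unit that runs nvidia-smi once at boot:
# run as root: create a oneshot service that initializes the GPU device files at boot
cat > /etc/systemd/system/nvidia-gpu-config.service <<'EOF'
[Unit]
Description=Initialize NVIDIA GPU device files

[Service]
Type=oneshot
ExecStart=/usr/bin/nvidia-smi

[Install]
WantedBy=multi-user.target
EOF

systemctl enable nvidia-gpu-config.service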
Upvotes: 4
Reputation: 81
Try exporting C_INCLUDE_PATH so that it points to the CUDA toolkit include directory on your system, something like:
export C_INCLUDE_PATH=${C_INCLUDE_PATH}:/usr/local/cuda/include
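Since the failing compile in the question comes from cc1plus (the C++ front end of gcc), the C++-side variables may matter as well; a hedged variant using the versioned path from the question (adjust to your install):
# CPATH is read by both the C and C++ front ends of gcc;
# CPLUS_INCLUDE_PATH is the C++-only counterpart of C_INCLUDE_PATH.
export CPATH=${CPATH}:/usr/local/cuda-5.0/include
export CPLUS_INCLUDE_PATH=${CPLUS_INCLUDE_PATH}:/usr/local/cuda-5.0/include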
Upvotes: 0