I'm running some code to test whether I can use the OpenCV library on a Condor system. It works fine when I run python testje.py in the console. When I submit it to the Condor system, I receive the following error message:
Traceback (most recent call last):
  File "/usr/data/condor/execute/dir_323648/condor_exec.exe", line 24, in <module>
    kp, des = sift.detectAndCompute(img, None)
cv2.error: OpenCV(4.5.3) /tmp/pip-req-build-3umofm98/opencv/modules/core/src/alloc.cpp:73: error: (-4:Insufficient memory) Failed to allocate 48385936 bytes in function 'OutOfMemoryError'
So it failed to allocate 48MB of RAM?
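To double-check that figure, here is a small sketch of the conversion. The variable names are my own; only the byte count comes from the error message.

```python
# Size reported in the cv2.error message, in bytes.
failed_bytes = 48385936

mb = failed_bytes / 1000**2   # decimal megabytes
mib = failed_bytes / 1024**2  # binary mebibytes

print(f"{mb:.1f} MB, {mib:.1f} MiB")
```

Either way, the failed allocation is far below the 500 MB I requested.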
When looking at the .log file:
Partitionable Resources :     Usage   Request Allocated
Cpus                    :                   1         1
Disk (KB)               :    192786    768000   1557164
Gpus (Average)          :                   0         0
Memory (MB)             :         0       500       512
I requested 500 MB of RAM, which should be plenty. Why does it crash?
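One thing I could still check is whether the execute node imposes a per-process memory limit that is lower than what Condor reports as allocated. This is only a guess on my part, nothing in the log confirms it. A minimal sketch using Python's standard resource module (Unix only) that could be run at the top of the job; the helper names format_limit and memory_limits are my own:

```python
import resource


def format_limit(value):
    """Render an rlimit value (in bytes) as a readable string."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 1024**2:.0f} MiB"


def memory_limits():
    """Return the (soft, hard) limits for address space and data segment."""
    report = {}
    for name in ("RLIMIT_AS", "RLIMIT_DATA"):
        soft, hard = resource.getrlimit(getattr(resource, name))
        report[name] = (format_limit(soft), format_limit(hard))
    return report


if __name__ == "__main__":
    # Print the limits the job actually runs under.
    for name, (soft, hard) in memory_limits().items():
        print(f"{name}: soft={soft}, hard={hard}")
```

If the soft limit printed inside the Condor job were much smaller than in my console session, that would point to a limit on the execute node rather than to the script itself.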
Full code, for completeness:
testje.py:
#!/usr/bin/python
# -*- coding: utf-8 -*-
import numpy as np
import pickle
import sys
import cv2
print('Starting')
# setup cv2
sift = cv2.SIFT_create()
img = cv2.imread("0.jpg", cv2.IMREAD_GRAYSCALE)
print(img)
print('start calc')
# calc cv2
kp, des = sift.detectAndCompute(img, None)
# calc np
norms = np.linalg.norm(des, axis=1)
# calc normal? python
index = []
for p in kp:
    temp = (p.pt, p.size, p.angle, p.response, p.octave, p.class_id)
    index.append(temp)
# store using pickle
with open('./random_dat.pickle', 'wb') as handle:
    pickle.dump((123456, index, des, norms), handle)
print("finished")
Note:
OpenCV was installed with pip install --target=my\python\dir\site-packages\ opencv-python
so that Condor can transfer the library and the script can import it as a local library.
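To confirm that the job really picks up the transferred copy and not some system-wide install, something like the following could be run inside the job. It is only a sketch: module_origin is a helper name I made up, and it relies only on the standard importlib machinery.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # On the execute node this should point into the job's scratch directory.
    print(module_origin("cv2"))
```

If the printed path is not inside the Condor execute directory, a different OpenCV build is being imported than the one I tested locally.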
Condor input:
#Normal execution
Universe = vanilla
#I need just one CPU (which is the default)
RequestCpus = 1
#No GPU
RequestGPUs = 0
#I need 750 MB of disk space
RequestDisk = 750MB
#I need 500 MB of RAM (resident memory)
RequestMemory = 500MB
#It will not run longer than 1 day
+RequestWalltime = 150
#Transfer input files in cur_dir
transfer_input_files = 0.jpg, site-packages/
#retrieve data
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
#I'm a nice person, I think...
NiceUser = true
#Mail me whenever the job completes or checkpoints
Notification = Always
# The job will 'cd' to this directory before starting, be sure you can _write_ here.
initialdir = /users/students/r0xxxxxx/Documents/testing_condor/
# This is the executable or script I want to run
executable = /users/students/r0xxxxxx/Documents/testing_condor/testje.py
#Output of Condor's handling of the jobs, will be in 'initialdir'
Log = condor_bin.log
#Standard output of the 'executable', in 'initialdir'
Output = condor_bin.out
#Standard error of the 'executable', in 'initialdir'
Error = condor_bin.err
# Start just 1 instance of the job
Queue 1
Full condor log:
...
000 (3xx.xxx.xxx) 2021-07-19 16:20:57 Job submitted from host: <10.xx.xx.xxx:xxxx?addrs=10.xx.xx.xxx-xxxx&alias=name.xxxx.xxxxxxxx.be&noUDP&sock=schedd_xxxx_xxxx>
...
040 (3xx.xxx.xxx) 2021-07-19 16:21:15 Started transferring input files
Transferring to host: <10.87.24.13:9618?addrs=10.xx.xx.xx-xxxx&alias=other.xxxx.xxxxxx.be&noUDP&sock=slotx_x_xxxxxx_xxxx_xxxx>
...
040 (3xx.xxx.xxx) 2021-07-19 16:21:19 Finished transferring input files
...
001 (3xx.xxx.xxx) 2021-07-19 16:21:20 Job executing on host: <10.xx.xx.xxx:xxxx?addrs=10.xx.xx.xx-xxxx&alias=other.xxxx.xxxxxxxx.be&noUDP&sock=startd_xxxx_xxxx>
...
006 (3xx.xxx.xxx) 2021-07-19 16:21:22 Image size of job updated: 1
0 - MemoryUsage of job (MB)
0 - ResidentSetSize of job (KB)
...
040 (3xx.xxx.xxx) 2021-07-19 16:21:22 Started transferring output files
...
040 (3xx.xxx.xxx) 2021-07-19 16:21:22 Finished transferring output files
...
005 (3xx.xxx.xxx) 2021-07-19 16:21:22 Job terminated.
(1) Normal termination (return value 1)
Usr 0 00:00:01, Sys 0 00:00:00 - Run Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Run Local Usage
Usr 0 00:00:01, Sys 0 00:00:00 - Total Remote Usage
Usr 0 00:00:00, Sys 0 00:00:00 - Total Local Usage
566 - Run Bytes Sent By Job
197400704 - Run Bytes Received By Job
566 - Total Bytes Sent By Job
197400704 - Total Bytes Received By Job
Partitionable Resources :     Usage   Request Allocated
Cpus                    :                   1         1
Disk (KB)               :    192786    768000   1557164
Gpus (Average)          :                   0         0
Memory (MB)             :         0       500       512
Job terminated of its own accord at 2021-07-19T14:21:22Z.
...
Python output (stdout):
Starting
[[ 21 83 40 ... 2 36 57]
[ 42 51 27 ... 53 44 28]
[ 60 31 127 ... 46 28 20]
...
[103 80 22 ... 26 58 105]
[ 58 63 47 ... 44 49 66]
[ 48 49 56 ... 64 51 57]]
start calc