Reputation: 42737
I'm seeing a confusing intermittent error. Sometimes when I call np.save, I get a FileNotFoundError:
    Traceback (most recent call last):
      File "/home/leo/anaconda3/lib/python3.7/site-packages/numpy/lib/npyio.py", line 536, in save
        pickle_kwargs=pickle_kwargs)
      File "/home/leo/anaconda3/lib/python3.7/site-packages/numpy/lib/format.py", line 629, in write_array
        pickle.dump(array, fp, protocol=2, **pickle_kwargs)
    FileNotFoundError: [Errno 2] No such file or directory

    During handling of the above exception, another exception occurred:

      File "/home/leo/dev/vizproc/embed.py", line 59, in save
        np.save(filename, obj)
      File "/home/leo/anaconda3/lib/python3.7/site-packages/numpy/lib/npyio.py", line 539, in save
        fid.close()
    FileNotFoundError: [Errno 2] No such file or directory
The directory it's writing to definitely exists, and the object is a dictionary with a mix of [str] and np.ndarray values, so it's getting pickled on the way out. Looking at the numpy source, it seems that it's trying and failing to close the file it had opened for writing:
    own_fid = False
    if hasattr(file, 'read'):
        fid = file
    else:
        file = os_fspath(file)
        if not file.endswith('.npy'):
            file = file + '.npy'
        fid = open(file, "wb")
        own_fid = True

    if sys.version_info[0] >= 3:
        pickle_kwargs = dict(fix_imports=fix_imports)
    else:
        # Nothing to do on Python 2
        pickle_kwargs = None

    try:
        arr = np.asanyarray(arr)
        format.write_array(fid, arr, allow_pickle=allow_pickle,
                           pickle_kwargs=pickle_kwargs)
    finally:
        if own_fid:
            fid.close()  # <=- FileNotFoundError
The format.write_array(...) call inside the try block is really just some type checking followed by pickle.dump(arr, fid, protocol=2, **pickle_kwargs), which is also raising FileNotFoundError. So both the write and the subsequent close fail.
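To make the shape of that traceback concrete, here is a minimal sketch of how a failure inside the try block plus a second failure in the finally block produces the "During handling of the above exception, another exception occurred" chain. FailingFile and save_like_numpy are hypothetical stand-ins I made up for illustration, not numpy code, and the sketch says nothing about why the real file operations fail.

```python
import errno
import pickle


class FailingFile:
    """Hypothetical file-like object whose write and close both fail."""

    def write(self, data):
        raise FileNotFoundError(errno.ENOENT, "No such file or directory")

    def close(self):
        raise FileNotFoundError(errno.ENOENT, "No such file or directory")


def save_like_numpy(obj, fid):
    # Same try/finally shape as the numpy snippet above (simplified).
    try:
        pickle.dump(obj, fid, protocol=2)  # first error, raised by write()
    finally:
        fid.close()  # second error, raised while the first is propagating


def chained_errors():
    """Return the close() error and the write() error it is chained to."""
    try:
        save_like_numpy({"obj": "test"}, FailingFile())
    except FileNotFoundError as exc:
        # The error from close() is the one that surfaces; Python stores the
        # earlier error from write() on its __context__ attribute.
        return exc, exc.__context__


if __name__ == "__main__":
    outer, inner = chained_errors()
    print("raised last:", repr(outer))
    print("raised first:", repr(inner))
```

The error from close() is the one that reaches the caller, which is why np.save appears to fail at fid.close() even though the write had already failed.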
I'm using Numpy: 1.16.3, Python: 3.7.1 (default, Dec 14 2018, 19:28:38) [GCC 7.3.0] on Ubuntu 18.04.
I'm trying to reason through what kind of race condition could cause this, or why else it might be happening. Is it that the file is getting opened by this process, but then another process erases the file before the writing happens? Seems reasonable, but then this should repro the failure, which it doesn't:
    import os
    import pickle

    fid = open("testfile", "wb")
    os.unlink("testfile")
    pickle.dump({'obj':'test'}, fid, protocol=2)  # no error
    fid.close()  # no error
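A slightly expanded version of that test may help explain why it doesn't fail. On POSIX systems, unlinking a file only removes its directory entry; the open descriptor stays valid until it is closed. This is a sketch under that assumption, and the function name and temporary directory are mine, purely for illustration.

```python
import os
import pickle
import tempfile


def write_after_unlink():
    """Write to a file after unlinking it; return (link count, size, exists)."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "testfile")

    fid = open(path, "wb")
    os.unlink(path)  # removes the directory entry only

    # On a typical local Linux filesystem the link count is now 0,
    # but the descriptor still refers to a live inode.
    links = os.fstat(fid.fileno()).st_nlink

    pickle.dump({"obj": "test"}, fid, protocol=2)
    fid.flush()
    size = os.fstat(fid.fileno()).st_size  # bytes written to the unlinked file

    fid.close()
    exists = os.path.exists(path)
    os.rmdir(workdir)
    return links, size, exists


if __name__ == "__main__":
    print(write_after_unlink())
```

So another process deleting the file between open and write would not, on its own, explain a FileNotFoundError from pickle.dump or close.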
Also, after the error gets raised, there's a zero-byte file on the disk. Any idea what's going on?
Upvotes: 1
Views: 981
Reputation: 42737
I believe the root cause of this was a hardware problem or something deep in the disk system. Frustratingly, the OS wasn't logging any error messages. There are a bunch of details in this GitHub issue, including a simpler repro case, in case anybody wants to dive in. Key points: it only happened on USB-attached disks, and only if the path had a symlink in it.
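Given those two conditions, one way to check whether a save path is affected is to test whether it passes through a symlink, and to resolve the directory before writing. This is only a sketch: has_symlink_component and resolved_target are hypothetical helper names, and whether writing to the resolved path avoids the error on affected hardware is untested.

```python
import os


def has_symlink_component(path):
    """Return True if any existing component of path is a symlink."""
    absolute = os.path.abspath(path)
    # realpath resolves symlinks; abspath only normalizes the string.
    return os.path.realpath(absolute) != absolute


def resolved_target(path):
    """Return path with symlinks in its directory part resolved."""
    directory, name = os.path.split(os.path.abspath(path))
    return os.path.join(os.path.realpath(directory), name)


if __name__ == "__main__":
    target = "output/embedding.npy"  # hypothetical path, for illustration only
    print(has_symlink_component(target), resolved_target(target))
```

With these helpers, a call such as np.save(resolved_target(filename), obj) would write through the real directory rather than the symlinked one.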
Upvotes: 1