Reputation: 23
I'm using Unix system() calls to gunzip and gzip files. With very large files these sometimes get aborted (i.e. on the cluster compute nodes), while other times (i.e. on the login nodes) they go through. Is there some soft limit on the time a system() call may take? What else could it be?
Upvotes: 2
Views: 3648
Reputation: 1
Sounds like I'm running into the same intermittent issue, which points to a timeout of some kind. My script runs every day. I'm starting to believe gzip has a timeout.
Details:
I'll simply be working around it with retry logic and general scripting improvements, but I want the next googler to know they're not crazy. This is happening to other people!
Upvotes: 0
Reputation: 12866
If it's a Linux system I would recommend using strace to see what's going on and which syscall blocks.
You can even attach strace to already running processes:
# strace -p $PID
Upvotes: 0
Reputation: 17420
"I'm using unix system() calls to gunzip and gzip files."
Probably a silly question: why not use zlib directly from your application?
And system() isn't a system call. It is a wrapper for fork()/exec()/wait(). Check the system() man page. If it doesn't unblock, it might be that your application interferes somehow with wait() - e.g. do you have a SIGCHLD handler?
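To make the fork()/exec()/wait() point concrete, here is a minimal sketch of roughly what system() does under the hood. The name run_like_system is made up for illustration. The real system() also ignores SIGINT and SIGQUIT and blocks SIGCHLD while it waits, which this sketch leaves out.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: approximately system(cmd), minus the signal handling. */
int run_like_system(const char *cmd)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                      /* could not create a child process */

    if (pid == 0) {
        /* Child: hand the command line to the shell, as system() does. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                     /* only reached if exec failed */
    }

    /* Parent: block until the child is done, retrying if a signal
       handler interrupts the wait. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;                  /* e.g. ECHILD: the child was reaped elsewhere */
    }
    return status;                      /* decode with WIFEXITED() and friends */
}
```

Note the ECHILD branch: if the application sets SIGCHLD to SIG_IGN, or a handler of its own calls wait() first, the waitpid() here can come back with an error instead of the child's status. That is the kind of interference I mean.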
Upvotes: 0
Reputation: 56
Almost certainly not a problem with use of system(), but with the operation you're performing. Always check the return value, but even more so, you'll want to see the output of the command you're calling. For non-interactive use, it's often best to write stdout and stderr to log files. One way to do this is to write a wrapper script that checks for the underlying command, logs the commandline, redirects stdout and stderr (and closes stdin if you want to be careful), then execs the commandline. Run this via system() rather than the OS command directly.
My bet is that the failing machines have limited disk space, or are missing either the target file or the actual gzip/gunzip commands.
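If you would rather not maintain a separate wrapper script, the same idea can be sketched in C. This is only a rough illustration: run_logged and the log path are hypothetical names, and it assumes the log path contains no single quotes and that the command fits in the buffer.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: log the command line, then run it through system()
   with stdout and stderr appended to log_path and stdin closed off. */
int run_logged(const char *cmd, const char *log_path)
{
    /* Record what we are about to run. */
    FILE *log = fopen(log_path, "a");
    if (log == NULL)
        return -1;
    fprintf(log, "+ %s\n", cmd);
    fclose(log);

    /* Assumption: log_path has no single quotes in it. */
    char full[4096];
    int n = snprintf(full, sizeof full,
                     "( %s ) >>'%s' 2>&1 </dev/null", cmd, log_path);
    if (n < 0 || (size_t)n >= sizeof full)
        return -1;                      /* command line too long for the buffer */

    return system(full);                /* caller must still check this value */
}
```

After a failed run, the log should then show whatever gzip or the shell printed, for example a "No space left on device" or "command not found" message.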
Upvotes: 0
Reputation: 19034
The calling thread should block indefinitely until the task you initiated with system() completes. If what you are observing is that the call returns and the file operation has not completed, it is an indication that the spawned operation failed for some reason.
What does the return value indicate?
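In case it helps, here is a small sketch of how that return value can be decoded with the macros from <sys/wait.h>. The function name describe_status is made up for this example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical helper: turn the value returned by system() into text. */
void describe_status(int status, char *buf, size_t len)
{
    if (status == -1) {
        /* system() could not create or wait for the child process. */
        snprintf(buf, len, "system() itself failed");
    } else if (WIFEXITED(status)) {
        /* Normal exit; 127 usually means the shell could not run the command. */
        snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
    } else if (WIFSIGNALED(status)) {
        /* The command was terminated by a signal. */
        snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
    } else {
        snprintf(buf, len, "unrecognized status 0x%x", (unsigned)status);
    }
}
```

You would call it as describe_status(system("gzip bigfile"), buf, sizeof buf), where "bigfile" is a placeholder. A "killed by signal" result would suggest that something outside your program terminated gzip. On a compute node that might be a resource limit or the batch scheduler, but that part is a guess on my side.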
Upvotes: 1