Reputation: 902
I'm currently trying to figure out how to correctly close a file descriptor when it points to a remote file and the connection is lost.
I have a simple example program that opens a file descriptor on an sshfs-mounted folder and starts writing to the file.
I'm not able to find how to handle the case when the connection is lost.
#include <errno.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

void *write_thread(void *arg);

int main(void)
{
    pthread_t thread;
    int fd = -1;
    int err;

    if (-1 == (fd = open("/mnt/testfile.txt", O_CREAT | O_RDWR | O_NONBLOCK, S_IRWXU)))
    {
        fprintf(stderr, "Error opening file : %m\n");
        return EXIT_FAILURE;
    }
    else
    {
        /* pthread_create returns 0 or a positive error number; it does not set errno */
        if (0 != (err = pthread_create(&thread, NULL, write_thread, &fd)))
        {
            fprintf(stderr, "Error launching thread : %s\n", strerror(err));
            return EXIT_FAILURE;
        }
        fprintf(stdout, "Waiting 10 seconds before closing\n");
        sleep(10);
        /* This is the call that blocks when the connection is lost */
        if (0 > close(fd))
        {
            fprintf(stderr, "Error closing file descriptor: %m\n");
        }
    }
    return EXIT_SUCCESS;
}

void *write_thread(void *arg)
{
    int fd = *(int *)arg;
    ssize_t ret;

    while (1)
    {
        fprintf(stdout, "Write to file\n");
        if (0 > (ret = write(fd, "Test\n", 5)))
        {
            fprintf(stderr, "Error writing to file : %m\n");
            if (errno == EBADF)
            {
                if (-1 == close(fd))
                {
                    fprintf(stderr, "Close failed : %m\n");
                }
                return NULL;
            }
        }
        else if (0 == ret)
        {
            fprintf(stderr, "Nothing happened\n");
        }
        else
        {
            fprintf(stderr, "%zd bytes written\n", ret);
        }
        sleep(1);
    }
}
When the connection is lost (i.e. I unplug the Ethernet cable between my boards), the close in the main thread always blocks, whether I use the O_NONBLOCK flag or not.
The write call sometimes fails immediately with an EBADF error and sometimes continues for a long time before failing.
My problem is that the write call doesn't always fail when the connection is lost, so I can't trigger the event from the writer thread, and I can't trigger it from the main thread either because close blocks forever.
So my question is: how do I correctly handle this case in C?
Upvotes: 1
Views: 2442
Reputation: 902
After some digging around, I found that the SSH connection behind the mount can be configured to drop and disconnect from the server when the other side stops responding.
Setting ServerAliveInterval X on the client side makes the client disconnect if the server is unresponsive after X seconds.
Setting ClientAliveInterval X on the server side makes the server disconnect if the client is unresponsive after X seconds.
ServerAliveCountMax Y and ClientAliveCountMax Y can also be used to retry Y times before dropping the connection.
With this configuration applied, the sshfs mount is automatically removed by Linux when the connection is unresponsive.
With this configuration, the write call fails with Input/output error first and then with Transport endpoint is not connected.
This is enough to detect that the connection is lost and to clean up before exiting.
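A minimal sketch of that detection, assuming the two messages correspond to the errno values EIO and ENOTCONN (which is how glibc's strerror words them). The helper names connection_lost and write_or_detect_loss are made up for illustration and are not part of the original program:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: true when a failed write() suggests the mount is gone.
 * Assumption: "Input/output error" is EIO and "Transport endpoint is not
 * connected" is ENOTCONN, as on glibc. */
static bool connection_lost(int err)
{
    return err == EIO || err == ENOTCONN;
}

/* Writes one buffer. Returns 0 on success, 1 when the connection looks lost,
 * and -1 on any other error. Short writes are treated as success here. */
static int write_or_detect_loss(int fd, const void *buf, size_t len)
{
    ssize_t ret = write(fd, buf, len);
    if (ret >= 0)
        return 0;

    int err = errno; /* save errno before fprintf can change it */
    fprintf(stderr, "Error writing to file: %s\n", strerror(err));
    return connection_lost(err) ? 1 : -1;
}
```

In the writer thread, a return value of 1 would be the point to leave the loop and start the cleanup.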
Upvotes: 0
Reputation: 6058
There is a fundamental problem with remote filesystems: on the one hand, you have to cache things for performance to remain at a usable level; on the other hand, caching in multiple clients can lead to conflicts that are not seen by the server.
NFS, for example, chooses caching by default and if the cache is dirty, it will simply hang until the connection resumes.
Documentation for sshfs suggests similar behavior.
From grepping sshfs' source code, it seems that it doesn't support O_NONBLOCK at all.
None of that has anything to do with C.
IMO your best option is to switch to NFS and mount with e.g. -o soft -o timeo=15 -o retrans=1.
This could cause data corruption/loss in certain situations when there is a network disconnect, mainly when there are multiple clients or when the client crashes, but it does support O_NONBLOCK and in any case will return EIO if the connection is lost while a request is in-flight.
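With a soft mount, the lost connection shows up as an ordinary error return, so the write path only needs to check errno. A minimal sketch, assuming a blocking descriptor; write_all is a hypothetical helper name, and the EINTR retry and short-write handling are general POSIX practice that I added, not something specific to NFS:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write the whole buffer or fail.
 * Returns 0 on success, -1 with errno set on failure. EIO is what a
 * soft-mounted NFS is expected to report once its retries are exhausted. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal: retry */
            return -1;      /* EIO and other errors: give up, errno is kept */
        }
        if (n == 0) {
            errno = EIO;    /* no progress; report an error instead of spinning */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}
```

A caller would check for -1 and then look at errno, treating EIO as the signal that the server is unreachable.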
Upvotes: 1
Reputation: 4704
question is: how to correctly handle this case in C?
Simply put, you cannot. File handles are designed to be unified and simple, no matter where they point. When a device is mounted and the connection to it (physical or virtual) goes down, things become tricky even at the command-line level.
Upvotes: 2