Tomek

Reputation: 641

Bash: writing to and appending to file using > operator

In a Bash script, if I do echo "abc" > file.txt or echo "abc" >> file.txt, am I guaranteed that when the next line of the script is executed, "abc" is present in file.txt?

Clarification: what I had in mind is whether I can be sure that, after I do echo "some text" > file.txt, proceed to the next line in my script, and call some other process (also from my script) that reads file.txt, that process will read "some text" from the file. Also, once I write to the file like that and terminate the script, does script termination perform the flush?
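
For concreteness, this is the pattern being asked about; grep stands in for the "some other process" that reads the file:

    echo "some text" > file.txt   # the redirection opens, writes and closes file.txt
    grep "some text" file.txt     # the very next command already reads "some text"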

Upvotes: 2

Views: 568

Answers (2)

Luis Colorado

Reputation: 12668

The immediate answer is yes. If the process you launched with its output redirected has finished (the shell just does a wait(2) system call to know that), its complete process image has been removed from the kernel's internal tables and all of its file descriptors are closed, so it cannot make any more system calls.

In general, you cannot guarantee whether a process is buffering in its user space or not. The kernel does guarantee that, once a write(2) system call has completed, the file contents seen by all of the system's processes include that data. To achieve this, the kernel locks the inode image in kernel memory, so the process making the system call can write to the kernel-allocated buffers without being disturbed by other kernel activity happening at the same time. This grants exclusive access to the file contents while a process is read(2)ing or write(2)ing, and the inode stays locked for the whole system call (which makes the call atomic). Afterwards, the updated file contents are visible to other processes.
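
A rough way to observe this from a script (the file name is made up): two loops append concurrently to the same file, and because each write(2) is performed atomically on the locked inode, whole lines never end up torn in the middle.

    : > shared.txt
    for i in $(seq 1000); do echo "writer-A line $i"; done >> shared.txt &
    for i in $(seq 1000); do echo "writer-B line $i"; done >> shared.txt &
    wait
    # every line is intact; this prints 0 (no malformed lines)
    grep -c -v -E '^writer-[AB] line [0-9]+$' shared.txt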

The inode locking while the process is writing guarantees that the write system call completes before other things can happen, such as:

  • Some other process tries to write to the same file.
  • Some other process unlinks the open file. Every process that had the file open before the unlink keeps access to it, even though it is no longer present in the filesystem; even if another process then creates a file with the same name, the writes still go to the old, already-open file (see the sketch below).
  • Some other process tries to read the file. It will see either all of the written data or none of it, but never a partial write.
  • Some other process changes permissions on the file (write permission is only checked at open(2) time).
  • Some other process truncates the file. This is serialized in time, so the truncation happens either before or after the write, never in the middle of it.
  • Some other process tries to unmount the filesystem to which the file belongs.

All these possibilities (and more) have to be contemplated in the kernel code to guarantee that the data stays consistent.
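
As a small sketch of the unlink point above (hypothetical file name): a writer holding an open descriptor keeps writing to the original file even after it has been unlinked and a new file with the same name has been created.

    exec 3> old.txt                        # open old.txt for writing on descriptor 3
    rm old.txt                             # unlink it; the open descriptor still refers to the data
    echo "replacement" > old.txt           # creates a brand-new file with the same name
    echo "still goes to the old file" >&3  # this write lands in the unlinked file
    exec 3>&-                              # closing the descriptor releases the old file's storage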

As the process you executed has finished, the answer is that all the data it wrote will be visible to other processes: the process cannot make any new system calls (neither close(2) nor write(2)), and its writes are visible system-wide even if no data has been flushed to disk yet.


On the other hand, the example you used doesn't fully fit the explanation so far: echo is normally an internal shell command, executed by the shell in its own user space. That could make things happen differently, because the process making the system calls is the shell itself, and it does not exit(2) after the command. But even then, the shell may spawn a new process just to execute this redirected command (without calling exec(2) to run a different program), and that process exit(2)s after executing it, so all of its file descriptors are closed as well. This makes the example applicable here too.

In the case where the shell does not spawn a new process for the command (there are such cases too, at least in bash(1)), the behaviour is made to appear as if a new process had been spawned: the shell itself takes the responsibility of properly close(2)ing the file descriptor.
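
As a rough illustration of the two cases above (the file names are made up): whether echo runs as a builtin inside the shell or as an external program in a child process, the next command already sees the data.

    echo "abc" > file1.txt        # builtin: the shell itself opens, writes and closes file1.txt
    /bin/echo "abc" > file2.txt   # external command: a child process writes, exits, and its descriptors close
    cat file1.txt file2.txt       # both files already contain "abc" at this point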

Upvotes: 1

rici

Reputation: 241761

Once the command which includes a redirection to file.txt terminates, anything written to stdout is present in file.txt, and will be seen by a subsequent process which reads file.txt.

But there are some caveats:

  1. It's entirely possible for another simultaneously-executing process to delete or overwrite the file.

  2. Unix/Linux does not guarantee that data written to a file will be committed to permanent storage immediately when the file is closed. [Note 1] So if the machine on which the file is stored crashes and is rebooted between writing and reading, it is possible that the reading task will not see data written before the crash.

  3. If the writing task terminates abnormally, it is possible that it will not have flushed its stdout buffer. So if the task crashes, it is possible that nothing will have been written to the file (see the sketch after this list).
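
To see caveat 3 concretely, here is a rough demonstration (assuming python3 is available as the writing task; the file name is made up). Because its stdout is redirected to a regular file, the child fully buffers its output, and a SIGKILL prevents the buffer from ever being flushed:

    python3 -c 'import time; print("buffered"); time.sleep(60)' > out.txt &
    pid=$!
    sleep 1
    kill -9 "$pid"   # SIGKILL: the process gets no chance to flush its stdio buffer
    cat out.txt      # most likely empty: "buffered" never reached the file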

In summary, it would be better to say that data written to a file in the course of a bash command (whether named directly or via a redirect) will be visible to subsequent commands as long as the first task terminated normally, no other running process is manipulating the same file, and the host does not crash before the subsequent command is executed.
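
One way to narrow the host-crash window, sketched here with a hypothetical reader named process_file: sync(1) asks the kernel to commit pending data to stable storage before the next command runs (GNU coreutils' sync also accepts a file argument).

    echo "some text" > file.txt
    sync file.txt             # with GNU coreutils, syncs just this file; plain `sync` also works
    process_file file.txt     # hypothetical reader; the data should now survive a crash and reboot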


Notes

  1. Normally, it doesn't matter that the file is not immediately committed to permanent storage, because the OS must act as though pending modifications were present in the file. Furthermore, during normal shutdown, the OS will commit all pending filesystem modifications. However, if the host machine crashes, or there is some unrecoverable filesystem error on reboot, then it is possible that some filesystem modifications will be lost.

    All of the above applies to the machine on which the file is stored, which might be different from the machine on which the file was written and read, in the case that the file is accessed via a network filesystem.

Upvotes: 4
