Reputation: 237
The problem: I have a back-end process that at some point collects and builds a big tar file. The tar command receives a few directories and an exclude file. The process can take up to a few minutes, and I want my front-end process (GUI) to report the progress of the tarring (this is a big issue for a user who presses the download button and sees nothing happening...).
I know I can use -v -R in the tar command and count files and size as it runs, but I am looking for some kind of tar pre-run / dry-run mode to help me estimate either the expected number of files or the expected tar size.
The command I am using: tar -jcf 'FILE.tgz' 'exclude_files' 'include_dirs_and_files'
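To illustrate, here is a rough sketch of the counting idea (names are placeholders and excludes are omitted; expected_total is exactly the number I don't know how to estimate up front):

expected_total=12345   # <-- the unknown this question is about
count=0
tar -cvjf FILE.tar.bz2 include_dirs_and_files |
while IFS= read -r entry; do
    count=$((count + 1))
    # with -f pointing at a file, GNU tar's -v listing goes to stdout
    printf 'progress: %d%%\n' $((count * 100 / expected_total))
done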
Thanks to everyone who is willing to assist.
Upvotes: 9
Views: 18061
Reputation: 121
You can use a specific trick of tar: if you output to /dev/null, it doesn't read the file content, so:
root@grigio:/bidone# time tar cpf /dev/null --totals Amerighuccio/
Total bytes written: 26924871680 (26GiB, 11GiB/s)
real 0m2,320s
user 0m0,914s
sys 0m1,401s
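To use that number programmatically, a minimal sketch (assuming GNU tar, which writes the --totals line to stderr):

# --totals prints "Total bytes written: N ..." on stderr; extract N
expected_bytes=$(tar cf /dev/null --totals Amerighuccio/ 2>&1 |
    sed -n 's/^Total bytes written: \([0-9]*\).*/\1/p')
echo "$expected_bytes"

Note this is the size of the uncompressed tar stream; with -j or -z the final file will typically be smaller.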
Upvotes: 1
Reputation: 33439
Why don't you run a
DIRS=("./test-dir" "./other-dir-to-test")
find "${DIRS[@]}" -type f | wc -l
beforehand? This lists all the files (-type f) one per line and counts them. DIRS is a bash array, so you can store the folders in a variable; quoting "${DIRS[@]}" keeps paths with spaces intact.
If you want to know the size of all the stored files, you can use du:
DIRS=("./test-dir" "./other-dir-to-test")
du -c -d 0 "${DIRS[@]}" | tail -1 | awk '{print $1}'
This prints the disk usage with du, calculates a grand total (the -c flag), takes the last line with tail (e.g. 4378921 total), and keeps just the first column with awk.
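To tie this back to the progress question, a sketch that captures both estimates up front (a progress loop reading tar's -v output could then compare against these; directories in the listing make percentages approximate):

DIRS=("./test-dir" "./other-dir-to-test")
expected_files=$(find "${DIRS[@]}" -type f | wc -l)
# GNU du reports 1K blocks by default, so this is an on-disk estimate
expected_kb=$(du -c -d 0 "${DIRS[@]}" | tail -1 | awk '{print $1}')
echo "expecting $expected_files files, roughly ${expected_kb}K on disk"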
Upvotes: 1
Reputation: 11171
You can pipe the output to the wc tool instead of actually making a file.
With file listing (verbose):
[git@server]$ tar czvf - ./test-dir | wc -c
./test-dir/
./test-dir/test.pdf
./test-dir/test2.pdf
2734080
Without:
[git@server]$ tar czf - ./test-dir | wc -c
2734080
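Building on this, a sketch for live progress (assuming the pv utility is installed): measure the size once with the wc trick, then pipe the real run through pv -s so it can show a percentage:

# first pass: learn the compressed size without writing a file
size=$(tar czf - ./test-dir | wc -c)
# second pass: pv displays a progress bar against the expected size
tar czf - ./test-dir | pv -s "$size" > test-dir.tgz

This compresses twice, so it trades CPU time for an accurate progress bar.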
Upvotes: 17