leonardltk1

Reputation: 377

Removing progress bar from program output redirected into log file

I am running a program that prints a progress bar. I launched it like this:

python train.py |& tee train.log

The train.log looks like the following.

This is line 1

Training ...

This is line 2

...
[000] valid: 100%|█████████████████████████████████████████████████████████████▉| 2630/2631 [15:24<00:00,  2.98 track/s]
[000] valid: 100%|██████████████████████████████████████████████████████████████| 2631/2631 [15:25<00:00,  3.02 track/s]                                                                                                              
Epoch 000: train=0.11940351 valid=0.10640465 best=0.1064 duration=0.79 days

This is line 3 ...

[001] valid: 100%|█████████████████████████████████████████████████████████████▉| 2629/2631 [15:11<00:00,  2.90 
[001] valid: 100%|█████████████████████████████████████████████████████████████▉| 2630/2631 [15:11<00:00,  2.89 
[001] valid: 100%|██████████████████████████████████████████████████████████████| 2631/2631 [15:12<00:00,  2.88                                                                                                   
Epoch 001: train=0.10971066 valid=0.09931737 best=0.0993 duration=0.79 days

On the terminal, each progress-bar update overwrites the previous one, so the log file contains a lot of repetitions. When I ran wc -l train.log, it reported only 3 lines. However, when I opened this 5 MB text file in a text editor, it showed about 20000 lines.

My objective is to only get these details:

Epoch 000: train=0.11940351 valid=0.10640465 best=0.1064 duration=0.79 days    
Epoch 001: train=0.10971066 valid=0.09931737 best=0.0993 duration=0.79 days

My questions are:

  1. How do I, without stopping the current training run, extract the desired details from the supposedly "3" lines of train.log? Keep in mind that training will continue for 10 more epochs, so I don't want to open the whole chunk of progress-bar output in an editor.

  2. In the future, how should I store my log file (instead of calling python train.py |& tee train.log) so that I can still see the progress bar in the terminal but keep only the important information in a text file?

Edit 1 : Here's a link to the file train.log

Upvotes: 3

Views: 1587

Answers (2)

mkrieger1

Reputation: 23140

The progress bars are probably written to stderr, which you send to tee together with stdout by using |&.

To write only stdout to the file, use the normal pipe | instead.
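A minimal sketch of that behaviour, assuming the progress bar really does go to stderr (this is not confirmed for train.py). fake_train is a hypothetical stand-in for python train.py, and its output is made up:

```shell
# Hypothetical stand-in for train.py: progress on stderr, results on stdout.
fake_train() {
    printf 'valid:  50%%\r' >&2
    printf 'valid: 100%%\r' >&2
    printf '\n' >&2
    echo 'Epoch 000: train=0.1 valid=0.1'
}

log=$(mktemp)
# A plain | forwards only stdout to tee; stderr still goes straight to the terminal.
fake_train | tee "$log"
# The log now holds the Epoch line only, with no progress-bar output.
cat "$log"
```

With |& instead of |, the progress-bar updates would end up in the log as well.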


The progress bar was generated by writing one line and then a carriage return character (\r) but no newline character (\n). To fix that and to be able to process the file further, you can use for example sed 's/\r/\n/g'.

The following works with the file linked in the question:

$ sed 's/\r/\n/g' train.log | grep Epoch
Epoch 000: train=0.11940351 valid=0.10640465 best=0.1064 duration=0.79 days
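To see why wc -l and the editor disagree, here is a small sketch on a made-up sample file (not the real train.log). It swaps in tr for sed as a portable way to do the same substitution, since not every sed understands \r and \n the way GNU sed does:

```shell
# Build a small sample log that imitates a \r-driven progress bar (made-up contents).
log=$(mktemp)
printf 'valid:  50%%\rvalid: 100%%\rEpoch 000: train=0.1 valid=0.1\n' > "$log"

# wc -l counts only \n characters, so the three visible "lines" count as a single line.
wc -l < "$log"

# Turn each \r into \n, then keep only the Epoch summary lines.
tr '\r' '\n' < "$log" | grep Epoch
```

This only reads the file, so it should be safe to run while training is still writing to it.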

Upvotes: 1

leonardltk1

Reputation: 377

Ok, I solved it already.

According to this question,

You make a progress bar by doing echo -ne "your text \r" > log.file.

Because some editors that I used (Notepad, Sublime Text 3) treat \r as a line break, you see the updates as separate lines, but they are actually stored as a single line.

So to undo it, you can turn them into actual line breaks with sed -i "s,\r,\n,g" train.log, and then grep accordingly.
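One caveat if training is still running: as far as I know, GNU sed -i writes a new file and renames it over the original, so a tee that still has the old file open would keep appending to the old copy and later epochs may not show up in train.log. A read-only variant avoids that. This is only a sketch: epoch_lines is a hypothetical helper name, and the sample log is made up.

```shell
# Hypothetical helper: print the Epoch summary lines of a log without modifying it.
epoch_lines() {
    tr '\r' '\n' < "$1" | grep Epoch
}

# Made-up sample that grows between two reads, as a log does during training.
log=$(mktemp)
printf 'valid: 100%%\rEpoch 000: train=0.1\n' > "$log"
epoch_lines "$log"

# The writer appends another epoch; rerunning the helper picks it up.
printf 'valid: 100%%\rEpoch 001: train=0.2\n' >> "$log"
epoch_lines "$log"
```

For the real file that would be epoch_lines train.log, rerun whenever you want to check progress.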

Anyhoo, thanks @mkrieger1 for helping me out anyway !

Upvotes: 0
