RogueBaneling

Reputation: 4471

Grep files in between wget recursive downloads

I am trying to recursively download several files using wget -m, and I intend to grep all of the downloaded files for specific text. Currently, I wait for wget to complete fully and then run grep. However, wget is time-consuming because there are many files, so instead I would like to show progress by grepping each file as it downloads and printing matches to stdout, before the next file downloads.

Example:

download file1
  grep file1 >> output.txt
download file2
  grep file2 >> output.txt
...
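A minimal sketch of this flow, using a hypothetical fetch function as a stand-in for the real download step (fetch, file1, file2, and pattern are placeholders, not part of any real tool):

```shell
# fetch is a hypothetical stand-in for: wget -q -O "$f" "$url"
fetch() { printf 'contents of %s with pattern\n' "$1" > "$1"; }

: > output.txt                        # start with an empty results file
for f in file1 file2; do
  fetch "$f"                          # download the file
  grep 'pattern' "$f" >> output.txt   # grep it before the next download starts
done
rm -f file1 file2
```

This only works when the file list is known up front, which wget -m's recursive crawl does not provide; the answers below work around that.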

Thanks for any advice on how this could be achieved.

Upvotes: 5

Views: 2444

Answers (2)

RogueBaneling

Reputation: 4471

Based on Xorg's solution I was able to achieve my desired effect with some minor adjustments:

wget -m -O file.txt http://google.com 2> /dev/null & sleep 1 && tail -f -n1 file.txt | grep pattern

This prints every line containing pattern to stdout, while wget itself produces no output visible in the terminal. The sleep is included because otherwise file.txt would not exist by the time the tail command runs.

As a note, this command will miss any results that wget downloads within the first second.
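That one-second gap can be closed by creating file.txt before wget starts and telling tail to read from the first line with -n +1, so nothing is missed and no sleep is needed. A sketch using a background writer loop as a stand-in for the real wget job (which would be wget -m -O file.txt <url> 2> /dev/null &); --pid is a GNU tail extension:

```shell
touch file.txt                        # create the file up front; no sleep needed

# Stand-in for the wget job: appends a few lines over time.
( for i in 1 2 3; do echo "hit $i: pattern"; sleep 0.2; done ) >> file.txt &
writer=$!

# -n +1 reads from line 1, so lines written before tail attaches are not missed;
# --pid (GNU tail) makes tail exit once the writer process finishes.
tail -f -n +1 --pid="$writer" file.txt | grep --line-buffered 'pattern' > output.txt

rm -f file.txt
```

With the real wget in place of the writer loop, --pid would take wget's PID, and the pipeline ends when the mirror completes.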

Upvotes: 1

repzero

Reputation: 8402

As c4f4t0r pointed out

 wget -m -O - <websites>|grep --color 'pattern'

Using grep's --color option to highlight the matched patterns is helpful, especially when dealing with bulky output in the terminal.

EDIT:

Below is a command line you can use. It creates a file called file and saves wget's output messages to it; afterwards it tails the message file.

awk finds any line containing "saved" and extracts the filename from it, then grep searches that file for the pattern.

 wget -m websites  &> file &  tail -f -n1 file|awk -F "\'|\`"  '/saved/{system( ("grep  --colour pattern ") $2)}'
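A self-contained sketch of the awk step above, run against two fake log lines in place of a live wget session (the filenames are made up; the field separator splits on wget's old-style `name' quoting, and print is used here instead of the system("grep ...") call so the extraction itself is easy to see):

```shell
# Two fake wget log lines standing in for "wget -m websites &> file".
printf "%s\n" \
  "(1.2 MB/s) - \`site/a.html' saved [100/100]" \
  "(1.1 MB/s) - \`site/b.html' saved [200/200]" > file

# Split fields on backtick or single quote: $2 becomes the saved filename.
# Prints site/a.html and site/b.html, one per line.
awk -F "[\`']" '/saved/ { print $2 }' file

rm -f file
```

Note that newer wget versions quote filenames with Unicode quotation marks ('name') in some locales, so the separator may need adjusting.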

Upvotes: 1
