Reputation: 887
I have a script that uses Python and wget to download a website and then perform some tasks with the files. I am using the line os.system("wget -m -w 2 -P " + directory) to call wget, recursively downloading every page in the domain. This works fine, but it has now become necessary to monitor wget for errors when it fails to download a file while following a link (think of a 404 error when trying to access a page).
It is not a matter of getting the exit code, but of looking at each 'block' of output that wget supplies.
Is there an easy way to look through the wget output with Python without having to redirect it to a file, and then search the file for an identifying string of text?
Upvotes: 0
Views: 2172
Reputation: 77902
If you only want the exit code, that's what os.system() returns (warning: it follows the standard Linux process convention, so 0 means 'no error' and anything else means an error; on Unix the value is encoded as a wait status rather than the bare exit code).
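On Unix, the integer returned by os.system() is a wait status, so the command's real exit code has to be decoded from it. A minimal sketch of that decoding, assuming a Unix-like system (the helper name decode_system_status is made up for illustration):

```python
import os

def decode_system_status(status):
    """Turn the value returned by os.system() on Unix into an exit code.

    Returns the exit code if the command exited normally, or the negated
    signal number if it was killed by a signal. Unix only: the os.W*
    helpers used here are not available on Windows.
    """
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    if os.WIFSIGNALED(status):
        return -os.WTERMSIG(status)
    # Neither a normal exit nor a signal (for example, a stopped process).
    raise ValueError("unexpected wait status: {}".format(status))
```

For example, decode_system_status(os.system("exit 3")) gives 3 on Linux, whereas the raw return value of os.system() would be 768.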
If you want more detailed information, you'll have to use the subprocess module (https://docs.python.org/2/library/subprocess.html#module-subprocess) to pipe the subprocess's stderr back to your Python code. Or you could use Python instead of wget - there are quite a few Python-based crawlers available.
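A minimal sketch of the subprocess approach, assuming Python 3 and that the lines of interest can be recognised by a substring. The function name find_error_lines and the default marker "ERROR" are illustrative assumptions, not part of wget or of the question's script; adjust the markers to whatever your wget version actually prints.

```python
import subprocess

def find_error_lines(cmd, markers=("ERROR",)):
    """Run cmd and collect the stderr lines that contain any marker.

    cmd is a list of arguments (no shell involved). Returns a tuple
    (exit_code, matching_lines).
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.DEVNULL,   # discard stdout; only stderr is inspected
        stderr=subprocess.PIPE,      # wget writes its log messages to stderr
        universal_newlines=True,     # decode bytes to text, line by line
    )
    hits = []
    for line in proc.stderr:         # read output as it arrives, no temp file
        if any(marker in line for marker in markers):
            hits.append(line.rstrip())
    exit_code = proc.wait()
    return exit_code, hits
```

It could be called as find_error_lines(["wget", "-m", "-w", "2", "-P", directory, url]), where url stands for the site being mirrored (a placeholder here). Reading proc.stderr line by line avoids redirecting the output to a file and searching it afterwards.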
Upvotes: 2
Reputation: 285
From what I can tell, os.system returns the exit code of the command.
So, the following should work:
code = os.system("wget -m -w 2 -P {}".format(directory))
Upvotes: 0