Reputation: 548
I have a folder containing lots of files, from file_1.gz to file_250.gz, and the number keeps increasing.
A zgrep command that searches through them looks like this:
zgrep -Pi "\"name\": \"bob\"" ../../LM/DATA/file_*.gz
I want to execute this command in a Python subprocess, like this:
out_file = os.path.join(out_file_path, file_name)
search_command = ['zgrep', '-Pi', '"name": "bob"', '../../LM/DATA/file_*.gz']
process = subprocess.Popen(search_command, stdout=out_file)
The problem is that out_file is created but stays empty, and this error is raised:
<type 'exceptions.AttributeError'>
'str' object has no attribute 'fileno'
What is the solution?
Upvotes: 0
Views: 1935
Reputation: 414265
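There are two issues:
1. You should pass something with a valid .fileno() method (such as an open file object) instead of the filename.
2. The shell expands the * wildcard, but subprocess does not invoke the shell unless you ask. You could use glob.glob() to expand the file patterns manually.
Example: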
#!/usr/bin/env python
import os
from glob import glob
from subprocess import check_call
search_command = ['zgrep', '-Pi', '"name": "bob"']
out_path = os.path.join(out_file_path, file_name)
with open(out_path, 'wb', 0) as out_file:
    check_call(search_command + glob('../../LM/DATA/file_*.gz'),
               stdout=out_file)
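A possible extension of the example above, offered as a sketch rather than as part of the original answer: glob() returns names in arbitrary order, and it returns an empty list when nothing matches, in which case zgrep would be started with no file arguments and read standard input instead. The helper name expand_numbered is hypothetical, and the numeric sort assumes the file_<number>.gz naming from the question.

```python
import os
import re
from glob import glob


def expand_numbered(pattern):
    """Expand a glob pattern and sort the matches by their last number.

    Hypothetical helper, not part of the original answer.
    """
    paths = glob(pattern)
    if not paths:
        # Without file arguments zgrep would read standard input instead.
        raise ValueError('no files match %r' % pattern)

    def sort_key(path):
        # Sort by the last run of digits in the file name, so that
        # file_2.gz comes before file_10.gz; names without digits go first.
        numbers = re.findall(r'\d+', os.path.basename(path))
        return (int(numbers[-1]) if numbers else -1, path)

    return sorted(paths, key=sort_key)
```

It could then replace the bare glob() call, for example check_call(search_command + expand_numbered('../../LM/DATA/file_*.gz'), stdout=out_file).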
Upvotes: 1
Reputation: 548
My problem consists of two parts:
The second part concerns the files that zgrep searches. When we run a command like zgrep "pattern" path/to/files/*.gz in a terminal, bash automatically replaces *.gz with all files ending in .gz. When I run the command in a subprocess, nothing replaces *.gz with the real file names, and as a result the error gzip: ../../LM/DATA/file_*.gz: No such file or directory is raised. I solved it with:
for file in os.listdir(archive_files_path):
    if file.endswith(".gz"):
        search_command.append(os.path.join(archive_files_path, file))
Upvotes: 0
Reputation: 2724
You need to pass a file object:
process = subprocess.Popen(search_command, stdout=open(out_file, 'w'))
Citing the manual, emphasis mine:
stdin, stdout and stderr specify the executed program’s standard input, standard output and standard error file handles, respectively. Valid values are PIPE, an existing file descriptor (a positive integer), an existing file object, and None. PIPE indicates that a new pipe to the child should be created. With the default settings of None, no redirection will occur; the child’s file handles will be inherited from the parent.
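To illustrate the quoted passage, here is a minimal Python 3 sketch of two of the accepted stdout values: an open file object and PIPE. It runs the Python interpreter itself as a stand-in child process so that it does not depend on zgrep being installed; the function names and the "hello" payload are made up for the illustration.

```python
import subprocess
import sys

# Stand-in child process: prints one line and exits.
CHILD = [sys.executable, '-c', 'print("hello")']


def run_to_file(path):
    """Redirect the child's stdout into a file via an open file object."""
    with open(path, 'wb') as out_file:
        # subprocess needs the descriptor behind out_file (its fileno());
        # a plain filename string has no such method.
        return subprocess.call(CHILD, stdout=out_file)


def run_to_pipe():
    """Capture the child's stdout in the parent through a pipe."""
    process = subprocess.Popen(CHILD, stdout=subprocess.PIPE)
    output, _ = process.communicate()
    return output
```

Opening the file in a with block also ensures that it is closed once the child has finished, which the one-line open(out_file, 'w') form leaves to the garbage collector.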
Combined with LFJ's answer - using the convenience functions is recommended, and you need to use shell=True to make the wildcard (*) work:
subprocess.call(' '.join(search_command), stdout=open(out_file, 'w'), shell=True)
Or, since you're using the shell anyway, you can use shell redirection as well:
subprocess.call("%s > %s" % (' '.join(search_command), out_file), shell=True)
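One caveat with joining the list into a shell string: the shell removes the double quotes in "name": "bob" and splits the pattern at the space, so zgrep would receive name: as the pattern and bob as a file name. A possible workaround, sketched here for Python 3 (shlex.quote; Python 2 offers the equivalent pipes.quote), is to quote every argument except the wildcard. The helper name shell_command is hypothetical.

```python
import shlex


def shell_command(args, file_glob):
    """Build a shell command line: quote args, leave the glob unquoted.

    Hypothetical helper; file_glob must be trusted input, because the
    shell interprets it.
    """
    quoted = ' '.join(shlex.quote(arg) for arg in args)
    return '%s %s' % (quoted, file_glob)


command = shell_command(['zgrep', '-Pi', '"name": "bob"'],
                        '../../LM/DATA/file_*.gz')
# shlex.split() applies POSIX shell quoting rules (without expanding the
# wildcard), so it shows the arguments zgrep would receive.
arguments = shlex.split(command)
```

The resulting command string can then be passed to subprocess.call(command, stdout=..., shell=True) in place of ' '.join(search_command).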
Upvotes: 1
Reputation: 2253
If you want to execute a shell command and capture its output, try subprocess.check_output(). It is very simple, and you can easily save the output to a file.
command_output = subprocess.check_output(your_search_command, shell=True)
# check_output() returns bytes, so append in binary mode
with open(out_file, 'ab') as f:
    f.write(command_output)
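Note that check_output() raises CalledProcessError for any non-zero exit status, and grep exits with status 1 when no line matches (zgrep is expected to behave similarly). A minimal Python 3 sketch that treats status 1 as "no matches" rather than as a failure follows; the function name is hypothetical, and it takes an argument list without a shell, so any wildcard must already be expanded.

```python
import subprocess


def grep_output(argv):
    """Return the command's stdout; treat exit status 1 as 'no matches'.

    Hypothetical helper for grep-like commands, where 0 means matches
    were found, 1 means none were found, and anything else is an error.
    """
    try:
        return subprocess.check_output(argv)
    except subprocess.CalledProcessError as error:
        if error.returncode == 1:
            # Return whatever was printed before the exit (normally nothing).
            return error.output
        raise
```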
Upvotes: 0