Reputation: 151
I'm trying to execute an awk command to process some text files inside a Python script. The following line prints the last two columns of the input file and sorts by the second column. This command works:
subprocess.call(["awk",'{print $NF,$(NF-1) | "sort -k 2 -n" }', file2],stdout=f3)
Now I would like to remove the NF column from the sorted output. I added the following line, and it gives me a syntax error on the pipe:
subprocess.call(["awk",'{print $NF,$(NF-1) | "sort -k 2 -n" | '$NF="";print'}', file2],stdout=f3)
What am I missing in my syntax?
Upvotes: 2
Views: 366
Reputation: 295659
This doesn't work even without Python being involved anywhere; it's an awk problem, not a Python or subprocess problem.
If your shell code was:
awk '{print $NF,$(NF-1) | "sort -k 2 -n" | $NF=""; print}'
...it would still fail with an awk syntax error on the pipe character:
awk: syntax error at source line 1
context is
{print(NF),$(NF-1) | "sort -k 2 -n" >>> | <<< $NF="";print}
awk: illegal statement at source line 1
By contrast, one could make it work in shell by using a three-process pipeline:
awk '{print $NF, $(NF - 1)}' file2 \
| sort -nk2 \
| awk '{ $NF=""; print }' >file3
...and that works fine in Python too:
p1 = subprocess.Popen(['awk', '{print $NF, $(NF - 1)}', file2],
                      stdout=subprocess.PIPE)
p2 = subprocess.Popen(['sort', '-nk2'],
                      stdin=p1.stdout, stdout=subprocess.PIPE)
p3 = subprocess.Popen(['awk', '{ $NF=""; print }'],
                      stdin=p2.stdout, stdout=open(file3, 'w'))
p1.stdout.close()
p2.stdout.close()
p3.wait()
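If the pipeline is needed in more than one place, it can be wrapped in a function that also reports each stage's exit status. This is only a sketch: the name run_pipeline and its two path parameters are made up for illustration, and it assumes awk and sort are on PATH.

```python
import subprocess


def run_pipeline(src_path, dst_path):
    """Run awk | sort | awk on src_path, writing the result to dst_path.

    Returns the exit codes of the three stages, in pipeline order.
    (run_pipeline is a hypothetical helper, not part of subprocess.)
    """
    with open(dst_path, "w") as out:
        p1 = subprocess.Popen(["awk", "{print $NF, $(NF - 1)}", src_path],
                              stdout=subprocess.PIPE)
        p2 = subprocess.Popen(["sort", "-nk2"],
                              stdin=p1.stdout, stdout=subprocess.PIPE)
        p3 = subprocess.Popen(["awk", '{ $NF=""; print }'],
                              stdin=p2.stdout, stdout=out)
        # Close the parent's copies of the pipe ends so an upstream stage
        # receives SIGPIPE if the stage after it exits early.
        p1.stdout.close()
        p2.stdout.close()
        return [p1.wait(), p2.wait(), p3.wait()]
```

A caller could then check that every returned code is 0 before trusting the output file.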
...though it's a lot more trouble than just doing all your logic in native Python, and not needing awk or sort at all:
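with open(file2) as src:
    # each entry is [second-to-last column, last column]
    content = [line.split()[-2:] for line in src]
content.sort(key=lambda cols: float(cols[0]))  # numeric, like sort -n
with open(file3, 'w') as dst:
    dst.write('\n'.join(cols[1] for cols in content) + '\n')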
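The native-Python approach can also be packaged as a small function that works on any iterable of lines, which makes it easy to test without touching files. This is a sketch under a few assumptions: the function name is made up, lines with fewer than two fields are skipped, and the second-to-last field must parse as a number (sort -n would treat non-numeric text as 0 instead of raising an error).

```python
def last_column_by_penultimate(lines):
    """Return the last field of each line, ordered numerically by the
    second-to-last field (the same result as awk | sort -nk2 | awk,
    minus the trailing space awk leaves behind)."""
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # assumption: ignore lines with fewer than two fields
        rows.append((float(fields[-2]), fields[-1]))
    # Python's sort is stable, so ties keep their input order.
    rows.sort(key=lambda row: row[0])
    return [last for _, last in rows]
```

For example, passing an open file object works because iterating over a file yields its lines: last_column_by_penultimate(open(file2)).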
Upvotes: 3