tuxx

Reputation: 503

Python subprocess.Popen with quotes and backslash

I want to sort a tab-separated file from a Python script by calling the 'sort' command. If I use this:

subprocess.Popen(["sort", r"-t$'\t'", "-k1,2", "input", "-o", "output"]).wait()

I get this error:

sort: multi-character tab `$\'t\''

If I use shell=True:

subprocess.Popen(["sort", r"-t$'\t'", "-k1,2", "input", "-o", "output"], shell=True).wait()

The process just hangs.

I would prefer using the first method, without shell=True. Any suggestions?

EDIT: The file is huge.

Upvotes: 1

Views: 1710

Answers (2)

chepner

Reputation: 531075

Python can create a string with a tab; $'\t' is only necessary when you are working directly in the shell.

subprocess.Popen(["sort", "-t\t", "-k1,2", "input", "-o", "output"]).wait()
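
A minimal sketch of the same call wrapped in a helper, assuming a POSIX-style sort is on the PATH. The function name sort_tsv and its key parameter are hypothetical, added only for illustration; subprocess.run with check=True is used here instead of Popen(...).wait() so that a failing sort raises an exception.

```python
import subprocess


def sort_tsv(src, dst, key="1,2"):
    """Sort a tab-separated file with the external sort command, without a shell.

    sort_tsv and its parameters are illustrative names, not part of any library.
    """
    # "-t\t" contains a real tab character created by Python, so no
    # shell-style $'\t' quoting is needed (or understood) here.
    argv = ["sort", "-t\t", "-k" + key, src, "-o", dst]
    # check=True raises CalledProcessError if sort exits with a non-zero status.
    subprocess.run(argv, check=True)
    return argv
```

As for the hang with shell=True: on POSIX, when a list is passed together with shell=True, only the first item is treated as the command and the remaining items become arguments to the shell itself. So sort most likely ran with no arguments and sat waiting for input on stdin.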

Upvotes: 2

jsbueno

Reputation: 110271

subprocess.call(["sort", "-t\t", "-k1,2", "input", "-o", "output"])

This looks cleaner: call is a higher-level function in the subprocess module than "Popen", and it makes your code simpler to read.

Then again, while calling an external "sort" may have certain advantages for large files (larger than the amount of available memory), unless you are dealing with those, you are probably doing it wrong.

Unlike shell scripts, Python is self-contained in the sense that it can perform most tasks on your data internally instead of passing the data through simple external POSIX programs.

To sort your file named "input" and have the results ready to use in memory, just do:

# read the data into a list, one line per item:
with open("input", "rt") as f:
    data = f.readlines()
# sort it, splitting each line on tab characters and taking the first two fields as key:
data.sort(key=lambda line: line.split("\t")[:2])

# and "data" contains a sorted list of your lines
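
If the sorted lines should end up in a file, as with sort -o, the same idea can be sketched as below. The function name sort_tsv_in_memory is hypothetical, and the sketch assumes the whole file fits in memory and that every line ends with a newline. Python compares the keys by code point, so the order may differ from what an external sort produces under a non-C locale.

```python
def sort_tsv_in_memory(src, dst):
    """Sort the lines of a tab-separated file by their first two fields.

    sort_tsv_in_memory is an illustrative name; the whole file is loaded
    into memory, so this is not suitable for files larger than RAM.
    """
    with open(src, "rt") as f:
        lines = f.readlines()
    # Strip the trailing newline before splitting so it never ends up in the key.
    lines.sort(key=lambda line: line.rstrip("\n").split("\t")[:2])
    with open(dst, "wt") as f:
        f.writelines(lines)
    # Return the number of lines written, as a small sanity check for callers.
    return len(lines)
```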

Upvotes: 0
