ananvodo

Reputation: 381

How to save a subprocess output as a list of strings for further analysis?

I have this code:

import os
import shlex, subprocess

cmd = "/usr/local/bin/gmx grompp -h"
args = shlex.split(cmd)
proc1 = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
output = proc1.stdout.read()
print(output)

Basically, I am using a program called GROMACS. As you can see, I am combining stdout and stderr so that I can just call stdout.read() and get everything.

However, what print(output) shows is a mess with no formatting:

b' :-) GROMACS - gmx grompp, 2018.3 (-:\n\n GROMACS is written by:\n Emile Apol Rossen Apostolov Paul Bauer Herman J.C. Berendsen\n Par Bjelkmar Aldert van Buuren Rudi van Drunen Anton Feenstra \n Gerrit Groenhof Aleksei Iupinov Christoph Junghans Anca Hamuraru \n Vincent Hindriksen Dimitrios Karkoulis Peter Kasson Jiri Kraus \n Carsten Kutzner Per Larsson Justin A. Lemkul Viveca Lindahl \n Magnus Lundborg Pieter Meulenhoff Erik Marklund Teemu Murtola \n Szilard Pall Sander Pronk Roland Schulz Alexey Shvetsov \n Michael Shirts Alfons Sijbers Peter Tieleman Teemu Virolainen \n Christian Wennberg Maarten Wolf \n and the project leaders:\n Mark Abraham, Berk Hess, Erik Lindahl, and David van der Spoel\n\nCopyright (c) 1991-2000,

Each \n is where there should be a new line.

What must I do to get a list of strings where each string is one line of the output?

In other words:

output = [" :-) GROMACS - gmx grompp, 2018.3 (-:", "GROMACS is written by:", ..........]

That way, I can do things such as output[i].find("2018"), among others.

When I put:

print(type(output))

I get:

<class 'bytes'>

It is very clear that I must do something else to get what I need but I have no idea what to do. I hope I have made myself clear.

Upvotes: 4

Views: 2711

Answers (1)

martineau

Reputation: 123463

I think you can do what you want by adding the following line before printing the output:

output = output.decode().splitlines()

Calling decode() will turn the bytes into a Python string (str), and splitlines() then turns that string into a list of strings, one per line.
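As a minimal sketch of those two steps, here is the conversion applied to a short bytes literal. The literal is only an illustrative stand-in for what proc1.stdout.read() returns (it is abbreviated from the output in the question), and the names raw, lines and hits are made up for this example.

```python
# Illustrative stand-in for proc1.stdout.read(); not the real, full GROMACS output.
raw = b" :-) GROMACS - gmx grompp, 2018.3 (-:\n\nGROMACS is written by:\n"

# bytes -> str (UTF-8 by default) -> list of str, one entry per line.
lines = raw.decode().splitlines()

# Indices of the lines that mention "2018".
hits = [i for i, line in enumerate(lines) if "2018" in line]

print(lines)
print(hits)
```

Note that blank lines in the output come through as empty strings, so they may need to be filtered out before further analysis.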

decode() is a bytes method that decodes the bytes into a string, assuming they were encoded as UTF-8 (the default); it's not covered in the subprocess documentation (that I know of).
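A possible alternative, assuming Python 3.7 or newer, is to let subprocess do the decoding by passing text=True to subprocess.run(), so that stdout is already a str. The sketch below uses a child Python process as a stand-in command, since the gmx binary may not be installed where this is run; the command and the variable names are assumptions for illustration only.

```python
import subprocess
import sys

# Stand-in command (an assumption): a child Python process that prints two lines.
# In the question this would be shlex.split("/usr/local/bin/gmx grompp -h").
cmd = [sys.executable, "-c", "print('line one'); print('line two')"]

# text=True makes subprocess decode the output, so result.stdout is a str.
result = subprocess.run(
    cmd,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,  # merge stderr into stdout, as in the question
    text=True,
)

lines = result.stdout.splitlines()
print(lines)
```

With text=True the output is decoded using the locale's default encoding; if the program's output encoding is known, it can be given explicitly with the encoding argument (for example encoding="utf-8").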

Upvotes: 5
