Wang Tim

Reputation: 117

Running git command in Python script results in syntax error

When I run

git log  --pretty=tformat: --numstat | awk '{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }'

in Linux terminal, the output is correct:

added lines: 23322, removed lines: 8536, total lines: 14786

Since I don't want to remember such a complex command, I wrote a Python script to do the same thing:

import os
GitCommand = 'git log  --pretty=tformat: --numstat | awk "{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }"'
report = os.system(GitCommand)

But when I run it, awk reports a syntax error:

awk: cmd. line:1: { add += $1; subs += $2; loc += $1 - $2 } END         { printf "added lines: %s, removed lines: %s, total lines: %s
awk: cmd. line:1:                                                                ^ unterminated string
awk: cmd. line:1: { add += $1; subs += $2; loc += $1 - $2 } END         { printf "added lines: %s, removed lines: %s, total lines: %s
awk: cmd. line:1:                                                                ^ syntax error

I have also tried using subprocess, and the output is similar. The problem probably lies in how the command string is written, especially the quotation marks, but I don't know how to fix it.

Upvotes: 2

Views: 686

Answers (3)

frmbelz

Reputation: 2543

The same can be done with Python's subprocess module, which is intended to replace os.system.

The first way uses shell=True, which, according to the Python documentation, can expose the script to shell injection:

import subprocess

results = subprocess.check_output("git log --pretty=tformat: --numstat | awk '{ add += $1; subs += $2; loc += $1 - $2} END { print \"added lines: \"add\", removed lines: \"subs\", total lines: \"loc }'", stdin=subprocess.PIPE, shell=True)

print(results.decode('utf-8'))
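The injection risk with shell=True mostly matters when text from outside the script is pasted into the command string; the command above is a fixed literal. As a minimal sketch, assuming a hypothetical helper build_log_command that takes a branch name from a user (neither the helper nor the branch parameter is part of the original answer), the standard library's shlex.quote can neutralise shell metacharacters before interpolation:

```python
import shlex


def build_log_command(branch):
    """Hypothetical helper: build a shell command string for one branch.

    shlex.quote wraps the value in single quotes when it contains
    characters the shell would otherwise interpret.
    """
    return "git log --pretty=tformat: --numstat " + shlex.quote(branch)


# A harmless name passes through unchanged.
print(build_log_command("main"))
# A hostile value is quoted, so the shell sees one argument, not two commands.
print(build_log_command("main; echo pwned"))
```

Passing an argument list with shell=False sidesteps the problem entirely, because no shell ever parses the values.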

The second way uses shell=False:

cmd1 = ['git', 'log', '--pretty=tformat:', '--numstat']
cmd2 = ['awk', '{ add += $1; subs += $2; loc += $1 - $2} END { print \"added lines: \"add\", removed lines: \"subs\", total lines: \"loc }']

p1 = subprocess.Popen(cmd1, stdout=subprocess.PIPE, shell=False)
p2 = subprocess.Popen(cmd2, stdin=p1.stdout, stdout=subprocess.PIPE, shell=False)
p1.stdout.close()  # Let p1 receive SIGPIPE if p2 exits first.
results = p2.communicate()[0].decode()

print(results)

The documentation, though, recommends using subprocess.run wherever possible, so here is another option:

cmd1 = ['git', 'log', '--pretty=tformat:', '--numstat']
cmd2 = ['awk', '{ add += $1; subs += $2; loc += $1 - $2} END { print \"added lines: \"add\", removed lines: \"subs\", total lines: \"loc }']

p1 = subprocess.run(cmd1, stdout=subprocess.PIPE, shell=False)
p2 = subprocess.run(cmd2, input=p1.stdout, stdout=subprocess.PIPE, shell=False)

print(p2.stdout.decode('utf-8'))

shell=False is the default, so it can be dropped. With this, I can now run git commands from the Django web framework at the press of a button and send the results back.
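Since the awk step only sums two columns, a further variation is to drop awk and do the arithmetic in Python, which removes the nested quoting altogether. This is a sketch of an alternative, not the approach used above; the function names summarize_numstat and git_line_totals are made up for illustration, and it assumes git is on PATH and Python 3.7 or later (for capture_output and text):

```python
import subprocess


def summarize_numstat(numstat_text):
    """Sum the added and removed columns of `git log --numstat` output."""
    added = removed = 0
    for line in numstat_text.splitlines():
        # Each entry looks like "<added>\t<removed>\t<path>".
        fields = line.split("\t", 2)
        if len(fields) < 3:
            continue  # blank separator lines between commits
        if not (fields[0].isdigit() and fields[1].isdigit()):
            continue  # binary files report "-" instead of numbers
        added += int(fields[0])
        removed += int(fields[1])
    return added, removed, added - removed


def git_line_totals(repo_dir="."):
    """Run git in repo_dir and return (added, removed, total)."""
    result = subprocess.run(
        ["git", "log", "--pretty=tformat:", "--numstat"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git fails
    )
    return summarize_numstat(result.stdout)


def format_totals(added, removed, total):
    """Format the totals the same way the awk printf does."""
    return f"added lines: {added}, removed lines: {removed}, total lines: {total}"
```

Inside a repository, print(format_totals(*git_line_totals())) would then print the summary line.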

Upvotes: 0

slartar

Reputation: 76

Python script

(This section answers the question directly: it gives a Python script that does what the asker wants. However, a shell script may be more appropriate; see the "Shell script" section for more on this.)

I took your script and made two modifications to get a working version:

  • I replaced the double quotes around awk's only argument with escaped single quotes, \'.
  • I replaced \n, which Python turns into a literal newline character, with the escaped form \\n, so that awk receives the two characters \n.
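The effect of the second change can be checked in isolation. The sketch below uses a shortened printf fragment rather than the full command (the variable names are only for illustration) and shows that Python expands \n in an ordinary string literal before awk ever sees it, which is what produces the "unterminated string" error:

```python
# Ordinary literal: Python converts \n into a real newline character,
# so awk would receive a string constant split across two lines.
broken = 'printf "total lines: %s\n", loc'

# Escaped backslash: awk receives the two characters \ and n.
fixed = 'printf "total lines: %s\\n", loc'

# A raw string literal is another way to keep the backslash as written.
raw = r'printf "total lines: %s\n", loc'

print(repr(broken))
print(repr(fixed))
print(fixed == raw)
```

A raw string is an alternative for this fragment, though it cannot contain \' as an escaped quote, so the full command would need different outer quoting.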

Here is an example of output showing the modified script working:

$ git log  --pretty=tformat: --numstat | awk '{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }'
added lines: 5, removed lines: 1, total lines: 4

$ cat script.py
import os
GitCommand = 'git log  --pretty=tformat: --numstat | awk \'{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\\n", add, subs, loc }\''
report = os.system(GitCommand)

$ python3 script.py
added lines: 5, removed lines: 1, total lines: 4

The following shows a diff of the Python script, from the question's version to this answer's working version:

$ git diff head~ head --word-diff-regex=. script.py
diff --git a/script.py b/script.py
[...]
--- a/script.py
+++ b/script.py
@@ -1,3 +1,3 @@
import os
GitCommand = 'git log  --pretty=tformat: --numstat | awk [-"-]{+\'+}{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\{+\+}n", add, subs, loc }[-"-]{+\'+}'
report = os.system(GitCommand)

Shell script

As mentioned elsewhere on this page, a shell script file may be the most appropriate way to invoke a shell command simply and repeatedly when, for whatever reason, you do not want to store it in something like ~/.bashrc, ~/.bash_profile, or similar.

Specifically for this question, here's an example:

$ git log  --pretty=tformat: --numstat | awk '{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }'
added lines: 5, removed lines: 1, total lines: 4

$ cat ./total-lines.sh
git log --pretty=tformat: --numstat | awk '{ add += $1; subs += $2; loc += $1 - $2 } END { printf "added lines: %s, removed lines: %s, total lines: %s\n", add, subs, loc }'

$ ./total-lines.sh
added lines: 5, removed lines: 1, total lines: 4

My Environment

$ systeminfo | grep --extended-regexp --regexp="^OS (Name|Version)"
OS Name:                   Microsoft Windows 10 Pro
OS Version:                10.0.19043 N/A Build 19043

$ bash --version | head --lines=1
GNU bash, version 4.4.23(1)-release (x86_64-pc-msys)

$ git --version
git version 2.33.0.windows.2

$ python3 --version
Python 3.9.7

$ awk --version | head --lines=1
GNU Awk 5.0.0, API: 2.0 (GNU MPFR 4.1.0, GNU MP 6.2.1)

Upvotes: 2

Code-Apprentice

Reputation: 83527

Using Python here adds an unnecessary layer of complexity. The easiest solution here is to create a file my_fancy_git_command.sh and copy the bash code into it. Now you can run the entire command by using the name of the script.

If you want to be able to run this script from multiple directories, I suggest creating a bin directory in your user folder and adding $HOME/bin to PATH in .bashrc. Be sure to close your terminal and open a new one so that the change to PATH takes effect.

Upvotes: 2
