Cynthia

Reputation: 397

How to count lines of code in jupyter notebook

I'm working in a Jupyter (IPython) notebook that contains a lot of code, and I'm curious exactly how many lines of code the file has. It is hard to count by hand because the code is split across many different cells.

For anyone experienced with Jupyter Notebook: how do you count the total number of lines of code in a notebook file?

Thanks!

Edit: I've found a way to do this, although it is fairly roundabout: download the Jupyter notebook as a .py file, open that file in Xcode or whatever IDE you use, and read the line count there.
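For what it's worth, the counting step of that workaround can be scripted instead of done in an IDE. Below is a minimal sketch, assuming the notebook has already been exported to a .py file (for example with jupyter nbconvert --to script). The function name count_code_lines is made up for illustration. It skips blank lines and comment-only lines, which includes the # In[n]: cell markers the export adds. It does not parse the code, so a line inside a multi-line string that happens to start with # is skipped too.

```python
from pathlib import Path


def count_code_lines(script_path):
    """Count lines that are neither blank nor comment-only (hypothetical helper)."""
    count = 0
    for line in Path(script_path).read_text(encoding="utf-8").splitlines():
        stripped = line.strip()
        # Skip blank lines and comment-only lines such as "# In[1]:" markers.
        if stripped and not stripped.startswith("#"):
            count += 1
    return count
```

Calling count_code_lines("my_notebook.py") on the exported file then returns the count as an integer.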

Upvotes: 21

Views: 22623

Answers (3)

常耀耀

Reputation: 31

The answer from @Jessime Kirk is really good, but it seems to fail when the .ipynb file contains Chinese characters. So I adjusted the code to open the file explicitly as UTF-8, as below.

#!/usr/bin/env python

from json import load
from sys import argv

def loc(nb):
    with open(nb, encoding='utf-8') as data_file:
        cells = load(data_file)['cells']
        return sum(len(c['source']) for c in cells if c['cell_type'] == 'code')

def run(ipynb_files):
    return sum(loc(nb) for nb in ipynb_files)

if __name__ == '__main__':
    print(r"Counts the lines of code in .ipynb files.")
    print(r"usage: python countIpynbLine.py xxx.ipynb")
    print(r"example: python countIpynbLine.py .\test_folder\test.ipynb")
    print(r"Several notebooks can be counted at once.")
    print(r"usage: python countIpynbLine.py code_1.ipynb code_2.ipynb")
    print(r"Counting lines...")
    print(run(argv[1:]))

Upvotes: 3

Kirill Voronin

Reputation: 166

The same can be done from the shell if you have the jq utility installed:

jq '.cells[] | select(.cell_type == "code") .source[]' nb1.ipynb nb2.ipynb | wc -l

Also, you can use grep to filter lines further. jq prints each source line as a JSON string, so a blank line shows up as "\n"; to drop those, invert the match with -v: | grep -v -e '^"\\n"$' | wc -l
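If jq is not available, the same blank-line filtering can be done in Python. This is only a sketch: the function name count_nonblank_code_lines is invented here, and it assumes an nbformat 4 notebook in which each cell's source is stored as a list of strings (the form Jupyter normally writes to disk).

```python
import json


def count_nonblank_code_lines(notebook_path):
    """Count non-blank source lines in code cells (hypothetical helper)."""
    with open(notebook_path, encoding="utf-8") as f:
        cells = json.load(f)["cells"]
    return sum(
        1
        for cell in cells
        if cell["cell_type"] == "code"
        for line in cell["source"]
        if line.strip()  # whitespace-only lines count as blank
    )
```

Unlike the grep pattern, line.strip() also treats lines containing only spaces or tabs as blank.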

Upvotes: 6

Jessime Kirk

Reputation: 694

This will give you the total number of LOC in one or more notebooks that you pass to the script via the command-line:

#!/usr/bin/env python

from json import load
from sys import argv

def loc(nb):
    cells = load(open(nb))['cells']
    return sum(len(c['source']) for c in cells if c['cell_type'] == 'code')

def run(ipynb_files):
    return sum(loc(nb) for nb in ipynb_files)

if __name__ == '__main__':
    print(run(argv[1:]))

So, after making the script executable (chmod +x loc.py), you could do something like $ ./loc.py nb1.ipynb nb2.ipynb to get the total.
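One caveat with len(c['source']): the notebook format allows a cell's source to be either a list of strings or a single multi-line string, and for a single string len() would count characters rather than lines. Here is a hedged variant that normalizes both forms before counting. The helper name source_lines is made up for this sketch, and it assumes an nbformat 4 file with a top-level cells key.

```python
import json


def source_lines(cell):
    """Return a cell's source as a list of lines (hypothetical helper)."""
    src = cell.get("source", "")
    if isinstance(src, list):
        # The list form stores one string per line, newlines included.
        src = "".join(src)
    return src.splitlines()


def loc(notebook_path):
    """Total number of source lines across all code cells of one notebook."""
    with open(notebook_path, encoding="utf-8") as f:
        cells = json.load(f)["cells"]
    return sum(len(source_lines(c)) for c in cells if c["cell_type"] == "code")
```

For notebooks saved by Jupyter itself this should give the same number as the script above, since those files use the list form.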

Upvotes: 30
