Reputation: 397
I'm currently using a Jupyter (IPython) notebook, and the file I am working with has a lot of code. I am curious exactly how many lines of code there are in my file. It is hard to count because I have split my code across many different cells.
For anyone who is experienced with Jupyter notebooks, how do you count the total number of lines of code in the file?
Thanks!
Edit: I've figured out how to do this, although in a pretty obscure way. Here's how: download the jupyter notebook as a .py file, and then open the .py file in software like Xcode, or whatever IDE you use, and count the lines of code there.
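If you would rather not open an IDE just to read a line number, the exported .py file can be counted with a few lines of Python. This is only a sketch, and it rests on some assumptions: the names count_code_lines and count_file are made up for illustration, the exported script is assumed to be UTF-8, and "line of code" is taken to mean any line that is neither blank nor a pure comment (the export typically adds comment lines such as cell markers, which would otherwise inflate the count).

```python
from pathlib import Path


def count_code_lines(text):
    """Count lines that are neither blank nor pure comment lines."""
    count = 0
    for line in text.splitlines():
        stripped = line.strip()
        # Skip empty lines and lines that consist only of a comment.
        if stripped and not stripped.startswith("#"):
            count += 1
    return count


def count_file(path):
    """Read an exported script (assumed UTF-8) and count its code lines."""
    return count_code_lines(Path(path).read_text(encoding="utf-8"))
```

Calling count_file on the downloaded script returns the count. Lines that carry code followed by a trailing comment still count as code.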
Upvotes: 21
Views: 22623
Reputation: 31
The answer from @Jessime Kirk is really good, but it seems to fail when the .ipynb file contains Chinese characters. So I modified the code to open the file as UTF-8, as below.
#!/usr/bin/env python
from json import load
from sys import argv

def loc(nb):
    with open(nb, encoding='utf-8') as data_file:
        cells = load(data_file)['cells']
        return sum(len(c['source']) for c in cells if c['cell_type'] == 'code')

def run(ipynb_files):
    return sum(loc(nb) for nb in ipynb_files)

if __name__ == '__main__':
    print(r"This file can count the code lines number in .ipynb files.")
    print(r"usage:python countIpynbLine.py xxx.ipynb")
    print(r"example:python countIpynbLine.py .\test_folder\test.ipynb")
    print(r"it can also count multiple code.ipynb lines.")
    print(r"usage:python countIpynbLine.py code_1.ipynb code_2.ipynb")
    print(r"start to count line number")
    print(run(argv[1:]))
Upvotes: 3
Reputation: 166
The same can be done from the shell if you have the jq utility installed:
jq '.cells[] | select(.cell_type == "code") .source[]' nb1.ipynb nb2.ipynb | wc -l
Also, you can use grep to filter the lines further, e.g. to remove blank lines (note the -v, which inverts the match so that blank lines are dropped rather than kept):
| grep -v -e ^\"\\\\n\"$ | wc -l
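For comparison, the same idea (select code cells, drop blank lines, count what is left) can be sketched in Python without jq. This is an illustration under assumptions, not a definitive tool: the function names code_lines, count_nonblank and count_file are invented here, the notebook is assumed to be in the usual JSON layout with a top-level "cells" list, and a "blank" line is taken to mean one containing only whitespace. The sketch also accepts a cell whose source is stored as a single string rather than a list of lines.

```python
import json


def code_lines(notebook):
    """Yield every source line of every code cell in a parsed notebook."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        # The source may be one multi-line string or a list of lines.
        if isinstance(source, str):
            source = source.splitlines()
        yield from source


def count_nonblank(notebook):
    """Count code-cell lines that contain something other than whitespace."""
    return sum(1 for line in code_lines(notebook) if line.strip())


def count_file(path):
    """Load a notebook file (assumed UTF-8 JSON) and count non-blank code lines."""
    with open(path, encoding="utf-8") as f:
        return count_nonblank(json.load(f))
```

Summing count_file over several notebook paths gives a total comparable to passing several files to jq.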
Upvotes: 6
Reputation: 694
This will give you the total number of LOC in one or more notebooks that you pass to the script via the command-line:
#!/usr/bin/env python
from json import load
from sys import argv

def loc(nb):
    cells = load(open(nb))['cells']
    return sum(len(c['source']) for c in cells if c['cell_type'] == 'code')

def run(ipynb_files):
    return sum(loc(nb) for nb in ipynb_files)

if __name__ == '__main__':
    print(run(argv[1:]))
So you could do something like
$ ./loc.py nb1.ipynb nb2.ipynb
to get results.
Upvotes: 30