Reputation: 18564
I'm trying to insert some import lines into a Python source file, but I would ideally like to place them right after the initial docstring. Let's say I load the file into the lines variable like this:
lines = open('filename.py').readlines()
How can I find the line number where the docstring ends?
Upvotes: 2
Views: 1290
Reputation: 31908
This is a function based on Brian's brilliant answer you can use to split a file into docstring and code:
def split_docstring_and_code(infile):
    import tokenize
    insert_index = None
    with open(infile) as f:
        for tok, text, (srow, scol), (erow, ecol), l in tokenize.generate_tokens(f.readline):
            if tok in (tokenize.COMMENT, tokenize.NL):
                continue  # Skip comments and the NL tokens that follow them
            elif tok == tokenize.STRING:
                insert_index = erow, ecol
                break
            else:
                break  # No docstring found
    with open(infile) as f:
        lines = f.readlines()
    if insert_index is not None:
        erow = insert_index[0]  # 1-based row of the docstring's last line
        return "".join(lines[:erow]), "".join(lines[erow:])
    else:
        return "", "".join(lines)
It assumes that the line that ends the docstring does not contain additional code past the closing delimiter of the string.
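The assumption about trailing code can be relaxed by splitting at the exact column where the string token ends, not only at its row. The following is a minimal sketch of that idea, not a drop-in replacement: `split_at_docstring_end` is a hypothetical name, and it takes the source text as a string instead of a file name so it is easy to try out.

```python
import io
import tokenize


def split_at_docstring_end(source):
    """Split source text into (docstring part, remaining code).

    Hypothetical helper: splits at the exact (row, column) where the
    docstring token ends, so code following it on the same line is kept
    in the second part.
    """
    end = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.COMMENT, tokenize.NL):
            continue  # comments and blank lines may precede the docstring
        if tok.type == tokenize.STRING:
            end = tok.end  # (row, col); row is 1-based
        break  # the first real token decides whether there is a docstring
    if end is None:
        return "", source
    lines = source.splitlines(keepends=True)
    erow, ecol = end
    head = "".join(lines[:erow - 1]) + lines[erow - 1][:ecol]
    tail = lines[erow - 1][ecol:] + "".join(lines[erow:])
    return head, tail


head, tail = split_at_docstring_end('"""Doc."""; x = 1\ny = 2\n')
```

Here `head` holds the docstring up to its closing quotes and `tail` starts with `; x = 1`, so nothing after the closing delimiter is lost.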
Upvotes: 0
Reputation: 119211
Rather than using a regex or relying on specific formatting, you could use Python's tokenize module.
import tokenize
insert_index = None
with open(filename) as f:
    for tok, text, (srow, scol), (erow, ecol), l in tokenize.generate_tokens(f.readline):
        if tok in (tokenize.COMMENT, tokenize.NL):
            continue  # Skip comments and the NL tokens that follow them
        elif tok == tokenize.STRING:
            insert_index = erow, ecol
            break
        else:
            break  # No docstring found
This way you can even handle pathological cases like:
# Comment
# """Not the real docstring"""
' this is the module\'s \
docstring, containing:\
""" and having code on the same line following it:'; this_is_code=42
exactly as Python would handle them.
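To make this concrete, here is a small sketch that runs the token scan on the pathological module shown above and then inserts an import line after the docstring, which is what the question asks for. The helper name `docstring_end` and the inserted `import os` line are illustrative assumptions. The sketch also skips `tokenize.NL` tokens, since the tokenizer emits one after every comment-only or blank line.

```python
import io
import tokenize


def docstring_end(source):
    """Return (row, col) where the module docstring ends, or None.

    Hypothetical helper; rows are 1-based, as reported by tokenize.
    """
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.COMMENT, tokenize.NL):
            continue  # skip leading comments and the NL tokens after them
        if tok.type == tokenize.STRING:
            return tok.end
        return None  # first real token is not a string: no docstring
    return None


# The pathological module from above, written out as a Python string.
source = (
    "# Comment\n"
    "# \"\"\"Not the real docstring\"\"\"\n"
    "' this is the module\\'s \\\n"
    "docstring, containing:\\\n"
    "\"\"\" and having code on the same line following it:'; this_is_code=42\n"
)

lines = source.splitlines(keepends=True)
row, col = docstring_end(source)
# Rows are 1-based, so index `row` in the list is the first line
# after the one on which the docstring ends.
lines.insert(row, "import os\n")
```

Because the insert happens after the whole line, the trailing `this_is_code=42` stays in front of the new import. Splitting at `col` would be needed only if the import had to come directly after the closing quote.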
Upvotes: 11
Reputation: 200756
If you're using the standard docstring format, you can do something like this:
count = 0
for line in lines:
    if count < 2:
        # Before or inside the docstring (including its closing line)
        if line.startswith('"""'):
            count += 1
        continue
    # Line is after docstring
Might need some adaptation for files with no docstrings, but if your files are formatted consistently it should be easy enough.
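One possible adaptation for files without a docstring is sketched below. It is still only a heuristic: the hypothetical helper `docstring_end_index` assumes the docstring uses triple double quotes and opens on the first line that is neither blank nor a comment, and it returns None when that is not the case.

```python
def docstring_end_index(lines):
    """Return the index of the first line after the docstring, or None.

    Heuristic only: assumes a docstring delimited by triple double quotes
    that opens at the start of the first non-blank, non-comment line.
    """
    for i, line in enumerate(lines):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments before the docstring
        if not stripped.startswith('"""'):
            return None  # first real line is code: no docstring
        # One-line docstring: opening and closing quotes on the same line.
        if len(stripped) >= 6 and stripped.endswith('"""'):
            return i + 1
        # Multi-line docstring: look for the line that closes it.
        for j in range(i + 1, len(lines)):
            if '"""' in lines[j]:
                return j + 1
        return None  # unterminated docstring; treat as not found
    return None


lines = ['"""Doc.\n', '\n', 'More.\n', '"""\n', 'import sys\n']
index = docstring_end_index(lines)
if index is not None:
    lines.insert(index, "import os\n")
```

Returning None leaves the decision about where to put the imports in a docstring-less file to the caller, for example after a shebang or encoding comment.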
Upvotes: 2