NathanaëlBeau

Reputation: 1152

Extract multiple variable names in docstrings

I'm trying to extract all the variable names from Python docstrings. For instance, the docstring has the following form:

Scans through a string for substrings matched some patterns (first-subgroups only).

Args:
    text: A string to be scanned.
    patterns: Arbitrary number of regex patterns.

Returns:
    When only one pattern is given, returns a string (None if no match found).
    When more than one pattern are given, returns a list of strings ([] if no match found).

I would like to extract both text and patterns with regex.

I tried the following code, which uses a regular expression to find every element that follows a line break and ends with a colon:

import re

string = """Args:
    text: A string to be scanned.
    patterns: Arbitrary number of regex patterns."""
print(re.findall('Args:[\r\n]+(.+?):', string))

But this regular expression captures nothing. What am I doing wrong?

Upvotes: 1

Views: 332

Answers (1)

Vishnudev Krishnadas

Reputation: 10960

I would use docstring-parser rather than reinventing the wheel. It supports Google-, reST-, and Numpydoc-style docstrings.

from docstring_parser import parse

s = """
Scans through a string for substrings matched some patterns (first-subgroups only).

Args:
    text: A string to be scanned.
    patterns: Arbitrary number of regex patterns.

Returns:
    When only one pattern is given, returns a string (None if no match found).
    When more than one pattern are given, returns a list of strings ([] if no match found).
"""
doc_str = parse(s)
print([param.arg_name for param in doc_str.params])

Output

['text', 'patterns']
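
If adding a dependency is not an option, a regex-only approach is also possible. The following is a minimal sketch, not a robust parser. It assumes Google-style docstrings in which the Args: section ends at the first blank line (or at the end of the string) and each argument starts its own indented line. The helper name extract_arg_names is made up for illustration. It works in two steps because a single findall pattern that starts with Args: can match at most once, since Args: appears only once in the docstring.

```python
import re


def extract_arg_names(docstring):
    """Return the names listed under 'Args:' (Google style assumed)."""
    # Step 1: isolate the Args block, i.e. everything after 'Args:' up to
    # the first blank line or the end of the string.
    section = re.search(
        r"Args:[ \t]*\r?\n(.*?)(?:\r?\n[ \t]*\r?\n|\Z)",
        docstring,
        re.DOTALL,
    )
    if section is None:
        return []
    # Step 2: on each indented line, capture the leading identifier that is
    # followed by ':' (an optional '(type)' annotation is skipped).
    return re.findall(
        r"^[ \t]+(\*{0,2}\w+)[ \t]*(?:\([^)]*\))?:",
        section.group(1),
        re.MULTILINE,
    )


s = """
Scans through a string for substrings matched some patterns (first-subgroups only).

Args:
    text: A string to be scanned.
    patterns: Arbitrary number of regex patterns.

Returns:
    When only one pattern is given, returns a string (None if no match found).
"""
print(extract_arg_names(s))  # ['text', 'patterns']
```

One limitation of this sketch: a wrapped description line that happens to begin with a word followed by a colon would be misread as an argument name, which is one reason a dedicated parser is the safer choice.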

Upvotes: 3
