[PYTHON] Complement argparse from docstrings

Introduction

When building a command-line tool in Python, click is a convenient argument parser. For more complicated cases, however, I find argparse easier to use. The catch is that argparse requires a description and help text for every subcommand and argument. In most cases that information duplicates the docstrings, so you end up writing it twice. This post looks at how to pass that information to argparse from the docstrings instead.

Docstrings are assumed to follow the Google Python Style Guide.

Program-wide description

This is the description of the command itself, that is, the description argument of argparse.ArgumentParser. It belongs in the docstring of the module that contains the main function, so the ArgumentParser can be created like this:

parser = argparse.ArgumentParser(
  description=__doc__, formatter_class=argparse.RawTextHelpFormatter)

Note that argparse.RawTextHelpFormatter is passed as formatter_class. This keeps argparse from removing line breaks and spaces from the description and help texts that are set from here on.
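To see what the formatter changes, here is a minimal sketch that builds the same parser twice and compares the help text. The `DOC` string stands in for a module docstring (`__doc__`), and the `tool` program name and `build` helper are made up for illustration.

```python
import argparse

# Stand-in for the module docstring (__doc__) of a hypothetical tool.
DOC = """Convert files.

Supported modes:
  fast   skip validation
  safe   validate every record
"""


def build(formatter_class):
    """Build a parser that differs only in its formatter class."""
    return argparse.ArgumentParser(
        prog="tool", description=DOC, formatter_class=formatter_class)


# RawTextHelpFormatter keeps the description exactly as written.
raw_help = build(argparse.RawTextHelpFormatter).format_help()
# The default HelpFormatter collapses whitespace and re-wraps the text.
default_help = build(argparse.HelpFormatter).format_help()

print(raw_help)
print(default_help)
```

With the default formatter, the aligned mode list is folded into a single paragraph, which is why the raw formatter matters once docstrings become the source of help text.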

Subcommand description

Assume that each subcommand is tied to the function that executes it. That is, the code contains at least something like:

a1_cmd = subparsers.add_parser("a1")
a1_cmd.set_defaults(cmd=a1)

Given the function that implements the subcommand, here `a1`, you should then be able to write:

a1_cmd = subparsers.add_parser(
  a1.__name__, help=a1.__doc__, description=a1.__doc__)
a1_cmd.set_defaults(cmd=a1)

As written, however, help and description both receive the whole docstring, including the argument documentation. So prepare a headline function that returns only the first line of the docstring, and a description function that returns the text up to the argument documentation (in Google style, up to the first of `Args:`, `Returns:`, `Raises:`, or `Yields:`), and write:

a1_cmd = subparsers.add_parser(
  a1.__name__,
  help=headline(a1.__doc__),
  description=description(a1.__doc__))
a1_cmd.set_defaults(cmd=a1)

(A subcommand is unlikely to have a Yields section, but it is included for completeness.) The headline and description functions look like this:

import itertools
import textwrap

def headline(text):
    """Returns the first line of a given text."""
    return text.split("\n")[0]

keywords = ("Args:", "Returns:", "Raises:", "Yields:")

def checker(v):
    """Returns True if a given line does not start with any of the keywords."""
    return not v.strip().startswith(keywords)

def description(text):
    """Returns the text before `Args:`, `Returns:`, `Raises:`, or `Yields:`."""
    lines = list(itertools.takewhile(checker, text.split("\n")))
    if len(lines) < 3:
        return lines[0]
    # lines[1] is the blank line that follows the headline.
    return "{0}\n\n{1}".format(lines[0], textwrap.dedent("\n".join(lines[2:])))
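As a rough illustration of where the two texts end up, the sketch below registers one subcommand and dispatches to it through the `cmd` default. The `a1` body, the `tool` program name, and the `first_line` helper (a compact stand-in for the headline function) are assumptions made for this example only.

```python
import argparse


def a1(name):
    """Greet someone.

    Prints a short greeting for the given name.

    Args:
      name: who to greet.
    """
    return "hello, {0}".format(name)


def first_line(doc):
    """Hypothetical stand-in for headline(): the first line of a docstring."""
    return doc.strip().split("\n")[0]


parser = argparse.ArgumentParser(prog="tool")
subparsers = parser.add_subparsers()

# help= appears in the parent's subcommand list; description= appears at the
# top of the subcommand's own help output.
a1_cmd = subparsers.add_parser(
    a1.__name__,
    help=first_line(a1.__doc__),
    description=first_line(a1.__doc__))
a1_cmd.add_argument("name")
a1_cmd.set_defaults(cmd=a1)

# Dispatch: the function stored by set_defaults is available as args.cmd.
args = parser.parse_args(["a1", "world"])
result = args.cmd(args.name)
print(result)
```

Because the function object itself is stored under `cmd`, the dispatch code stays the same no matter how many subcommands are registered.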

Help for each argument

Next is the help argument, as in `a1_cmd.add_argument("name", help="description of this argument.")`. This text is probably already written in the docstring of the function that implements the subcommand. In Google style it appears after `Args:` in a form close to a dictionary. So extract the argument documentation from the docstring and pass each entry as the help argument when the corresponding argument is added with add_argument.

Since this requires parsing the docstring in more detail than headline and description above, combine everything into one set of functions:

import itertools
import textwrap

_KEYWORDS_ARGS = ("Args:",)
_KEYWORDS_OTHERS = ("Returns:", "Raises:", "Yields:")
_KEYWORDS = _KEYWORDS_ARGS + _KEYWORDS_OTHERS

def _checker(keywords):
    """Generate a checker that tests a line does not start with keywords."""
    def _(v):
        """Return True if a given line does not start with any keyword."""
        return not v.strip().startswith(keywords)
    return _

def parse_doc(doc):
    """Parse a docstring.

    Parse a docstring and extract three components; headline, description,
    and map of arguments to help texts.

    Args:
      doc: docstring.

    Returns:
      a dictionary.
    """
    lines = doc.split("\n")
    descriptions = list(itertools.takewhile(_checker(_KEYWORDS), lines))

    if len(descriptions) < 3:
        description = lines[0]
    else:
        description = "{0}\n\n{1}".format(
            lines[0], textwrap.dedent("\n".join(descriptions[2:])))

    args = list(itertools.takewhile(
        _checker(_KEYWORDS_OTHERS),
        itertools.dropwhile(_checker(_KEYWORDS_ARGS), lines)))
    argmap = {}
    if len(args) > 1:
        for pair in args[1:]:
            # Split only on the first colon so help text may contain colons.
            kv = [v.strip() for v in pair.split(":", 1)]
            if len(kv) >= 2:
                argmap[kv[0]] = kv[1]

    return dict(headline=descriptions[0], description=description, args=argmap)

Passing a docstring to parse_doc returns a dictionary with the headline, the description, and the args map. The first two are used as follows:

a1_doc = parse_doc(a1.__doc__)
a1_cmd = subparsers.add_parser(
  a1.__name__,
  help=a1_doc["headline"],
  description=a1_doc["description"])
a1_cmd.set_defaults(cmd=a1)

The args map is used like this:

a1_cmd.add_argument("name", help=a1_doc["args"]["name"])
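Putting the argument part together, here is a self-contained sketch that reads the `Args:` section of a function's docstring and feeds it to add_argument. The `arg_help` function is a hypothetical, condensed stand-in for the args part of parse_doc, and the `a1` function and its `count` option are invented for the example.

```python
import argparse
import itertools


def a1(name, count):
    """Greet someone.

    Args:
      name: who to greet.
      count: how many times to greet.
    """
    return " ".join(["hello, {0}".format(name)] * count)


def arg_help(doc):
    """Hypothetical, condensed stand-in for parse_doc(doc)["args"]."""
    lines = [line.strip() for line in doc.split("\n")]
    # Skip everything before "Args:", then stop at the next section keyword.
    body = itertools.dropwhile(lambda s: s != "Args:", lines)
    section = itertools.takewhile(
        lambda s: s not in ("Returns:", "Raises:", "Yields:"), body)
    # Drop the "Args:" line itself and split each entry on its first colon.
    pairs = (s.split(":", 1) for s in list(section)[1:] if ":" in s)
    return {k.strip(): v.strip() for k, v in pairs}


helps = arg_help(a1.__doc__)

parser = argparse.ArgumentParser(prog="tool")
parser.add_argument("name", help=helps["name"])
parser.add_argument("--count", type=int, default=1, help=helps["count"])

help_text = parser.format_help()
print(help_text)
```

Note that this simple line-by-line approach assumes each argument is documented on a single line; help text that continues on following lines would need extra handling.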

Summary and library

With the above, description and help can be filled in from docstrings instead of being written out every time. Preparing these functions for every command-line application is still tedious, so I have packaged them as the library dsargparse. It implements the above as a wrapper around argparse, so the user has to write almost nothing extra. See the sample.
