@docstrings: You have no class.

If you have written any Python code in a shared project recently, you have probably seen a documentation convention like this:

def complex(real=0.0, imag=0.0):
  """Form a complex number.

  @param real: The real part (default 0.0)
  @param imag: The imaginary part (default 0.0)

  @returns: ComplexNumber object.
  """
  if imag == 0.0 and real == 0.0: return complex_zero

This is a good and useful convention for explaining things to future users of the code, if a little verbose. However, you are more likely to have seen class-based code, and there it is not used very well at all. For example:

class CompetitionBasket(FruitBasket):
  """Fruit basket that is entered into a scored competition.

  @param fruits: A dict of fruit names and quantities
  @param scores: A dict of fruit names and scores-per-fruit
  """

  def __init__(self, fruits, scores):
    self.scores = scores
    super(CompetitionBasket, self).__init__(fruits)

  def score(self, relevant_fruits=[]):
    """Return the score of the basket according to the current rules.

    @param relevant_fruits: A list of fruit names corresponding to
    the fruits which are currently under consideration. Defaults to an
    empty list and scores all fruit.

    @returns: Integer score of the basket.
    """

At first glance, the docstring for score seems to follow the same principles. In fact it is missing important information, which in a larger class in a complex system would be critical. The method cannot work without both self.fruits and self.scores, yet neither is mentioned.

Documenting only the explicit parameters does have advantages. It is fairly easy to verify programmatically that every parameter and return value of a function has a non-empty docstring entry, and significantly harder to verify the same for every non-trivial instance attribute a method uses or every value it mutates through side effects. Deciding which of those values need documenting involves far more judgement calls, and it’s plausible that raising the bar for “docstring required” to include them would result in the requirement being flouted more often for other methods.
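To make the "easy to verify" half concrete, here is a minimal sketch of such a check. The helper name undocumented_params and the two sample functions are made up for illustration, and the regular expression assumes the @param style used in this post. Note that the score docstring passes even though it says nothing about self.fruits or self.scores.

```python
import inspect
import re


def undocumented_params(func):
    """Return the explicit parameters of func that have no @param entry."""
    doc = inspect.getdoc(func) or ""
    # Collect every name that appears as "@param name:" in the docstring.
    documented = set(re.findall(r"@param\s+(\w+)\s*:", doc))
    explicit = [
        name for name in inspect.signature(func).parameters
        if name not in ("self", "cls")
    ]
    return [name for name in explicit if name not in documented]


def score(self, relevant_fruits=None):
    """Return the score of the basket according to the current rules.

    @param relevant_fruits: Fruit names under consideration.
    @returns: Integer score of the basket.
    """


def init_without_docs(self, fruits, scores):
    """Store the fruits and scores."""


print(undocumented_params(score))              # []
print(undocumented_params(init_without_docs))  # ['fruits', 'scores']
```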

But to consider this and stop is an instance of Goodhart’s Law: the easily checked measure has replaced the goal it was meant to serve. The difficulty of verification is an argument against mandating these entries, not an argument against including them wherever possible. For all the reasons we want docstrings (clarity of purpose, maintainability, etc.), we should document these values too. In some cases this could result in a docstring 20 lines long, which is clearly a problem. In those cases, however, I propose that the real problem is a single method that implicitly takes more than a dozen arguments: the object-oriented design has concealed the fact that the method is unwieldy and unmaintainable, and forcing this docstring convention on it brings that fact back into the open.
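One rough way to see how many arguments a method implicitly takes is to list the attributes of self that it reads. The sketch below does this with the standard ast module. The function name implicit_arguments is hypothetical, and the body given to score is an assumption, since the post never shows one. It is a crude count: method calls such as self.helper() are reported too, and attributes reached indirectly are missed.

```python
import ast

# Assumed implementation of score, written out only so there is a body to inspect.
SOURCE = '''
class CompetitionBasket:
    def score(self, relevant_fruits=None):
        names = relevant_fruits or list(self.fruits)
        return sum(self.fruits[n] * self.scores[n] for n in names)
'''


def implicit_arguments(source, class_name, method_name):
    """Return sorted names of self attributes read inside one method."""
    tree = ast.parse(source)
    for cls in ast.walk(tree):
        if isinstance(cls, ast.ClassDef) and cls.name == class_name:
            for item in cls.body:
                if isinstance(item, ast.FunctionDef) and item.name == method_name:
                    # Keep only "self.<attr>" expressions that are read, not assigned.
                    return sorted({
                        node.attr
                        for node in ast.walk(item)
                        if isinstance(node, ast.Attribute)
                        and isinstance(node.value, ast.Name)
                        and node.value.id == "self"
                        and isinstance(node.ctx, ast.Load)
                    })
    raise LookupError(f"{class_name}.{method_name} not found")


print(implicit_arguments(SOURCE, "CompetitionBasket", "score"))  # ['fruits', 'scores']
```

A method for which this list runs past a dozen names is a candidate for splitting up, whatever its docstring says.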

I would suggest this naming convention:

class Fnord(object):
    def methodName(self, foos):
        """Frobozz the foos according to the Fnord's bazzes.

        @param foos: a list containing Foo instances to frobozz
        @instance_param bazzes: Baz instances containing rules for frobozzing
            for this Fnord
        @class_param quux: Number of times Fnords frobozz each foo
        """
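Applied to the earlier example, the convention might look like the sketch below. The FruitBasket stand-in and the scoring rule (quantity times score-per-fruit, summed) are my assumptions, since neither appears above, and I have used None instead of a mutable default list.

```python
class FruitBasket(object):
    """Minimal stand-in for the base class, which is not shown in the post."""

    def __init__(self, fruits):
        self.fruits = fruits


class CompetitionBasket(FruitBasket):
    """Fruit basket that is entered into a scored competition."""

    def __init__(self, fruits, scores):
        self.scores = scores
        super(CompetitionBasket, self).__init__(fruits)

    def score(self, relevant_fruits=None):
        """Return the score of the basket according to the current rules.

        @param relevant_fruits: A list of fruit names under consideration.
            Defaults to None, which scores all fruit.
        @instance_param fruits: A dict of fruit names and quantities
        @instance_param scores: A dict of fruit names and scores-per-fruit

        @returns: Integer score of the basket.
        """
        # Assumed rule: quantity times score-per-fruit, summed over the names.
        names = relevant_fruits or self.fruits
        return sum(self.fruits[name] * self.scores[name] for name in names)


basket = CompetitionBasket({"apple": 3, "pear": 2}, {"apple": 5, "pear": 1})
print(basket.score())          # 17
print(basket.score(["pear"]))  # 2
```

Someone debugging score now knows from the docstring alone that a wrong result may come from self.fruits or self.scores as well as from the argument.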


2 thoughts on “@docstrings: You have no class.”

  1. I don’t understand this. In practice, docstrings are primarily written and consumed as “user’s guides” or “API documentation” — they describe how to operate an interface from the outside (the explicit arguments) and how it ought to behave, not how it is implemented internally (which ought to be able to change without the user/client noticing or caring).

    Much of the point of OOP is providing these sorts of simplified or abstracted interfaces so that you can make a loosely coupled system whose components have limited, well-defined and stable expectations of one another. These goals create a clear conceptual division between method arguments (which are my contract with the caller) and internal state (which the caller should not have to care about as long as I implement what I claim to implement).

    It sounds like you are thinking in functional programming terms and wanting referential transparency. IMO, we always have the trivial kind of referential transparency that comes from thinking of each method as taking `self` as one of its arguments, and if this is not good enough then maybe the use case needs more functional / less object-oriented design. In good OOP use cases, it tends to be an active advantage to bundle a bunch of state inside the notation `self` and not have to think about all of it when you pass it around.


    • >In practice, docstrings are primarily written and consumed as “user’s guides” or “API documentation”

      I agree. But they’re piss-poor user’s guides when they don’t make clear which parts of the object’s instance variables are critical to the functioning of the method. When debugging, you effectively have no idea what’s going on.

