Best practices in software engineering

Documentation

Make sure you have an environment where you can write Python code and run terminal commands such as we set up with JupyterLab in the last section.

In previous sessions you have learned how to package code into functions and to package functions into modules. Functions and modules let you design, write and package your code so that it is easy to understand and easily reusable. However, to share the code, and allow users to really understand how it works, you need to add documentation.

You can access the documentation for any object using the Python help function or using ? in the interactive Python console. For example, lets look at the documentation for the print function that we have used many times. Go to the Console and run:

In [1]:
print?

it should return something that looks like:

Docstring:
print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

Prints the values to a stream, or to sys.stdout by default.
Optional keyword arguments:
file:  a file-like object (stream); defaults to the current sys.stdout.
sep:   string inserted between values, default a space.
end:   string appended after the last value, default a newline.
flush: whether to forcibly flush the stream.
Type:      builtin_function_or_method

This docstring as it calls it (for documentation string) is a human-written piece of text which is there to help you, the programmer, know how to use the function.

The ? syntax is an IPython-specific thing but you can use the equivalent help function anywhere. If you run:

In [2]:
help(print)
Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.

then you should see a very similar output.

Throughout this chapter we will be learning how to make our own docstrings and how to create nice readable documentation web pages.

Documenting our own functions

Let's start by writing a simple function in a module by itself which we can import and use. To begin we'll explore this in the Python Console and then we'll move onto putting this code into a module. For this example we'll use the add_arrays function from previous courses. Type the following into the Python Console:

In [3]:
def add_arrays(x, y):
    z = []
    for x_elem, y_elem in zip(x, y):
        z.append(x_elem + y_elem)
    return z

To see what the documentation for this function is, we either type add_arrays? or:

In [4]:
help(add_arrays)
Help on function add_arrays in module __main__:

add_arrays(x, y)

By default, the only documentation available for a function is just a repeat of whatever we wrote on the def line, so we see the name of the function along with the parameters available for it.

If we want to give the user some more information, we can pass it is by putting a string as the first thing inside the function. By convention we use a triple-quoted string which both starts and ends with three " in a row as they allow you to have strings over multiple lines:

In [5]:
def add_arrays(x, y):
    """
    This function adds together each element of the two
    passed lists, returning the result in the returned list.
    """
    z = []
    for x_elem, y_elem in zip(x, y):
        z.append(x_elem + y_elem)
    return z

Now, when we ask for the documentation, we should see our docstring printed:

In [6]:
help(add_arrays)
Help on function add_arrays in module __main__:

add_arrays(x, y)
    This function adds together each element of the two
    passed lists, returning the result in the returned list.

You can write whatever text you like in the documentation string, the most important thing is that you give the users of your code the information they need. Useful information for a user of the function are things like:

  • What arguments it takes
  • What it returns
  • An example of how to use it

There are a number of different conventions of how to format documentation strings but a common one is the Google style which looks like:

In [7]:
def add_arrays(x, y):
    """
    This function adds together each element of the two passed lists.

    Args:
        x (list): The first list to add
        y (list): The second list to add

    Returns:
        list: the pairwise sums of ``x`` and ``y``.

    Examples:
        >>> add_arrays([1, 4, 5], [4, 3, 5])
        [5, 7, 10]
    """
    z = []
    for x_, y_ in zip(x, y):
        z.append(x_ + y_)

    return z

We can check that this works by again doing:

In [8]:
help(add_arrays)
Help on function add_arrays in module __main__:

add_arrays(x, y)
    This function adds together each element of the two passed lists.
    
    Args:
        x (list): The first list to add
        y (list): The second list to add
    
    Returns:
        list: the pairwise sums of ``x`` and ``y``.
    
    Examples:
        >>> add_arrays([1, 4, 5], [4, 3, 5])
        [5, 7, 10]

This is a lot more information and it might seem strange that the documentation is longer than the code it describes but it's very important that you give the user of your code all the information that they need in order to use it. Remember, your documentation is only going to be written once but it will be read many times so it's worth spending the time on it.

In this example we have given a short one-line description of what the function does. Then we explicitly listed all of the arguments to the function along with what type they expect. After that we specified the type and description of the return value. Finally, and importantly we give an example to the user of how the function can be called and the output that it will give. The >>> go in front of the line of calling Python code and the return value is on the line after.

You can find more examples of the Google documentation style in the official Sphinx documentation.

Documenting modules

As well as functions, we can document whole modules. To do this, we'll have to move our function into a file called arrays.py. From previous courses, you should remember that this will make a module called arrays which we can import.

To document a module you use the same triple-quoted string as in functions but this time it goes at the very top of the file.

In the file editor, make a file called arrays.py in the bestpractices folder and put the following in it:

arrays.py
"""
This module contains functions for manipulating and combining Python lists.
"""

def add_arrays(x, y):
    """
    This function adds together each element of the two passed lists.

    Args:
        x (list): The first list to add
        y (list): The second list to add

    Returns:
        list: the pairwise sums of ``x`` and ``y``.

    Examples:
        >>> add_arrays([1, 4, 5], [4, 3, 5])
        [5, 7, 10]
    """
    z = []
    for x_, y_ in zip(x, y):
        z.append(x_ + y_)

    return z

We can then import the module in the Python Console:

In [9]:
import arrays

If you get an import error, the Console may be looking in the wrong folder. Move to the correct folder using %cd bestpractices.

Once it is imported we can get the documentation for the function with:

In [10]:
help(arrays.add_arrays)
Help on function add_arrays in module arrays:

add_arrays(x, y)
    This function adds together each element of the two passed lists.
    
    Args:
        x (list): The first list to add
        y (list): The second list to add
    
    Returns:
        list: the pairwise sums of ``x`` and ``y``.
    
    Examples:
        >>> add_arrays([1, 4, 5], [4, 3, 5])
        [5, 7, 10]

But we can also get the documentation for the whole module with:

In [11]:
help(arrays)
Help on module arrays:

NAME
    arrays - This module contains functions for manipulating and combining Python lists.

FUNCTIONS
    add_arrays(x, y)
        This function adds together each element of the two passed lists.
        
        Args:
            x (list): The first list to add
            y (list): The second list to add
        
        Returns:
            list: the pairwise sums of ``x`` and ``y``.
        
        Examples:
            >>> add_arrays([1, 4, 5], [4, 3, 5])
            [5, 7, 10]

FILE
    /home/matt/projects/courses/software_engineering_best_practices/arrays.py


You'll see in this case that it's showing the overall module docstring as well as those for the functions inside it.

Exercise

Run the example code from the documentation in the Console. Make sure that you are seeing the same output as shown in the docs.

  • hint: remember that the add_arrays function is inside the arrays module so you either have to import it as from arrays import add_arrays or run it with arrays.add_arrays.

Documentation-driven-development

It's always worth writing some documentation for each of your functions but you can go a step further and use a method known as documentation-driven-development. In this model you write the function signature and documentation for the function first, before writing any of the code inside it. This encourages you to think ahead of time about exactly what your function will do, how it will be called by users and what you expect it to return.

Generating documentation web pages

As well as viewing your documentation in the Python Console, it's possible to automatically create web pages to share your documentation. This is not a necessary part of this course but if you are interested later, have a look at the appendix on the tool Sphinx.