Defensive Programming
Last updated on 2026-04-24 | Edit this page
Overview
Questions
- How can I catch errors in my data?
- How can I make best use of error messages?
Objectives
- Learn how to catch and deal with errors within your code.
- Learn how to catch problematic data before it is used in your code.
In this episode we are going to learn about error handling, a python structure (also common in other programming languages) which will help us to properly manage the anomalies that can be encountered when executing our code.
In science most developers write code for themselves and do not feel the need to use error handling. However, nowadays scientists have to share their work as more funding agencies are also asking for the code used in experiments to be published.
More importantly, applications are not expected to crash and we are going to learn how to avoid this non desirable behaviour.
The test assumes that val is a number, and throws an
uncontrolled error if it is not.
PYTHON
val = 'a'
if val > 0:
print(f'Value: {val} is positive.')
elif val == 0:
print(f'Value: {val} is zero.')
else:
print(f'Value: {val} is negative.')
OUTPUT
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-2-99c0e25bf5e9> in <module>()
1 def check_sign(val):
----> 2 if val > 0:
3 print(f'Value: {val} is positive.')
4 elif val == 0:
5 print(f'Value: {val} is zero.')
TypeError: '>' not supported between instances of 'str' and 'int'
We can avoid problems like this by wrapping our code in an
if statement. To make things simpler, we will first write
the test as a function:
PYTHON
def check_sign(val):
if val > 0:
print(f'Value: {val} is positive.')
elif val == 0:
print(f'Value: {val} is zero.')
else:
print(f'Value: {val} is negative.')
Then wrap the function call in an if statement:
PYTHON
if type(val) is int or type(val) is float:
check_sign(val)
else:
print('val is not a number')
However, this becomes tedious to implement very quickly, especially
when trying to cover a number of possible errors, which will all need
their own if clauses. This can lead to a lot of duplicated
code, making the code difficult to maintain.
Python provides the try-except structure to avoid this
issue, enabling developers to properly manage any possible errors. The
basic try-except structure is:
At the top of the statement is the code that we are interested in
executing, which is run in the try statement. If that fails
then the except statement comes into effect, (hopefully)
returning helpful information to the user about what happened and giving
them some guidance on how to avoid the problem in future.
Using try-except statements results in clearer, easier
to understand code by following the common Python coding style of EAFP (it’s
easier to ask for forgiveness than permission). This style shows the
code we want to execute first, assuming that the incoming data is
correct, before dealing with exceptions if the assumptions prove
false.
The except statement will catch all errors and so we do
not, initially at least, need to know exactly what errors we are trying
to avoid. However, python does provide error codes, which we can use to
expand the structure to capture specific error types. For the example
above, we would want to capture a TypeError:
PYTHON
try:
check_sign(val)
except TypeError as err:
print('Val is not a number')
print('But our code does not crash anymore')
print(f'The run-time error is: {err}')
As with if statements, multiple except
statements can be used, each with a different error test. These can be
followed by an (optional) catch-all bare except statement
(as we started with) to catch any unexpected errors. Note that only one
try statement is allowed in the structure.
try-except structures may also contain else
and finally statements (following in order after the
except statements). The else statement will be
executed if no except statements are executed, while the
finally statement will be executed after all other
statements are finished.
PYTHON
try:
check_sign(val)
reciprocal = 1/val
except TypeError as err:
print('Val is not a number')
print(f'The run-time error is: {err}')
except Exception as err:
print('Some error other than a TypeError occured')
print(f'The run-time error is: {err}')
else:
print(f'The reciprocal of the value = {reciprocal}')
finally:
print('release memory')
The typical use of the finally statement is to deal with
the release of external resources (such as files or network connections)
whether or not the attempted action has been successful.
More details about errors and exceptions can be found in the Python tutorials.
HTTP errors
Earlier we were introduced to the requests library. This
is used to access web content, and returns a code indicating the request
status. The raise_for_status function is provided for
testing the request status, which returns a
request.HTTPError if there is an error. For example, if we
request a missing webpage we get this error:
PYTHON
import requests
source_url='https://api.datacite.org/dois/10.48546/workflowhub.workflow.56.1000'
response = requests.get(source_url)
response.raise_for_status()
OUTPUT
Traceback (most recent call last):
File "<ipython-input-108-06c23faa067a>", line 3, in <module>
response.raise_for_status()
File "/Users/mbessdl2/anaconda3/envs/myenv/lib/python3.9/site-packages/requests/models.py", line 943, in raise_for_status
raise HTTPError(http_error_msg, response=self)
HTTPError: 404 Client Error: Not Found for url: https://api.datacite.org/dois/10.48546/workflowhub.workflow.56.1000
Write a try-except structure that uses
raise_for_status and deals with the error without
crashing.
PYTHON
import requests
try:
source_url='https://api.datacite.org/dois/10.48546/workflowhub.workflow.56.1000'
response = requests.get(source_url)
response.raise_for_status()
except requests.HTTPError as err:
print(err)
OUTPUT
404 Client Error: Not Found for url: https://api.datacite.org/dois/10.48546/workflowhub.workflow.56.1000
Assert
The try-except structure is useful for controlling small
chunks of code, where we might reasonably expect that the user can fix
the problem (where it is a program which takes a short time to run), or
where our program itself will fix the problem (such as substituting NaN
values for a failed calculation).
Sometimes, however, it is better to check the data before it is passed into the code and to trigger an early crash in the program, which returns a meaningful error. For example, this can be extremely useful in the case of large numerical calculations that take a long time to run and which will only use some of the provided variables after a significant amount of the computational work has already been carried out. In such situations your user will appreciate being warned early on that they have provided invalid data.
To conduct such a test we can use an assert statement.
This follows the structure:
assert <test>, <error message>. As an example
of this, we can write the test for non-numerical values as:
PYTHON
val = 'a'
assert type(val) is float or type(val) is int, 'Variable has to be a numerical object'
check_sign(val)
OUTPUT
Traceback (most recent call last):
File "<ipython-input-38-52d27526daab>", line 1, in <module>
assert type(val) is float or type(val) is int, "Variable has to be a numerical object"
AssertionError: Variable has to be a numerical object
Like the earlier if statement, this catches the
problematic variable before it reaches our calculation. However it is
far less intrusive code, and assert tests can be added in
sequence, so that we can quickly and easily increase the coverage of
such tests.
Warning when using optimisations
Using asserts when testing and running code while debugging is an
invaluable way of checking that your code is working. However, be aware
that if you are using assert in conjunction with
optimization, then there are some built-in variables which change value
when optimization is requested which have a knock-on effect on the
behaviour of an assert; for more information look at the
documentation on the
assert statement.
Testing for NaN’s (Not a Number)
Numpy provides Not a Number (nan) values, which can be used to indicate missing data. These are not caught by the test above:
PYTHON
import numpy as np
val = np.nan
assert type(val) is float or type(val) is int, "Variable has to be a numerical object"
check_sign(val)
OUTPUT
Value: nan is a number.
However numpy does provide the isnan() function for
testing if a value is NaN. Add an assertion test using this which will
raise and error if the value is NaN.
Because the test checks if a value is NaN, we need to add
not before the test, to ensure that the assert
statement fails if it is a NaN:
PYTHON
import numpy as np
val = np.nan
assert type(val) is float or type(val) is int, "Variable has to be a numerical object"
assert not np.isnan(val), "Variable must not be a NaN"
check_sign(val)
OUTPUT
Traceback (most recent call last):
File "<ipython-input-52-553a9c93f77a>", line 4, in <module>
assert not np.isnan(val), "Variable must not be a NaN"
AssertionError: Variable must not be a NaN
-
try-exceptstructures are useful for catching errors as they occur -
assertstructures are useful for forcing errors early, to avoid wasted effort