Creating Your Own Container Images
Last updated on 2025-10-21
Estimated time: 60 minutes
Overview
Questions
- How can I create my own container images?
- What is a Dockerfile?
Objectives
- Learn how to create your own container images using a Dockerfile.
- Introduce the core instructions used in a Dockerfile.
- Learn how to build a container image from a Dockerfile.
- Learn how to run a container from a local container image.
The SPUC documentation just keeps on giving, let’s keep the streak going!
There is another cool feature on there that we haven’t used yet - the ability to add new unicorn analysis features using plugins! Let’s try that out.
The docs say that we need to add a Python file at
/spuc/plugins/ that defines an endpoint for the new
feature.
It would be very handy to be able to get some basic statistics about our Unicorns. Let’s add a new plugin that will return a statistical analysis of the brightness of the unicorns in the database.
First, let’s make a file called stats.py with the following
content:
PYTHON
# stats.py
from __main__ import app
from __main__ import file_path
import pandas as pd
import os

@app.route("/stats", methods=["GET"])
def stats():
    # Nothing to analyse until the first sighting has been recorded
    if not os.path.exists(file_path):
        return {"message": "No unicorn sightings yet!"}

    with open(file_path) as f:
        df = pd.read_csv(f)

    # Drop the first column, then summarise the remaining ones
    df = df.iloc[:, 1:]
    summary = df.describe()
    return summary.to_json()
Don’t worry if you’re not familiar with Python or Pandas.
Understanding this snippet of code is not our aim. The code will return
some statistics about the data in file_path.
We already know how to load this file. Let’s use a bind mount to
share the file with the container. Since we are debugging, we’ll leave
out the -d flag so we can see the output easily.
BASH
docker kill spuc_container
docker run --rm --name spuc_container -p 8321:8321 -v ./print.config:/spuc/config/print.config -v spuc-volume:/spuc/output -v ./stats.py:/spuc/plugins/stats.py -e EXPORT=true spuacv/spuc:latest --units iulu
OUTPUT
[...]
Traceback (most recent call last):
File "/spuc/spuc.py", line 31, in <module>
__import__(f"{plugin_dir}.{plugin[:-3]}")
File "/spuc/plugins/stats.py", line 4, in <module>
import pandas as pd
ModuleNotFoundError: No module named 'pandas'
Oh… well what can we do about this? Clearly we need to install the
pandas package in the container but how do we do that? We
could do this interactively, but we know that won’t survive a
restart!
Really what we need to do is change the image
itself, so that it has pandas installed by default. This
takes us to one of the most fundamental features of Docker - the ability
to create your own container images.
Creating Docker Images
So how are images made? With a recipe!
Images are created from a text file that contains a list of
instructions, called a Dockerfile. Each instruction describes one step,
such as running a terminal command or copying a file, and together they
build the container image up layer by layer.
All Dockerfiles start with a FROM instruction.
This sets the base image for the container. The base image is
the starting point for the container, and all subsequent instructions
are run on top of this base image.
You can use any image as a base image. There are
several official images available on Docker Hub which are very
commonly used. For example, ubuntu for general purpose
Linux, python for Python development, alpine
for a lightweight Linux distribution, and many more.
But of course, the most natural fit for us right now is to use the SPUC image as a base image. This way we can be sure that our new image will have all the dependencies we need.
Let’s create a new file called Dockerfile and add the
following content:
DOCKERFILE
# Dockerfile
FROM spuacv/spuc:latest
This is the simplest possible Dockerfile - it just says that our new image will be based on the SPUC image.
But what do we do with it? We need to build the image!
To do this we use the docker build command (short for
docker image build). This command takes a Dockerfile and
builds a new image from it. Just as when saving a file, we also need to
name the image we are building. We give the image a name with the
-t (tag) flag:
BASH
docker build -t spuc-stats ./
OUTPUT
[+] Building 0.0s (5/5) FINISHED docker:default
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 61B 0.0s
=> [internal] load metadata for docker.io/spuacv/spuc:latest 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [1/1] FROM docker.io/spuacv/spuc:latest 0.1s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:ccde35b1f9e872bde522e9fe91466ef983f9b579cffc2f457bff97f74206e839 0.0s
=> => naming to docker.io/library/spuc-stats 0.0s
Congratulations, you have now built an image! The command built a new
image called spuc-stats from the Dockerfile in
the current directory.
By default, the docker build command looks for a file
called Dockerfile in the path specified by the last
argument.
This last argument is called the build context, and it must be the path to a directory.
It is very common to see . or ./ used as
the build context, both of which refer to the current directory.
All of the instructions in the Dockerfile are run as if
we were in the build context directory.
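As a small illustration of the build context, the sketch below sets up a separate directory that contains its own Dockerfile. The directory name context-demo is hypothetical and only used for this example.

```shell
# Create a directory that will act as the build context
# (context-demo is a made-up name for this sketch).
mkdir -p context-demo

# Put a minimal Dockerfile inside it
cat > context-demo/Dockerfile <<'EOF'
FROM spuacv/spuc:latest
EOF

# The build context now holds exactly one file, the Dockerfile
ls context-demo
```

Running docker build -t spuc-demo context-demo from the parent directory would then use context-demo as the build context and pick up the Dockerfile inside it. The image name spuc-demo is likewise just an example.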
If you now list the images on your system you should see the new
image spuc-stats listed:
BASH
docker image ls
OUTPUT
spuacv/spuc latest ccde35b1f9e8 25 hours ago 137MB
spuc-stats latest 21210c129ca9 5 minutes ago 137MB
We can now run this image in the same way we would run any other image:
BASH
docker run --rm spuc-stats
OUTPUT
\
\
\\
\\\
>\/7
_.-(º \
(=___._/` \ ____ ____ _ _ ____
) \ |\ / ___|| _ \| | | |/ ___|
/ / ||\ \___ \| |_) | | | | |
/ > /\\\ ___) | __/| |__| | |___
j < _\ |____/|_| \____/ \____|
_.-' : ``.
\ r=._\ `. Space Purple Unicorn Counter
<`\\_ \ .`-.
\ r-7 `-. ._ ' . `\
\`, `-.`7 7) )
\/ \| \' / `-._
|| .'
\\ (
>\ >
,.-' >.'
<.'_.''
<'
Welcome to the Space Purple Unicorn Counter!
:::: Units set to Imperial Unicorn Hoove Candles [iuhc] ::::
:: Try recording a unicorn sighting with:
curl -X PUT localhost:8321/unicorn_spotted?location=moon\&brightness=100
:: No plugins detected
So we have a copy of the SPUC image with a new name, but nothing has
changed! In fact, we can pass all the same arguments to the
docker run command as we did before:
BASH
docker run --rm --name spuc-stats_container -p 8321:8321 -v ./print.config:/spuc/config/print.config -v spuc-volume:/spuc/output -v ./stats.py:/spuc/plugins/stats.py -e EXPORT=true spuc-stats --units iulu
OUTPUT
Traceback (most recent call last):
File "/spuc/spuc.py", line 31, in <module>
__import__(f"{plugin_dir}.{plugin[:-3]}")
File "/spuc/plugins/stats.py", line 4, in <module>
import pandas as pd
ModuleNotFoundError: No module named 'pandas'
We are back where we were, but we can now start to make this container image our own!
Let’s first fix that dependency problem. We do this by adding a
RUN instruction to the Dockerfile. This
instruction runs a command in the container and then saves the result as
a new layer in the image. In this case we want to install the
pandas package, so we add the following line to the
Dockerfile:
DOCKERFILE
# Dockerfile
FROM spuacv/spuc:latest
RUN pip install pandas
This will install the pandas package in the container
using Python’s package manager pip. Now we can build the
image again:
BASH
docker build -t spuc-stats ./
OUTPUT
[+] Building 11.1s (6/6) FINISHED docker:default
[...]
=> CACHED [1/2] FROM docker.io/spuacv/spuc:latest 0.0s
=> [2/2] RUN pip install pandas 10.5s
=> exporting to image 0.4s
=> => exporting layers 0.4s
=> => writing image sha256:e548b862a5c4dd91551668e068d4ad46e6a25d3a3dbed335e780a01f954a2c26 0.0s
=> => naming to docker.io/library/spuc-stats 0.0s
You might have noticed a warning about running pip as
the root user. We are building a container image, not installing
software on our host machine, so we can ignore this warning.
Let’s run the image again:
BASH
docker run --rm --name spuc-stats_container -p 8321:8321 -v ./print.config:/spuc/config/print.config -v spuc-volume:/spuc/output -v ./stats.py:/spuc/plugins/stats.py -e EXPORT=true spuc-stats --units iulu
OUTPUT
[...]
Welcome to the Space Purple Unicorn Counter!
[...]
:::: Plugins loaded! ::::
:: Available plugins
stats.py
[...]
It worked! We no longer get the error about the missing
pandas package, and the plugin is loaded!
Let’s try out the new endpoint (you may want to do this from another
terminal, or exit with Ctrl+C and re-run with
-d first):
BASH
curl localhost:8321/stats
OUTPUT
{"brightness":{"count":6.0,"mean":267.3333333333,"std":251.7599385658,"min":18.0,"25%":93.75,"50%":219.5,"75%":344.5,"max":709.0}}
And there we have it! We have created our own container image with a new feature!
But why stop here? We could keep modifying the image so that it works the way we like by default.
COPY
It is a bit annoying having to bind mount the stats.py
file every time we run the container. This makes sense for development,
because we can potentially modify the script while the container runs,
but we would like to distribute the image with the plugin already
installed.
We can add this file to the image itself using the COPY
instruction. This copies files from the host machine into the container
image. It takes two arguments: the source file on the host machine and
the destination in the container image.
Let’s add it to the Dockerfile:
DOCKERFILE
# Dockerfile
FROM spuacv/spuc:latest
RUN pip install pandas
COPY stats.py /spuc/plugins/stats.py
Now we can build the image again:
BASH
docker build -t spuc-stats ./
OUTPUT
[...]
=> [1/3] FROM docker.io/spuacv/spuc:latest 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 287B 0.0s
=> CACHED [2/3] RUN pip install pandas 0.0s
=> [3/3] COPY stats.py /spuc/plugins/stats.py 0.0s
=> exporting to image 0.0s
[...]
You might have noticed that on every build we are getting
messages like CACHED [2/3]... above.
Almost every instruction in a Dockerfile creates a new layer
in the image.
Each layer is saved with a specific hash. If the set of instructions up to that layer remains unchanged, Docker will use the cached layer instead of rebuilding it. This saves a lot of time and space!
In the case above, we had already run the FROM and
RUN instructions in a previous build. Docker was able to
use the cached layers for those 2 instructions, and only had to do some
work for the COPY layer.
Now let’s run the image again, but this time without the bind mount for the
stats.py file:
BASH
docker run --rm --name spuc-stats_container -p 8321:8321 -v ./print.config:/spuc/config/print.config -v spuc-volume:/spuc/output -e EXPORT=true spuc-stats --units iulu
OUTPUT
[...]
Welcome to the Space Purple Unicorn Counter!
[...]
:::: Plugins loaded! ::::
:: Available plugins
stats.py
[...]
The plugin is still loaded!
And again… why stop there? We’ve already configured the printing how we like it, so let’s add that to the image as well!
DOCKERFILE
# Dockerfile
FROM spuacv/spuc:latest
RUN pip install pandas
COPY stats.py /spuc/plugins/stats.py
COPY print.config /spuc/config/print.config
Now we rebuild and re-run (without the bind mount for
print.config):
BASH
docker build -t spuc-stats ./
docker run --rm --name spuc_container -p 8321:8321 -v spuc-volume:/spuc/output -e EXPORT=True spuc-stats --units iulu
OUTPUT
[...]
=> [1/4] FROM docker.io/spuacv/spuc:latest 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 152B 0.0s
=> CACHED [2/4] RUN pip install pandas 0.0s
=> CACHED [3/4] COPY stats.py /spuc/plugins/stats.py 0.0s
=> [4/4] COPY print.config /spuc/config/print.config 0.0s
=> exporting to image 0.0s
[...]
Welcome to the Space Purple Unicorn Counter!
[...]
Ooh, a unicorn! Let’s record it!
BASH
curl -X PUT localhost:8321/unicorn_spotted?location=saturn\&brightness=87
OUTPUT
{"message":"Unicorn sighting recorded!"}
And the logs confirm that copying the print config worked:
OUTPUT
[...]
::::: Unicorn number 7 spotted at saturn! Brightness: 87 iulu
The run command is definitely improving! Is there
anything else we can do to make it even better?
ENV
We can also set environment variables in the Dockerfile
using the ENV instruction. These can always be overridden
when running the container, as we have done ourselves, but it is useful
to set defaults. We like the EXPORT variable set to
True, so let’s add that to the Dockerfile:
DOCKERFILE
# Dockerfile
FROM spuacv/spuc:latest
RUN pip install pandas
COPY stats.py /spuc/plugins/stats.py
COPY print.config /spuc/config/print.config
ENV EXPORT=True
Rebuilding and running (without the -e EXPORT=True flag)
results in:
BASH
docker build -t spuc-stats ./
docker run --rm --name spuc-stats_container -p 8321:8321 -v spuc-volume:/spuc/output spuc-stats --units iulu
OUTPUT
[...]
=> [1/4] FROM docker.io/spuacv/spuc:latest 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 61B 0.0s
=> CACHED [2/4] RUN pip install pandas 0.0s
=> CACHED [3/4] COPY stats.py /spuc/plugins/stats.py 0.0s
=> CACHED [4/4] COPY print.config /spuc/config/print.config 0.0s
=> exporting to image 0.0s
[...]
Welcome to the Space Purple Unicorn Counter!
[...]
:::: Unicorn sightings export activated! ::::
:: Try downloading the unicorn sightings record with:
curl localhost:8321/export
The EXPORT variable is now set to True by
default!
ARG
There is another instruction called ARG that is used to
set variables in the Dockerfile. These variables are only
available during the build process, and are not set in containers started from the image.
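As a minimal sketch of how ARG might be used, the snippet below writes a separate example file in which a build argument selects the version of pandas to install. The file name Dockerfile.arg-example (chosen so that our real Dockerfile is not overwritten), the argument name PANDAS_VERSION and the default version 2.2.3 are all illustrative assumptions, not something taken from the SPUC documentation.

```shell
# Write an illustrative Dockerfile that uses a build argument.
# PANDAS_VERSION and its default value are assumptions made for this sketch.
cat > Dockerfile.arg-example <<'EOF'
FROM spuacv/spuc:latest

# Only available while the image is being built
ARG PANDAS_VERSION=2.2.3

RUN pip install pandas==${PANDAS_VERSION}
EOF

# Show the file we just created
cat Dockerfile.arg-example
```

The default could then be overridden at build time with the --build-arg flag, for example docker build -f Dockerfile.arg-example --build-arg PANDAS_VERSION=2.2.2 -t spuc-stats ./, and PANDAS_VERSION would not appear as an environment variable in containers started from the resulting image.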
You might have noticed that the ENV instruction did not
create a new layer in the image.
This instruction is a bit special, as it only modifies the configuration of the image. The environment is set on every instruction of the Dockerfile, so it is not saved as a separate layer.
However, environment variables can have an effect on
the instructions below them. Because of this, moving the ENV
instruction will change the layers, and the cache is no longer
valid.
We can see this by moving the ENV instruction in our
Dockerfile before the RUN command:
DOCKERFILE
# Dockerfile
FROM spuacv/spuc:latest
ENV EXPORT=True
RUN pip install pandas
COPY stats.py /spuc/plugins/stats.py
COPY print.config /spuc/config/print.config
If we now try to build again, we will get this output:
OUTPUT
[+] Building 10.4s (9/9) FINISHED docker:default
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 187B 0.0s
=> [internal] load metadata for docker.io/spuacv/spuc:latest 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> CACHED [1/4] FROM docker.io/spuacv/spuc:latest 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 61B 0.0s
=> [2/4] RUN pip install pandas 9.8s
=> [3/4] COPY stats.py /spuc/plugins/stats.py 0.0s
=> [4/4] COPY print.config /spuc/config/print.config 0.0s
=> exporting to image 0.5s
=> => exporting layers 0.5s
=> => writing image sha256:5a64cc132a7cbbc532b9e97dd17e5fb83239dfe42dae9e6df4d150c503d73691 0.0s
=> => naming to docker.io/library/spuc-stats 0.0s
As you can see, the first layer is cached, but everything after the
ENV instruction is rebuilt. Our environment variable has
absolutely no effect on the RUN instruction, but Docker
does not know that. The only thing that matters is that it
could have had an effect.
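One way to picture why the cache is invalidated is a toy model in which each step’s cache key is a hash of the previous step’s key together with the text of the current instruction. The sketch below only illustrates that idea, assuming the sha256sum tool is available. The helper layer_keys is hypothetical, and Docker’s real cache keys are computed differently (for example, COPY also takes the contents of the copied files into account).

```shell
# Toy model of a build cache (NOT Docker's actual algorithm):
# each key is a hash of the previous key plus the instruction text.
layer_keys() {
    key=""
    for instruction in "$@"; do
        key=$(printf '%s\n%s\n' "$key" "$instruction" | sha256sum | cut -c1-12)
        printf '%s  %s\n' "$key" "$instruction"
    done
}

echo "ENV at the end:"
layer_keys "FROM spuacv/spuc:latest" \
           "RUN pip install pandas" \
           "COPY stats.py /spuc/plugins/stats.py" \
           "COPY print.config /spuc/config/print.config" \
           "ENV EXPORT=True"

echo "ENV moved before RUN:"
layer_keys "FROM spuacv/spuc:latest" \
           "ENV EXPORT=True" \
           "RUN pip install pandas" \
           "COPY stats.py /spuc/plugins/stats.py" \
           "COPY print.config /spuc/config/print.config"
```

In both listings the FROM line gets the same key, but the keys for RUN and both COPY steps change once ENV appears above them, which mirrors why those steps could not be taken from the cache.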
It is therefore recommended that you place ENV
instructions only where they are needed, that is, just before the first instruction that uses them.
A similar thing happens with the ENTRYPOINT and
CMD instructions, which we will cover next. Since these are
not needed at all during the build, they are best placed at the end of
the Dockerfile.
ENTRYPOINT and CMD
We’re on a bit of a roll here! Let’s add one more modification to the image. Let’s change away from those imperial units by default.
We can do this by changing the default command in the
Dockerfile. As you may remember, the default command is
composed of an entrypoint and a command. We can modify
either of them in the Dockerfile. Just to make clear what the full
command is directly from our Dockerfile, let’s write down both:
DOCKERFILE
# Dockerfile
FROM spuacv/spuc:latest
ENV EXPORT=True
RUN pip install pandas
COPY stats.py /spuc/plugins/stats.py
COPY print.config /spuc/config/print.config
ENTRYPOINT ["python", "/spuc/spuc.py"]
CMD ["--units", "iulu"]
Notice that we used an array syntax. Both the ENTRYPOINT
and CMD instructions can take a list of arguments, and the
array syntax ensures that the arguments are passed correctly.
Let’s give this a try, dropping the now unnecessary
--units iulu from the docker run command:
BASH
docker build -t spuc-stats ./
docker run --rm --name spuc-stats_container -p 8321:8321 -v spuc-volume:/spuc/output spuc-stats
OUTPUT
[...]
=> [1/4] FROM docker.io/spuacv/spuc:latest 0.0s
=> CACHED [2/4] RUN pip install pandas 0.0s
=> CACHED [3/4] COPY stats.py /spuc/plugins/stats.py 0.0s
=> CACHED [4/4] COPY print.config /spuc/config/print.config 0.0s
=> exporting to image 0.0s
[...]
:::: Units set to Intergalactic Unicorn Luminosity Units [iulu] ::::
[...]
Much better! A far cleaner command, much more customised for our use case!
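To make the interplay between the two instructions concrete, here is a toy shell model (not Docker itself) of how the start-up command is put together: the ENTRYPOINT comes first, and the CMD is only used when no arguments are given after the image name. The function name final_command is hypothetical.

```shell
# Toy model of how the container's start-up command is assembled.
final_command() {
    # No arguments after the image name: fall back to the CMD default
    if [ "$#" -eq 0 ]; then
        set -- --units iulu
    fi
    # The ENTRYPOINT is placed in front of the arguments
    echo "python /spuc/spuc.py $*"
}

final_command                # like: docker run spuc-stats
final_command --units iuhc   # like: docker run spuc-stats --units iuhc
```

So the imperial units remain one short argument away: anything written after the image name replaces the CMD, while the ENTRYPOINT stays in place unless it is explicitly replaced with the --entrypoint flag of docker run.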
Building containers from the ground up
In this lesson we adjusted the SPUC image, which already contains a service. This is a perfectly valid way of using Dockerfiles! But it is not the most common.
While you can base your images on any other public image, it is most common for developers to be creating containers ‘from the ground up’.
The most common practice is creating images from images like
ubuntu or alpine and adding your own software
and configuration files. An example of this is how the developers of the
SPUC service created their image. The Dockerfile is reproduced
below:
DOCKERFILE
FROM python:3.12-slim
RUN apt update
RUN apt install -y curl
WORKDIR /spuc
COPY ./requirements.txt /spuc/requirements.txt
RUN pip install --no-cache-dir --upgrade -r /spuc/requirements.txt
COPY ./*.py /spuc/
COPY ./config/*.config /spuc/config/
RUN mkdir /spuc/output
EXPOSE 8321
ENTRYPOINT ["python", "/spuc/spuc.py"]
CMD ["--units", "iuhc"]
From this we can see the developers:
- Started FROM a python:3.12-slim image
- Use RUN to install the required packages
- COPY the source code and configuration files
- Set the default ENTRYPOINT and CMD.
There are also two other instructions in this Dockerfile that we haven’t covered yet.
- WORKDIR sets the working directory for the container. It creates the directory if it does not already exist and then changes into it. You may have noticed before that when we exec into the SPUC container we start in the /spuc directory. All of the commands after a WORKDIR instruction are run from the directory it sets.
- EXPOSE documents which port the service inside the container listens on. It does not publish the port to the host machine by itself (we still need the -p flag for that), so it is not strictly necessary, but it is good practice to record which ports the service uses.
Key Points
- You can create your own container images using a Dockerfile.
- A Dockerfile is a text file that contains a list of instructions to produce a container image.
- Each instruction in a Dockerfile creates a new layer in the image.
- FROM, WORKDIR, RUN, COPY, ENV, ENTRYPOINT and CMD are some of the most important instructions used in a Dockerfile.
- To build a container image from a Dockerfile you use the command: docker build -t <image_name> <context_path>
- You can run a container from a local image just like any other image, with docker run.