Sharing information with containers

Last updated on 2024-10-16

Now that we have learned the basics of the Docker CLI and are set up with all the tools we came across in Docker Desktop, we can start to explore the full power of Docker!

Overview

Questions

How do I get information in and out of containers?

Objectives

Learn how to use files with containers using mounts and volumes
Learn how to set environment variables and pass parameters to containers

Making our data persist

In the earlier sections we interacted with the SPUC container and made changes to the print.config file. We also registered some unicorn sightings using the API, which were recorded in the unicorn_sightings.txt file. However, we lost all those changes when we stopped the container.

Docker containers are naturally isolated from the host system, meaning that they have their own filesystem and cannot access the host filesystem. They are also designed to be temporary: anything written inside a container is lost when the container is removed, which happens as soon as it stops if it was started with --rm.

This is mostly a good thing, as it means that containers are lightweight and can be easily recreated, but we can’t be throwing our unicorn sightings away like this!

Also, with the file being in the container, we can’t (easily) do much with it. Luckily, Docker has methods for allowing containers to persist data.

Volumes

One way to allow a container to access the host filesystem is by using a volume. A volume is a specially designated directory hidden away deep in the host filesystem. This directory is shared with the container.

Volumes are very tightly controlled by Docker. They are designed to be used for sharing data between containers, or for persisting data between runs of a container.

Let’s have a look at how we can use a volume to persist the unicorn_sightings.txt file between runs of the container. We do this by modifying our run command to include a -v (for volume) flag, a volume name and a path inside the container.

BASH

docker run -d --rm --name spuc_container -p 8321:8321 -v spuc-volume:/spuc/output spuacv/spuc:latest

OUTPUT

f1bd2bb9062348b6a1815f5076fcd1b79e603020c2d58436408c6c60da7e73d2

Ok! But what is happening? We can see what volumes we have created using:

BASH

docker volume ls

OUTPUT

local     spuc-volume

Inspecting the volume

We can see more information about the volume using:

BASH

docker volume inspect spuc-volume

OUTPUT

[
    {
        "CreatedAt": "2024-10-11T11:15:09+01:00",
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/var/lib/docker/volumes/spuc-volume/_data",
        "Name": "spuc-volume",
        "Options": null,
        "Scope": "local"
    }
]

This shows us that the volume is stored in /var/lib/docker/volumes/spuc-volume/_data on the host filesystem. You can visit and edit files there if you have superuser permissions (sudo).

But what about the container? Has this actually worked?

First… what’s that over there?? A unicorn! No… three unicorns! Let’s record these sightings.

BASH

curl -X PUT localhost:8321/unicorn_spotted?location=moon\&brightness=100
curl -X PUT localhost:8321/unicorn_spotted?location=earth\&brightness=10
curl -X PUT localhost:8321/unicorn_spotted?location=mars\&brightness=400

OUTPUT

{"message":"Unicorn sighting recorded!"}
{"message":"Unicorn sighting recorded!"}
{"message":"Unicorn sighting recorded!"}

Ok, let’s check the sightings file.

BASH

docker exec spuc_container cat /spuc/output/unicorn_sightings.txt

OUTPUT

count,time,location,brightness,units
0,2024-10-16 09:14:17.719447,moon,100,iuhc
1,2024-10-16 09:14:17.726706,earth,10,iuhc
2,2024-10-16 09:14:17.732191,mars,400,iuhc

Now, for our test, we will stop the container. Since we used the --rm flag, the container will also be deleted.

BASH

docker stop spuc_container
docker ps -a

OUTPUT

spuc_container
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

This would have been game over, but we used a volume. Let’s run it again and check the sightings file.

BASH

docker run -d --rm --name spuc_container -p 8321:8321 -v spuc-volume:/spuc/output spuacv/spuc:latest
docker exec spuc_container cat /spuc/output/unicorn_sightings.txt

OUTPUT

536a6d2f73061aa94729df3536ee86b60dcd68f4652bfbdc9e4cfa9c6cfda168
count,time,location,brightness,units
0,2024-10-16 09:14:17.719447,moon,100,iuhc
1,2024-10-16 09:14:17.726706,earth,10,iuhc
2,2024-10-16 09:14:17.732191,mars,400,iuhc

It’s worked! The unicorn sightings are still there! The only problem is that the file lives in a Docker-managed volume, so we still can’t easily access it from the host filesystem.
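One way to peek at a file in a named volume without sudo is to mount the volume into a second, short-lived container. The sketch below is only an illustration: the function name peek_volume, the alpine helper image, the /data mount point and the DRY_RUN switch are all assumptions that do not come from the lesson. The volume is mounted read-only (:ro) so the helper cannot change the sightings.

```shell
# peek_volume VOLUME FILE: print a file stored in a named volume by mounting
# the volume read-only into a short-lived helper container.
# Set DRY_RUN=1 to print the docker command instead of running it.
# Assumptions: the "alpine" image and the /data mount point are arbitrary choices.
peek_volume() {
  volume=$1
  file=$2
  # Build the docker command as the function's own argument list.
  set -- docker run --rm -v "$volume:/data:ro" alpine cat "/data/$file"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Show the command that would be run for the SPUC volume.
DRY_RUN=1 peek_volume spuc-volume unicorn_sightings.txt
```

Calling peek_volume spuc-volume unicorn_sightings.txt without DRY_RUN=1 runs the command for real, which needs Docker and the spuc-volume volume from above.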

Bind mounts

Another way to allow a container to access the host filesystem is by using a bind mount. A bind mount is a direct mapping of a specified directory on the host filesystem to a directory in the container filesystem. This allows you to directly access files on the host filesystem from the container, but it has its own challenges.

Let’s have a look at how we can use a bind mount to persist the unicorn_sightings.txt file between runs of the container. Confusingly, bind mounting is also done using the -v flag. However, instead of a name for the volume, we have to specify a path on the host filesystem.

Note: In older versions of Docker the path had to be absolute; relative paths are now supported.

BASH

docker stop spuc_container
docker run -d --rm --name spuc_container -p 8321:8321 -v ./spuc/output:/spuc/output spuacv/spuc:latest

OUTPUT

spuc_container
79620ff93fdd8135dcc7f595223144c075a9df53fc32f2ce799ee8e338b9df41

The directory spuc/output likely did not exist in your current working directory, so Docker created one. It is currently empty, as you can see by listing the contents with ls spuc/output. If we now record a unicorn sighting, we can see the records file in the directory.

BASH

curl -X PUT localhost:8321/unicorn_spotted?location=mars\&brightness=400
cat spuc/output/unicorn_sightings.txt

OUTPUT

{"message":"Unicorn sighting recorded!"}
count,time,location,brightness,units
0,2024-10-16 10:31:22.222542,mars,400,iuhc

The file is still there even after stopping the container:

BASH

docker stop spuc_container
ls spuc/output

OUTPUT

spuc_container
unicorn_sightings.txt

If we run the container again, we can see the file is still there.

BASH

docker run -d --rm --name spuc_container -p 8321:8321 -v ./spuc/output:/spuc/output spuacv/spuc:latest
cat spuc/output/unicorn_sightings.txt

OUTPUT

3dd079c21845fc36ddc3b20fd525790a1e194c198c4b98337f4ed82bfc7a9755
count,time,location,brightness,units
0,2024-10-16 10:31:22.222542,mars,400,iuhc

So we not only managed to persist the data between runs of the container, but we can also access the file when the container is not running. This is great!… but there are downsides.

To illustrate this, let’s see what the permissions are on the file we just created.

BASH

ls -l spuc/output/unicorn_sightings.txt

OUTPUT

-rw-r--r-- 1 root root 57 Oct 11 14:14 spuc/output/unicorn_sightings.txt

Argh, the file is owned by root! This is because the container runs as root, and so any files created by the container are owned by root. This can be a problem, as you will not have permission to modify or delete the file without using sudo.

This is a common problem with bind mounts, and can be a bit of a pain to deal with. You can change the ownership of the file using sudo chown, but this can be a bit of a hassle.

Additionally, it is hard for Docker to clean up bind mounts, as they are not managed by Docker. The management of bind mounts is left to the user.

Really, neither volumes nor bind mounts are perfect, but they are both useful tools for persisting data between runs of a container.
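One common way to sidestep root-owned files is the --user flag of docker run, which starts the container process under a given user and group ID. The sketch below only prints such a command so it can be reviewed first. Whether the SPUC image works correctly as a non-root user is an assumption that has not been checked here, and the host directory would need to exist and be writable by that user beforehand.

```shell
# Build a UID:GID pair for the current host user, for example 1000:1000.
user_spec="$(id -u):$(id -g)"

# Print the run command rather than executing it.
# Assumption: the SPUC image still works when started as a non-root user.
echo "docker run -d --rm --name spuc_container -p 8321:8321 --user $user_spec -v ./spuc/output:/spuc/output spuacv/spuc:latest"
```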

Bind mount files

Earlier, we looked at how to change the print.config file in SPUC to format the logs. This was a bit difficult, as we had to do it from inside the container, and it did not persist between runs of the container.

We now have the tools to address this! We can use a bind mount to share the config file with the container.

First we need to make the config file itself. Let’s create a file with the following content:

BASH

echo "::::: {time} Unicorn number {count} spotted at {location}! Brightness: {brightness} {units}" > print.config

Now, to share it with the container, we need to put it in the path /spuc/config/print.config. Again we will use -v, but we will specify the path to the file, instead of a directory.

BASH

docker stop spuc_container
docker run -d --rm --name spuc_container -p 8321:8321 -v ./print.config:/spuc/config/print.config -v spuc-volume:/spuc/output spuacv/spuc:latest

Now let’s check if this worked. For that, we need to record another sighting and then check the logs.

BASH

curl -X PUT localhost:8321/unicorn_spotted?location=jupyter\&brightness=100
docker logs spuc_container

OUTPUT

{"message":"Unicorn sighting recorded!"}
[...]
::::: 2024-10-16 10:53:13.449393 Unicorn number 4 spotted at jupyter! Brightness: 100 iuhc

Fantastic! We have now managed to share a file with the container. Not only that, but because we created the file before mounting it into the container, we are its owners and can modify it. Changes to the file are reflected immediately in the container.

For example, let’s edit the file to get rid of the timestamp:

BASH

echo "::::: Unicorn number {count} spotted at {location}! Brightness: {brightness} {units}" > print.config

Now let’s register a sighting, and look at the logs:

BASH

curl -X PUT localhost:8321/unicorn_spotted?location=venus\&brightness=148
docker logs spuc_container

OUTPUT

{"message":"Unicorn sighting recorded!"}
[...]
::::: Unicorn number 7 spotted at venus! Brightness: 148 iuhc

It almost seems too easy! We now have a print configuration and unicorn sighting record that persists between runs of the container.

Warning: We replaced the file in the container with the file from the host filesystem. We could do the same with a whole directory, but be careful not to overwrite important files in the container!

Common mistakes with volumes

You have to be really careful with the syntax for volumes and mounts.

Let’s imagine you are in a path containing a directory spuc, with an empty sub-directory output and a print.config file. What do you think will happen when you run the following commands?

docker run -v spuc-vol spuacv/spuc:latest
docker run -v ./spucs/output:/spuc/output spuacv/spuc:latest
docker run -v ./spuc-vol:/spuc/output spuacv/spuc:latest
docker run -v ./spuc:/spuc spuacv/spuc:latest
docker run -v print.config:/spuc/config/print.config spuacv/spuc:latest

Show me the solution

1. Problem: We provided a volume name, but not a path to mount it to. If the volume already existed, this will mount it on /spuc-vol. If the volume did not exist, it will create a directory /spuc-vol in the container, but it won’t persist!
   Fix: You only messed up the container, nothing to worry about. Stop it and try again.
2. Problem: You misspelled the path! This will create a new directory called spucs and mount it.
   Fix: Use sudo rm -rf ./spucs to remove the directory and try again.
3. Problem: At first, it seems like we will create a volume. However, we have provided a path, not a name for the volume. Therefore, Docker thinks you want a bind mount, and will create a (root-owned) directory called spuc-vol.
   Fix: Use sudo rm -rf ./spuc-vol to remove the directory and try again.
4. Problem: This is valid syntax for a bind mount. It will take the almost empty spuc directory in your filesystem and mount it to /spuc in the container. However, the mount hides everything that was in /spuc in the image! The container most likely failed to start because it could not find /spuc/spuc.py.
   Fix: You only messed up the container, nothing to worry about. Try again.
5. Problem: We forgot to use a path for the file! This will try to create a new volume called print.config and mount it to /spuc/config/print.config. However, it will most likely fail because the target /spuc/config/print.config in the container is a file, not a directory.
   Fix: Use docker volume rm print.config to remove the volume and try again.
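Most of these mistakes come down to how the part before the colon is read. As a rough sketch, and assuming Docker treats a source starting with "/", "." or "~" as a host path and anything else as a volume name, the hypothetical helper classify_source below predicts which kind of mount a given -v source would produce. It is an approximation for checking your own commands, not Docker's actual parser.

```shell
# Guess how `docker run -v SOURCE:TARGET` treats SOURCE.
# Assumption: a source starting with "/", "." or "~" is taken as a host path
# (bind mount); anything else is taken as the name of a volume.
classify_source() {
  case "$1" in
    /*|.*|"~"*) echo "bind mount" ;;
    *)          echo "named volume" ;;
  esac
}

# Try it on the sources used in this section.
for source in spuc-volume ./spuc-vol ./spuc/output print.config ./print.config; do
  printf '%-15s -> %s\n' "$source" "$(classify_source "$source")"
done
```

Under this rule, print.config is reported as a named volume while ./print.config is reported as a bind mount, which matches the last mistake in the list above.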

Setting the environment

Another interesting detail in the SPUC README is an environment variable, EXPORT, which can be set to True to enable an API endpoint for exporting the unicorn sightings.

This sounds like a useful feature, but how can we set an environment variable in a container?

Thankfully this is quite straightforward: we can use the -e flag to set an environment variable in a container.

Modifying our run command again:

BASH

docker stop spuc_container
docker run -d --rm --name spuc_container -p 8321:8321 -v ./print.config:/spuc/config/print.config -v spuc-volume:/spuc/output -e EXPORT=true spuacv/spuc:latest
docker logs spuc_container

OUTPUT

[...]
::::: Initializing SPUC...
::::: Units set to Imperial Unicorn Hoove Candles [iuhc].

Welcome to the Space Purple Unicorn Counter!
::::: Try 'curl -X PUT localhost:8321/unicorn_spotted?location=moon\&brightness=100' to record a unicorn sighting!
::::: Or 'curl localhost:8321/export' to download the unicorn sightings file!

And now we can see that the export endpoint is available!

BASH

curl localhost:8321/export

OUTPUT

time,brightness,unit
2024-10-11 14:43:42.060883,100,iuhc
2024-10-11 14:43:48.064323,400,iuhc
2024-10-11 14:43:49.972220,10,iuhc

This is great! There is no need to bind mount or exec to get the data out of the container; we can just use the API endpoint.

Exposing data through network endpoints like this, rather than through shared files, is very common in Docker containers and saves a lot of hassle.

Environment variables are a very common tool for configuring containers. They are used to set things like API keys, database connection strings, and other configuration options.
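When several variables are involved, or when values such as API keys should stay out of the command line, docker run can also read them from a file with --env-file, one VAR=value pair per line. The sketch below writes such a file and prints a matching run command. The file name spuc.env is an arbitrary choice for illustration, and the volume and config mounts are left out to keep the command short.

```shell
# Write the container's environment to a file (the name spuc.env is arbitrary).
cat > spuc.env <<'EOF'
EXPORT=true
EOF

# Print the matching run command; --env-file reads one VAR=value per line.
echo "docker run -d --rm --name spuc_container -p 8321:8321 --env-file spuc.env spuacv/spuc:latest"
```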

Passing parameters

Finally, we must address a very serious shortcoming of the SPUC container. It is recording the brightness of the unicorns in Imperial Unicorn Hoove Candles (iuhc)! This is a very outdated unit and we must change it to metric.

Fortunately the SPUC README tells us that we can pass a parameter to the container to set the units to metric. We do this by adding the parameter after the image name in the run command, which overrides the image’s default command.

BASH

docker stop spuc_container
docker run -d --rm --name spuc_container -p 8321:8321 -v ./print.config:/spuc/config/print.config -v spuc-volume:/spuc/output -e EXPORT=true spuacv/spuc:latest --units iulu
curl -X PUT localhost:8321/unicorn_spotted?location=earth\&brightness=10
curl localhost:8321/export

OUTPUT

time,brightness,unit
2024-10-11 14:43:42.060883,100,iuhc
2024-10-11 14:43:48.064323,400,iuhc
2024-10-11 14:43:49.972220,10,iuhc
2024-10-11 15:30:27.823367,10,iulu

Callout

You can also override the entrypoint of a container using the --entrypoint flag. This is useful if you want to run a different command in the container, or if you want to run the container interactively.

You may recall from inspecting the image that SPUC has an entrypoint of python /spuc/spuc.py:

BASH

docker inspect spuacv/spuc:latest -f "Entrypoint: {{.Config.Entrypoint}} Command: {{.Config.Cmd}}"

OUTPUT

Entrypoint: [python /spuc/spuc.py] Command: [--units iuhc]

This entrypoint makes the container hard to interact with. We can override it using the --entrypoint flag.

BASH

docker run -it --rm --entrypoint /bin/sh spuacv/spuc:latest

Challenge

Which of these are valid entrypoint and command combinations for the SPUC container? What are the advantages and disadvantages of each?

	Entrypoint	Command
A	`python /spuc/spuc.py --units iuhc`	(none)
B	`python /spuc/spuc.py`	`--units iuhc`
C	`python`	`/spuc/spuc.py --units iuhc`
D	(none)	`python /spuc/spuc.py --units iuhc`

Show me the solution

These are all valid combinations! The best choice depends on the use case.

A: This combination bakes the command and the parameters into the image. This is useful if the command is always the same and the specified parameters are unlikely to change (although more may be appended).
B: This combination allows the command’s arguments to be changed easily while baking in which Python script to run.
C: This combination allows the Python script to be changed easily, which is more likely to be bad than good!
D: This combination allows maximum flexibility, but it requires the user to know the correct command to run.
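The way the entrypoint, the default command and any arguments given after the image name combine can be sketched with a small shell function. This is a simplified model rather than Docker's implementation: the hypothetical helper effective_command treats each part as a plain string, whereas Docker works with lists of arguments.

```shell
# effective_command ENTRYPOINT DEFAULT_CMD [RUN_ARGS]
# Prints the command line a container would start with (simplified model).
effective_command() {
  entrypoint=$1
  default_cmd=$2
  run_args=${3:-}
  # Arguments given after the image name replace the default command;
  # the entrypoint, if there is one, always stays in front.
  echo "${entrypoint:+$entrypoint }${run_args:-$default_cmd}"
}

# Entrypoint and command as reported by docker inspect for spuacv/spuc:latest.
effective_command "python /spuc/spuc.py" "--units iuhc"
# The same image started with "--units iulu" after the image name.
effective_command "python /spuc/spuc.py" "--units iuhc" "--units iulu"
```

The first call prints python /spuc/spuc.py --units iuhc and the second prints python /spuc/spuc.py --units iulu, mirroring what happened when we switched SPUC to metric units.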

Summary

In this section, we have learned about volumes and bind mounts, two ways to allow a container to access the host filesystem. We have used these to persist data between runs of a container and to share files with a container.

We have learned how to set environment variables and pass parameters to containers, two ways to configure the behaviour of a container.

SPUC is now running with the correct units and we can export the unicorn sightings using the API endpoint! And we are no longer losing our unicorn sightings between runs of the container.

Key Points

Volumes and bind mounts are two ways to allow a container to access the host filesystem.
Environment variables and parameters can be used to configure the behaviour of a container.