Docker Images
An image is an executable package that includes everything needed to run an application: the code, a runtime, libraries, environment variables, and configuration files.
A container is a runtime instance of an image.
Container images can be pretty big, though some are really small: Alpine Linux is about 2.5MB. Ubuntu 16.04 is about 27MB, and the Anaconda Python distribution is 800MB to 1.5GB.
Every container you start from an image starts out with the same blank slate, as if it made a copy of the image just for that container to use. But for big container images, like that 800MB Anaconda image, making a copy would be both a waste of disk space and pretty slow. So Docker doesn’t make copies; instead it uses a layering technique called overlay.
Overlay filesystems, also known as “union filesystems” or “union mounts”, let you mount a filesystem using two directories: a “lower” directory and an “upper” directory.
Basically:
the lower directory of the filesystem is read-only
the upper directory of the filesystem can be both readable and writable
When a process reads a file, the overlayfs filesystem driver looks in the upper directory and reads the file from there if it’s present. Otherwise, it looks in the lower directory.
When a process writes a file, overlayfs will just write it to the upper directory.
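To make the read and write rules concrete, here is a small sketch that imitates them with two ordinary directories. It does not mount a real overlay filesystem; the directory layout, the file names, and the overlay_read and overlay_write helpers are all made up for illustration.

```shell
# Imitate overlay lookup with two plain directories (no real overlayfs mount).
root="$(mktemp -d)"
mkdir -p "$root/lower" "$root/upper"
echo "from image" > "$root/lower/config.txt"
echo "from image" > "$root/lower/readme.txt"

# Read: prefer the upper directory, fall back to the lower one.
overlay_read() {
  if [ -f "$root/upper/$1" ]; then
    cat "$root/upper/$1"
  else
    cat "$root/lower/$1"
  fi
}

# Write: always lands in the upper directory; the lower one is never touched.
overlay_write() {
  printf '%s\n' "$2" > "$root/upper/$1"
}

overlay_write config.txt "changed in container"
overlay_read config.txt   # served from the upper directory
overlay_read readme.txt   # not in upper, so served from the lower directory
```

After the write, lower/config.txt still holds its original content, which is why many containers can share the same lower directory.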
Images are made up of multiple read-only layers, and multiple containers are typically based on the same image. When an image is instantiated into a container, a writable layer is created on top; this layer is deleted when the container is removed.
Docker uses storage drivers to manage the contents of the image layers and the writable container layer. Each storage driver handles the implementation differently, but all drivers use stackable image layers and the copy-on-write (CoW) strategy.
Copy-on-write is a strategy of sharing and copying files for maximum efficiency. If a file or directory exists in a lower layer within the image, and another layer (including the writable layer) needs read access to it, it just uses the existing file. The first time another layer needs to modify the file (when building the image or running the container), the file is copied into that layer and modified. This minimizes I/O and the size of each of the subsequent layers.
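The copy-up step can be sketched with plain directories as well. This is only a simulation of the idea, not how a storage driver is implemented; image_layer, container_layer, and the cow_append helper are hypothetical names.

```shell
# Simulate copy-on-write: a file is copied into the writable layer
# only the first time it is modified.
base="$(mktemp -d)"
mkdir -p "$base/image_layer" "$base/container_layer"
printf 'line 1\n' > "$base/image_layer/app.log"

cow_append() {
  # One-time copy-up: only when the file exists below and not yet above.
  if [ ! -f "$base/container_layer/$1" ] && [ -f "$base/image_layer/$1" ]; then
    cp "$base/image_layer/$1" "$base/container_layer/$1"
  fi
  printf '%s\n' "$2" >> "$base/container_layer/$1"
}

cow_append app.log "line 2"   # triggers the copy-up, then appends
cow_append app.log "line 3"   # file already lives in the writable layer

wc -l < "$base/container_layer/app.log"   # line count of the modified copy
wc -l < "$base/image_layer/app.log"       # the image layer is unchanged
```

Reads of files that were never modified cost nothing extra, because they never leave the image layer.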
A Docker container consists of network settings, volumes, and images. The location of Docker files depends on your operating system. Here is an overview for the most used operating systems:
Ubuntu: /var/lib/docker/
Fedora: /var/lib/docker/
Debian: /var/lib/docker/
Windows: C:\ProgramData\DockerDesktop
macOS: ~/Library/Containers/com.docker.docker/Data/vms/0/
To find out the location on your own system, run docker info | grep -i root.
You might create your own images or you might only use those created by others and published in a registry. To build your own image, you create a Dockerfile with a simple syntax for defining the steps needed to create the image and run it. Each instruction in a Dockerfile creates a layer in the image. When you change the Dockerfile and rebuild the image, only those layers which have changed are rebuilt. This is part of what makes images so lightweight, small, and fast when compared to other virtualization technologies. A Dockerfile is executed by the docker build command.
Let's take a look at a sample Dockerfile:
FROM: defines the base image; FROM must be the first instruction in a Dockerfile (only comments and ARG may come before it).
LABEL: adds metadata to the image as key-value pairs, such as a description of anything you want to record about this image.
ADD: copies files into the image, and can also extract local tar archives and fetch remote URLs.
COPY: copies files into the image; preferred over ADD.
VOLUME: creates a mount point as defined when the container is run.
ENTRYPOINT: the executable that runs when the container starts.
EXPOSE: documents the ports that should be published.
The CMD instruction has three forms:
CMD ["executable","param1","param2"] (exec form, this is the preferred form)
CMD ["param1","param2"] (as default parameters to ENTRYPOINT)
CMD command param1 param2 (shell form)
There can only be one CMD instruction in a Dockerfile. If you list more than one CMD, then only the last CMD will take effect.
ENV: used to define environment variables in the container.
MAINTAINER: deprecated in favor of LABEL; documents the author of the Dockerfile (typically an email address).
ONBUILD: only used as a trigger when this image is used to build other images; defines commands to run "on build".
RUN: runs a command in a new layer.
WORKDIR: defines the working directory of the container.
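A minimal sketch that combines several of the instructions above could look like the following. The base image tag, the label text, the SITE_NAME variable, and the index.html file are all assumptions chosen for illustration, and the snippet writes everything into a scratch directory so it does not touch an existing project.

```shell
# Write a hypothetical Dockerfile and a page to serve into a scratch directory.
workdir="$(mktemp -d)"

cat > "$workdir/Dockerfile" <<'EOF'
# Base image (assumed; any image that ships nginx would do)
FROM nginx:alpine
# Free-form metadata about the image
LABEL description="Hypothetical static site used for illustration"
# Environment variable available inside the container
ENV SITE_NAME=demo
# Directory used by the instructions below and by the running container
WORKDIR /usr/share/nginx/html
# Copy a file from the build context into the image
COPY index.html .
# Document the port the server listens on
EXPOSE 80
# Default command, in exec form
CMD ["nginx", "-g", "daemon off;"]
EOF

echo '<h1>Hello from a container</h1>' > "$workdir/index.html"

# Show the result and where it was written.
cat "$workdir/Dockerfile"
echo "Dockerfile written to $workdir"
```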
Now, to build an image from this Dockerfile, we'll use the build command. The generic syntax for the command is docker build [OPTIONS] PATH | URL | -.
The build command requires a Dockerfile and the build's context. The context is the set of files and directories located in the specified location. Docker will look for a Dockerfile in the context and use that to build the image.
Open up a terminal window inside that directory and execute the following command:
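As a hedged sketch, the simplest form of the command passes only the build context. The guard around it is an addition for this example, so the snippet does nothing on a machine without a reachable Docker daemon or without a Dockerfile in the current directory.

```shell
context="."   # build context: the current directory

# Only attempt the build when a daemon is reachable and a Dockerfile exists.
if docker info >/dev/null 2>&1 && [ -f "$context/Dockerfile" ]; then
  docker build "$context"
else
  echo "skipping build: Docker daemon or Dockerfile not available"
fi
```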
Do not use your root directory, /, as the PATH, as it causes the build to transfer the entire contents of your hard drive to the Docker daemon.
We're passing . as the build context, which means the current directory. If you put the Dockerfile inside another directory, such as ./src/Dockerfile, then the context will be ./src. The build process may take some time to finish.
If everything goes fine, you should see something like Successfully built fc32da11d651 at the end. This random string is the image ID, not a container ID. Try docker image inspect <image id> to get information about this image, and docker image history <image id> to see the layers it includes.
To list local images, use docker image ls.
We can also use docker images, the older form of the same command, which is an alias rather than a deprecated command.
The image we have recently built shows up in the first line. We haven't tagged our image during the build process; we will talk about tagging images later in this section.
To download a particular image, or set of images, use docker pull:
As we mentioned, Docker images can consist of multiple layers. In the example above, the image consists of two layers.
Use the docker images command to locate the ID of the images you want to remove. When you’ve located the images you want to delete, you can pass their ID or tag to docker rmi:
For example, let's remove the debian image:
You cannot remove an image that is used by a stopped container. You can pass --force to remove the image anyway, but this does not remove the stopped container itself, so it is usually cleaner to delete the container first with docker rm.
Docker images consist of multiple layers. Dangling images are layers that have no relationship to any tagged images. They no longer serve a purpose and consume disk space. They can be located by adding the filter flag, -f, with a value of dangling=true to the docker images command. When you’re sure you want to delete them, you can use the docker image prune command:
Note: If you build an image without tagging it, the image will appear on the list of dangling images because it has no association with a tagged image. You can avoid this situation by providing a tag when you build, and you can retroactively tag an image with the docker tag command.
As we haven't tagged our image, let's tag it before purging dangling images.
docker image prune -a removes all images that do not have at least one container associated with them. The good news is that images being used by containers won't be deleted.
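Put together, the cleanup could be sketched as below. The guard is an addition for this example so the snippet is harmless where no Docker daemon is reachable; docker image prune asks for confirmation before deleting anything.

```shell
filter="dangling=true"

if docker info >/dev/null 2>&1; then
  docker images -f "$filter"   # list dangling images only
  docker image prune           # prompts, then removes dangling images
else
  echo "Docker daemon not reachable; nothing to list"
fi
```

Adding -a to the prune command widens it to every image that no container uses, so it is worth listing images first.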
In simple words, Docker tags add useful information about a specific image version or variant. They are aliases to the ID of your image, which often looks like this: f1477ec11d12. A tag is just a way of referring to your image. A good analogy is how Git tags refer to a particular commit in your history.
The two most common cases where tags come into play are:
1. When building an image, we pass the -t option to the build command, for example docker build -t image_name:tag_name .
It tells the Docker daemon to use the Dockerfile present in the current directory (that’s what the . at the end does), build the image, and give it the specified tag.
If you need to push your image to a registry, use docker build -t username/image_name:tag_name .
2. Explicitly tagging an image through the tag command, docker tag SOURCE_IMAGE[:TAG] TARGET_IMAGE[:TAG]:
It just creates an alias (a reference) by the name of the TARGET_IMAGE that refers to the SOURCE_IMAGE. That’s all it does. It’s like assigning an existing image another name to refer to it. Notice how the tag is specified as optional here as well, by the [:TAG].
Alright, now let’s uncover what happens when you don’t specify a tag while tagging an image. This is where the latest tag comes into the picture. Whenever an image is tagged without an explicit tag, it’s given the latest tag by default. It’s an unfortunate naming choice that causes a lot of confusion. But I like to think of it as the default tag that’s given to images when you don’t specify one.
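The defaulting rule can be imitated with a small shell function. This is only a sketch of the naming rule, not Docker's actual reference parser; normalize_ref is a hypothetical helper, and the image names are examples.

```shell
# Append ":latest" when the last path component of a reference has no tag.
# Looking only at the last component keeps a registry port such as
# localhost:5000 from being mistaken for a tag.
normalize_ref() {
  ref="$1"
  last="${ref##*/}"   # e.g. "myapp" or "myapp:1.0"
  case "$last" in
    *:*) printf '%s\n' "$ref" ;;
    *)   printf '%s:latest\n' "$ref" ;;
  esac
}

normalize_ref "myapp"
normalize_ref "myapp:1.0"
normalize_ref "localhost:5000/myapp"
```

So docker tag myapp username/myapp behaves like docker tag myapp:latest username/myapp:latest, which is why latest says nothing about how recent an image actually is.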
A Docker registry is a stateless, highly scalable application that stores and lets you distribute Docker images. Registries can be local (private) or cloud-based (private or public).
Examples of Docker Registries:
Docker Registry (local open-source registry)
Docker Trusted Registry(DTR) [Available in Docker Enterprise Edition]
Docker Hub [Default Registry]
The first thing to remember is that any time you are going to use a registry, you first need to log in to that registry with docker login.
You need to create an account on Docker Hub first.
If we had a local Docker registry, the command would be docker login localhost:5000.
When you have finished your work, log out with docker logout.
Use docker push to push an image or a repository to a registry.
Whether you are using a public or a private registry, you can search that registry to find the image that you need. That is what the docker search command does for us:
docker search has a very useful filtering option; you can filter output based on these conditions:
- stars=<numberOfStar>
- is-automated=(true|false)
- is-official=(true|false)
For example, docker search --filter is-official=true --filter stars=90 ubuntu searches for official ubuntu images which have at least 90 stars. The --limit flag limits the maximum number of results returned by a search.
Pushing to Docker Hub is great, but it does have some disadvantages:
Bandwidth - many ISPs have much lower upload bandwidth than download bandwidth.
Unless you’re paying extra for the private repositories, pushing equals publishing.
When working on some clusters, each time you launch a job that uses a Docker container it pulls the container from Docker Hub, and if you are running many jobs, this can be really slow.
One solution to these problems is to save the Docker image locally as a tar archive, which you can then easily load back into an image when needed.
To save a Docker image after you have pulled, committed, or built it, use the docker save command. For example, let's save a local copy of the myapp Docker image we made:
Docker supports two different methods for saving container images to a single tarball:
docker save - saves a non-running container image to a file
docker export - saves a container’s running or paused instance to a file
If we want to load that Docker image from the archived tar file in the future, we can use the docker load command:
Loading an image using the load command creates a new image including its history.
Importing a container as an image using the import command creates a new image excluding the history, which results in a smaller image size compared to loading an image.
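A sketch of the save and load round trip is shown below. The archive file name is an assumption, and the guard is an addition for this example so that nothing runs unless a Docker daemon is reachable and the myapp image exists locally.

```shell
image="myapp"         # image name used in the text
archive="myapp.tar"   # hypothetical archive file name

if docker info >/dev/null 2>&1 && docker image inspect "$image" >/dev/null 2>&1; then
  docker save -o "$archive" "$image"   # image with layers and history -> tarball
  docker load -i "$archive"            # tarball -> image, e.g. on another host
else
  echo "skipping: Docker daemon or image '$image' not available"
fi

# The container-based counterpart uses export and import instead:
#   docker export -o container.tar <container>
#   docker import container.tar <name>:<tag>
```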
When working with Docker images and containers, one of the basic features is committing changes to a Docker image. When you commit changes, you essentially create a new image with an additional layer that modifies the base image layer.
For example, let's run a container based on the nginx image:
Now let's attach to it and modify index.html:
Now press Ctrl+P and then Ctrl+Q to detach from the container without stopping it.
Finally, let's create a new image from this running container using the commit command:
And see the result:
Now we can run as many containers as we like from this image.
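The whole workflow can be sketched non-interactively as follows. Instead of attaching to the container and detaching with the key sequence, this version changes the file with docker exec; the container name, the new image name, and the page content are assumptions for illustration, and the guard keeps the snippet from running without a reachable Docker daemon.

```shell
container="web"           # hypothetical container name
new_image="my-nginx:v1"   # hypothetical name for the committed image

if docker info >/dev/null 2>&1; then
  # Start a container from the nginx image in the background.
  docker run -d --name "$container" nginx
  # Change the default page inside the running container.
  docker exec "$container" sh -c 'echo "<h1>Hello</h1>" > /usr/share/nginx/html/index.html'
  # Turn the container's current state into a new image.
  docker commit "$container" "$new_image"
  # Confirm that the new image exists.
  docker image ls "$new_image"
else
  echo "Docker daemon not reachable; skipping commit example"
fi
```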