In the last article I stepped through creating a Docker image for an existing application. It's a decent attempt at a Docker container, but there are some things that can be improved. Here's the Dockerfile again:
```dockerfile
FROM python:3.6
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt
ENV PYTHONUNBUFFERED 1
ENV PYTHONDONTWRITEBYTECODE 1
EXPOSE 5000
ENTRYPOINT ["entrypoint.sh"]
CMD ["dev"]
```
Mistake 1: Not using Docker's layer caching
The first improvement is to copy requirements.txt and install it before copying the rest of the application in:
```dockerfile
FROM python:3.6
WORKDIR /app
COPY ./requirements.txt /app/requirements.txt
RUN pip install -r requirements.txt
COPY . /app
```
With this change, Docker can cache the result of the pip installation and only rebuild from the COPY . /app layer onward when the application code changes.
Mistake 2: Coarse grained volume mounting
This one doesn't affect the container as is, but very well could. Consider if the application was modified to be installable inside the container and we added
RUN pip install -e .
When the container is built, an egg would be generated in /app. When we go to run the image through the docker-compose file with the .:/app volume mount, that egg disappears. Whoops.
The solution to this is to mount only exactly what you'll need into the container and no more, even if that means you end up with a somewhat lengthy volumes section (though a very long one is actually telling you that you're mounting too much). For me, I tend to mount only the actual source and the tests for the application. If I'm fiddling with things like entrypoints or tox configuration, I'll mount those files as needed and remove them when I'm done.
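As a sketch, a fine-grained volumes section in docker-compose.yml might look like this (the isthatwho package and tests directory come from this project's layout; the service name is illustrative):

```yaml
services:
  web:
    build: .
    volumes:
      # Mount only the source package and the tests, not the whole
      # project root, so build artifacts generated inside /app (like
      # an egg) aren't shadowed by the host directory.
      - ./isthatwho:/app/isthatwho
      - ./tests:/app/tests
```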
Mistake 3: Packaging too much into the image
Again, it doesn't affect this image but it certainly is something to be mindful of. With COPY . /app we bring everything into the image, including stuff we're not going to need at runtime. Things like:
- Development tools
- Build tools
If you have compiled dependencies, you're probably leaving things like development headers lying around as well. There are several solutions to this:
- Fine grained COPY commands
- A .dockerignore file will let you selectively exclude things from being sent into the image; think of it like .gitignore for Docker
- Employing a builder pattern that installs all the needed development headers and uses an entrypoint and volume to create a wheelhouse for you to install from -- Glyph has an article on a Python builder image
- A cleanup script that removes everything you don't need at runtime
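As an example of the .dockerignore approach, a file for this project might look like the following (the specific entries beyond the project's known files are illustrative):

```
# .dockerignore -- keep development cruft out of the build context
.git
tests/
docker-compose.yml
*.pyc
__pycache__/
```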
None of these are mutually exclusive and you should decide which works for your application and how. If this application had tests, it might have a directory structure like:
```
.
├── docker-compose.yml
├── Dockerfile
├── entrypoint.sh
├── isthatwho
├── README.md
├── requirements.txt
├── setup.py
└── tests
```
We might change the command that copies the source code to COPY ./isthatwho /app/isthatwho to avoid picking up the tests, and add commands to pick up setup.py, requirements.txt, and entrypoint.sh, leaving everything else outside the image.
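Those fine-grained COPY commands might be sketched like this, based on the layout above:

```dockerfile
FROM python:3.6
WORKDIR /app
# Install dependencies first so this layer stays cached
COPY ./requirements.txt /app/requirements.txt
RUN pip install -r requirements.txt
# Copy only what the application needs at runtime
COPY ./setup.py /app/setup.py
COPY ./entrypoint.sh /app/entrypoint.sh
COPY ./isthatwho /app/isthatwho
```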
If you need to do stuff like run tests on a build server, you can either mount all the needed files into the container and run them or build a child image that includes them. The former works well if you're only running with Python's builtin unittest module:
```shell
docker run -v "$(pwd)"/tests:/app/tests itw python -m unittest discover /app/tests
```
Whereas the child image works better if you need external dependencies like pytest, coverage, tox, etc. The catch is that if you build and tag images based on branch/commit/build number, you run into some interesting situations, such as needing to generate a Dockerfile per build. I've not put a whole lot of practice into this so I don't have good suggestions here, but I'd be surprised if I'm the first to think about it.
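A child test image might be sketched like this, assuming the application image was tagged itw as in the command above (the test dependencies are just examples):

```dockerfile
FROM itw
# Test-only dependencies stay out of the production image
RUN pip install pytest coverage
COPY ./tests /app/tests
CMD ["pytest", "/app/tests"]
```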
Mistake 4: Doing too much in one container
Lately, I've noticed a trend of one purpose per container rather than a strict one process per container. I'm of two minds about this. On one hand, it's great to just shove a container onto a server and know it has everything it needs to run without linking up to other hosts. On the other, it can be a huge pain for scaling services.
If you have a container that runs uwsgi and nginx together, then you're fucked when you want to scale up the application without adding more nginx workers. On top of that, you're left needing something to orchestrate all of those processes, so on top of uwsgi and nginx, you need an init system as well. And as you add more and more of these services to the single container, it bloats and bloats and soon you're left wondering "Why didn't I just use a VM in the first place?"
Instead, what you should do is stick to running as few processes inside a container as possible. If you have separate containers for uwsgi, celery, redis, and nginx, then you can scale each independently as needed, your images stay small, and there's less ops overhead because you don't need an init system.
I'll fully admit that this does complicate getting nginx and uwsgi to communicate, but Docker's networking makes that a solvable problem.
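One way to sketch that split in docker-compose.yml (service names and the port are illustrative; nginx reaches uwsgi by service name over Docker's network):

```yaml
services:
  app:
    build: .
    # uwsgi listens on 5000 inside the network; scale this service
    # independently with `docker-compose scale app=3`
  nginx:
    image: nginx
    ports:
      - "80:80"
    # nginx.conf would proxy_pass to http://app:5000
  redis:
    image: redis
```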
Mistake 5: Staying as root
This is probably the biggest issue I commonly see. At build and start up time, I can see the value in having the current user be root. You might need special permissions to run some commands.
However, when it comes time to run the process, staying root is absolutely ridiculous, mostly because root in the container is (barring user namespace remapping) the same root as on the host OS. There are lots of ways for an attacker to go poking around if you don't drop privileges. The most egregious case is not dropping privileges in things like nginx or WSGI servers, because these come with configurable ways to shed those privileges for you.
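For instance, both nginx and uWSGI can shed privileges with a single configuration line (the nobody/nogroup names assume those accounts exist in the image):

```
# nginx.conf -- the master starts as root, but workers run as nobody
user nobody;

# uwsgi.ini -- uWSGI switches to this uid/gid after binding
[uwsgi]
uid = nobody
gid = nogroup
```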
If your base image has the nobody user and group, I recommend taking advantage of that at runtime. If not, adding them is as easy as including this in your Dockerfile:
```dockerfile
RUN groupadd -r -g 1000 nobody && useradd -r -u 1000 -g nobody nobody
```
That creates a "system" group and account (essentially, ones without a login shell or home directory) and gives them a known GID and UID in case you need them.
With this, you can safely run your processes as just some random user with no special privileges.
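In the Dockerfile, dropping privileges is as simple as switching users after the root-only setup steps are done (a sketch reusing this article's entrypoint):

```dockerfile
# Everything above this line (pip install, chown, etc.) runs as root
USER nobody
# Everything from here on, including the running process, is unprivileged
ENTRYPOINT ["entrypoint.sh"]
CMD ["dev"]
```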
I spilled my brains, spill some of yours.