Last week I showed you how to build a production-ready REST API using Python, FastAPI and Docker. And I promised you that today we would deploy it and make it accessible to the outside world.
However, there is one more thing I would like to show you before we move on to deployment.
I want to show you how to write a better Dockerfile using Docker caching and multi-stage builds, so you build faster ⚡ and lighter 🪶 Docker images.
Let’s start!
You can find all the source code in this repository. Give it a star ⭐ on GitHub to support my work.
$ docker run -p 8090:8000 taxi-data-api-python:naive-build
However, this naive build has 2 problems.
Problem #1
Every time you update your Python code in src/ and rebuild the image, the Docker engine re-installs all the Python dependencies from your pyproject.toml file, even though you have not added a single new Python package!
And this takes A LOT of your precious time.
Solution 🔧
Dockerfile instructions are layered as a stack, with each layer adding one more step to your build process.
Docker caches these layers: a layer is rebuilt only if that layer, or any layer before it in the Dockerfile, has changed.
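To make this caching rule concrete, here is a tiny, hypothetical Python sketch (not part of the project) that mimics how Docker derives a cache key per layer by chaining hashes. Change one instruction and every layer after it gets a new key, while every layer before it keeps its old (cached) key:

```python
import hashlib

def layer_ids(steps: list[str]) -> list[str]:
    """Mimic Docker's layer cache keys: each layer's ID depends on
    its own instruction AND every instruction before it."""
    ids = []
    h = hashlib.sha256()
    for step in steps:
        h.update(step.encode())    # fold this instruction into the chain
        ids.append(h.hexdigest())  # cache key for the layer at this position
    return ids

before = layer_ids([
    "FROM python:3.10-slim",
    "COPY pyproject.toml poetry.lock /app/",
    "RUN poetry install",
    "COPY . /app",
])
after = layer_ids([
    "FROM python:3.10-slim",
    "COPY pyproject.toml poetry.lock /app/",
    "RUN poetry install",
    "COPY . /app  # source code changed",  # only this step differs
])

# The first three layers keep the same key (cache hit);
# only the final layer needs to be rebuilt.
```

This is exactly why the order of instructions matters: put the things that change often (your source code) as late as possible.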
In the naive Dockerfile, the source code is copied (layer 6) before the Python dependencies are installed (layer 7). So when you make changes to your source code, layer 6 changes, and:
- all previous layers are cached and don’t need to be rebuilt
- all layers after it are rebuilt, including layer 7, where you re-install the Python dependencies from an unchanged pyproject.toml file
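The exact naive Dockerfile isn’t shown here, but based on the layer numbers above, it presumably ordered the steps something like this (a sketch, not the actual file):

```dockerfile
FROM python:3.10-slim            # layer 1
ENV PYTHONDONTWRITEBYTECODE=1    # layer 2
ENV PYTHONUNBUFFERED=1           # layer 3
WORKDIR /app                     # layer 4
RUN pip install poetry           # layer 5
COPY . /app                      # layer 6: source code (changes often!)
RUN poetry config virtualenvs.create false \
    && poetry install --no-root  # layer 7: re-runs on EVERY code change
```

Because layer 6 changes on every edit, the expensive dependency install in layer 7 can never hit the cache.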
The fix is to reorder the instructions: copy pyproject.toml and poetry.lock first, install the dependencies, and only then copy the source code. Now, whenever you work on your Python code without installing new dependencies and rebuild the image, you won’t waste time re-installing the exact same dependencies.
BOOM!
Problem #2
Your Dockerfile now builds fast. However, the final image is still pretty big (almost 1 GB).
The solution is a multi-stage build: a heavier stage builds the dependencies, and a slimmer stage runs the app. In our case, the Dockerfile has one build stage that installs all the necessary Python packages
# Stage 1: Build stage
FROM python:3.10-slim AS builder
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
# Set the working directory
WORKDIR /app
# Install Poetry
RUN pip install poetry
# Copy only the pyproject.toml and poetry.lock files to leverage Docker cache
COPY pyproject.toml poetry.lock README.md /app/
# Install only the runtime dependencies into the system site-packages.
# Note: the --no-dev flag was removed in Poetry 2.x; --only main is the
# current equivalent.
RUN poetry config virtualenvs.create false \
&& poetry install --only main --no-root
# Copy the rest of the application code
COPY . /app
and a final runtime stage that copies the installed dependencies from the build stage and launches the REST API server
# Stage 2: Runtime stage
FROM python:3.10-slim
# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
# Set the working directory
WORKDIR /app
# Copy only the necessary files from the builder stage
COPY --from=builder /usr/local/lib/python3.10/site-packages /usr/local/lib/python3.10/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin
COPY --from=builder /app /app
# Expose the port the app runs on
EXPOSE 8000
# Run the application
CMD ["uvicorn", "src.api:app", "--host", "0.0.0.0", "--port", "8000"]
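One related trick worth mentioning (a common practice, not something from the setup above): a .dockerignore file keeps local junk out of the build context, so the COPY . /app layer stays small and is not invalidated by files your image doesn’t need. A hypothetical example — adjust the entries to your repo:

```
# .dockerignore
.git
.venv
__pycache__/
*.pyc
.pytest_cache
Dockerfile
```

This helps both problems at once: a smaller COPY layer means a smaller image, and fewer spurious cache invalidations mean faster rebuilds.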
Next week I will (finally) show you how to deploy this API and make it accessible to the whole world, so you can escape localhost hell and start shipping ML software that others can actually use.
Talk to you soon,
Enjoy the weekend
Pau
Wanna learn more Real World ML?
Subscribe to my weekly newsletter
Every Saturday
For FREE
Join 22k+ ML engineers