Dockerfile Deep Dive
This chapter explains Dockerfile syntax, instructions, and best practices so that you can write efficient, secure Dockerfiles for building custom images.
Dockerfile Basics
What is a Dockerfile?
A Dockerfile is a text file containing a series of instructions that automate the building of a Docker image. Instructions that change the filesystem (RUN, COPY, and ADD) each add a new layer to the image; most other instructions only record metadata.
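As a rough sketch of how instructions map to layers, the annotated Dockerfile below marks which lines add filesystem content and which only record metadata. The base image, the app.sh script, and the paths are placeholders chosen for illustration.

```dockerfile
# Illustrative sketch; the base image and file names are placeholders.
# FROM starts from the layers already present in the base image.
FROM alpine:3.14
# LABEL and ENV only record metadata in the image configuration.
LABEL maintainer="your-email@example.com"
ENV APP_HOME=/app
# WORKDIR creates the directory if it does not exist yet.
WORKDIR /app
# RUN and COPY change the filesystem, so each one adds a layer.
RUN apk add --no-cache ca-certificates
COPY app.sh ./
# CMD is metadata: it records the default command for containers.
CMD ["sh", "./app.sh"]
```

After building such an image, docker history followed by the image name lists one row per instruction; metadata-only steps normally show a size of 0B.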
Basic Structure of a Dockerfile
dockerfile
# Comment
FROM base_image:tag
LABEL maintainer="your-email@example.com"
RUN command
COPY source destination
WORKDIR /app
EXPOSE 8080
CMD ["executable", "param1", "param2"]
Build Context
The build context is the set of files and directories that the docker build command sends to the Docker daemon:
bash
# Current directory as build context
docker build -t myapp:v1.0 .
# Specify build context
docker build -t myapp:v1.0 /path/to/context
# Build from Git repository
docker build -t myapp:v1.0 https://github.com/user/repo.git
Dockerfile Instructions Detailed
FROM - Base Image
dockerfile
# Basic usage
FROM ubuntu:20.04
# Use multi-stage builds
FROM node:16 AS builder
FROM nginx:alpine AS runtime
# Use ARG variable
ARG BASE_IMAGE=node:16
FROM ${BASE_IMAGE}
# Specify platform
FROM --platform=linux/amd64 node:16
Best Practices:
- Use specific tags instead of latest
- Prefer official images
- Use lightweight base images (like Alpine)
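A minimal sketch of these three points together, assuming a Node.js application; the tag shown is an example rather than a recommendation, and the digest placeholder is deliberately left unfilled.

```dockerfile
# Illustrative sketch; the tags are examples only.
# Avoid: "latest" can resolve to a different image on every build.
# FROM node:latest
# Prefer: an explicit version of an official image, in its small Alpine variant.
FROM node:16-alpine
# For fully reproducible builds a tag can also be pinned to a digest:
# FROM node:16-alpine@sha256:<digest>
```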
RUN - Execute Commands
dockerfile
# Shell form (recommended for complex commands)
RUN apt-get update && apt-get install -y \
curl \
vim \
&& rm -rf /var/lib/apt/lists/*
# Exec form (recommended for simple commands)
RUN ["apt-get", "update"]
# Multi-line commands
RUN apt-get update \
&& apt-get install -y curl \
&& curl -sL https://deb.nodesource.com/setup_16.x | bash - \
&& apt-get install -y nodejs \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Use heredoc (requires BuildKit; Dockerfile syntax 1.4+)
# set -e makes the script stop at the first failing command
RUN <<EOF
set -e
apt-get update
apt-get install -y curl
apt-get clean
rm -rf /var/lib/apt/lists/*
EOF
Best Practices:
- Combine multiple RUN instructions to reduce layers
- Clean up cache and temporary files in the same layer
- Use && to connect commands so that the build stops at the first failure
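The sketch below contrasts separate RUN instructions with a single combined one, and shows one way to make pipelines fail properly, since && alone does not catch a failure on the left side of a pipe. The package choice and the sample pipeline are illustrative assumptions.

```dockerfile
# Illustrative sketch; the package and the pipeline are examples.
FROM ubuntu:20.04
# Less effective: three layers, and the package index deleted in the
# last step still occupies space in the first layer.
# RUN apt-get update
# RUN apt-get install -y curl
# RUN rm -rf /var/lib/apt/lists/*
# Better: one layer; && stops the chain at the first failing command.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
# Switch the shell to bash with pipefail so that a failure anywhere
# in a pipeline fails the RUN instruction.
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
# Sample pipeline: if curl failed here, the build would stop.
RUN curl --version | head -n 1
```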
COPY and ADD - Copy Files
dockerfile
# COPY basic usage
COPY app.js /usr/src/app/
COPY . /usr/src/app/
# COPY with ownership
COPY --chown=node:node . /usr/src/app/
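# ADD (extracts local tar archives automatically)
ADD app.tar.gz /usr/src/app/
# ADD with a URL downloads the file but does not extract it
ADD https://example.com/file.tar.gz /tmp/
Key Differences:
- COPY only copies files from the build context or from another build stage
- ADD can also extract local tar archives and download files from URLs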
- Prefer COPY unless you need ADD's special features
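A short side-by-side sketch of these behaviors follows. The file names and the example.com URL are the placeholders used in this section and would need to be replaced with real ones for the build to succeed; the vendor directory is an assumed target.

```dockerfile
# Illustrative sketch; file names and the URL are placeholders.
FROM alpine:3.14
WORKDIR /usr/src/app
# COPY: a plain copy from the build context with no extra behavior.
COPY app.js ./
# ADD with a local tar archive: the contents are unpacked into the target.
ADD app.tar.gz ./vendor/
# ADD with a URL: the file is stored as downloaded and is not unpacked.
ADD https://example.com/file.tar.gz /tmp/
# A common alternative for remote archives: download, unpack, and delete
# in one RUN so that the archive itself never persists in a layer.
RUN wget -q -O /tmp/remote.tar.gz https://example.com/file.tar.gz \
    && tar -xzf /tmp/remote.tar.gz -C /usr/src/app \
    && rm /tmp/remote.tar.gz
```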
WORKDIR - Working Directory
dockerfile
# Set working directory
WORKDIR /usr/src/app
# Use with environment variables
ENV APP_HOME=/usr/src/app
WORKDIR $APP_HOME
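# Can be used multiple times; a relative path is resolved against the previous WORKDIR
WORKDIR /usr/src/app
WORKDIR test
# The working directory is now /usr/src/app/test
RUN npm test
ENV - Environment Variables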
dockerfile
# Set environment variables
ENV NODE_ENV=production
ENV PORT=3000
ENV APP_VERSION=1.0.0
# Set multiple variables
ENV NODE_ENV=production \
PORT=3000 \
HOST=0.0.0.0
# Use variables in other instructions
RUN echo "Building for $NODE_ENV"
EXPOSE - Expose Ports
dockerfile
# Document a port (EXPOSE does not publish it; publish with docker run -p)
EXPOSE 8080
# Expose multiple ports
EXPOSE 80 443
# Specify protocol
EXPOSE 80/tcp
EXPOSE 53/udp
CMD and ENTRYPOINT - Container Startup Commands
dockerfile
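# CMD (default command; replaced by arguments passed to docker run)
# Only the last CMD in a Dockerfile takes effect; the lines below are alternatives
CMD ["nginx", "-g", "daemon off;"]
CMD ["node", "app.js"]
CMD echo "Hello World"
# ENTRYPOINT (replaced only with docker run --entrypoint)
# As with CMD, only the last ENTRYPOINT takes effect
ENTRYPOINT ["docker-entrypoint.sh"]
ENTRYPOINT ["node", "app.js"]
# Combining ENTRYPOINT and CMD
ENTRYPOINT ["node"]
CMD ["app.js"]
Key Differences: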
- CMD provides a default command that arguments to docker run replace
- ENTRYPOINT sets the main command, which is replaced only with the --entrypoint flag
- When both are given in exec form, CMD supplies the default arguments to ENTRYPOINT
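To make the interaction concrete, here is a small sketch with the resulting docker run behavior noted in comments. The image name myimage and the scripts app.js and other.js are hypothetical.

```dockerfile
# Illustrative sketch; "myimage", app.js, and other.js are hypothetical.
FROM node:16-alpine
WORKDIR /app
COPY app.js other.js ./
# Fixed part of the command line.
ENTRYPOINT ["node"]
# Default argument, used when docker run is given no arguments.
CMD ["app.js"]
# Assuming the image is built with: docker build -t myimage .
#   docker run myimage                      runs: node app.js
#   docker run myimage other.js             runs: node other.js
#   docker run -it --entrypoint sh myimage  starts sh; the image CMD is dropped
```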
USER - Set User
dockerfile
# Set user by name
USER node
# Set user by ID
USER 1000:1000
# Create user first
RUN groupadd -r appuser && useradd -r -g appuser appuser
USER appuser
VOLUME - Data Volumes
dockerfile
# Create data volume
VOLUME ["/data"]
# Create multiple volumes
VOLUME ["/data", "/logs"]
# Plain-string form (equivalent to the JSON array form)
VOLUME /var/lib/postgresql/data
ARG - Build Arguments
dockerfile
# Define build argument
ARG VERSION=latest
ARG BUILD_NUMBER
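# Use in FROM instruction (an ARG declared before FROM is visible only to FROM)
ARG BASE_IMAGE=ubuntu:20.04
FROM ${BASE_IMAGE}
# Redeclare the argument after FROM to use it inside the build stage
ARG VERSION
RUN echo "Building version ${VERSION}"
# Expose BuildKit's predefined platform argument by declaring it without a value
ARG TARGETPLATFORM
LABEL - Metadata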
dockerfile
# Single label
LABEL maintainer="developer@example.com"
# Multiple labels
LABEL version="1.0" \
description="My application" \
vendor="My Company"
HEALTHCHECK - Health Monitoring
dockerfile
# Basic health check (curl must be installed in the image)
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
CMD curl -f http://localhost/ || exit 1
# Use a client tool that ships with the image (pg_isready in the postgres image)
HEALTHCHECK --interval=1m --timeout=3s \
CMD pg_isready -U postgres || exit 1
# Disable health check
HEALTHCHECK NONE
Multi-stage Builds
Basic Multi-stage Build
dockerfile
# Build stage
FROM node:16-alpine AS builder
WORKDIR /app
COPY package*.json ./
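# Install all dependencies; build tools usually live in devDependencies
RUN npm ci
COPY . .
# Build, then remove devDependencies so that only production modules are copied on
RUN npm run build && npm prune --omit=dev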
# Production stage
FROM node:16-alpine AS production
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./package.json
EXPOSE 3000
CMD ["node", "dist/app.js"]
Advanced Multi-stage Build
dockerfile
# Dependencies stage
FROM node:16-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production && npm cache clean --force
# Build stage (installs all dependencies, since the build usually needs devDependencies)
FROM node:16-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Production stage
FROM node:16-alpine AS runtime
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
WORKDIR /app
COPY --from=builder --chown=appuser:appgroup /app/dist ./dist
COPY --from=deps --chown=appuser:appgroup /app/node_modules ./node_modules
COPY --from=builder --chown=appuser:appgroup /app/package.json ./package.json
USER appuser
EXPOSE 3000
CMD ["node", "dist/app.js"]
Build Optimization
Layer Caching
dockerfile
# Good: Copy only package files first
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Bad: Copy everything first (any source change invalidates the cached npm ci layer)
COPY . .
RUN npm ci
RUN npm run build
Reduce Image Size
dockerfile
# Use Alpine base image
FROM node:16-alpine
# Alpine uses apk rather than apt-get; --no-cache keeps the package index out of the layer
RUN apk add --no-cache build-base
# Use .dockerignore to exclude files
.dockerignore File
dockerignore
# Exclude node_modules
node_modules
# Exclude git files
.git
.gitignore
# Exclude logs
*.log
logs/
# Exclude temp files
tmp/
*.tmp
# Exclude development files
.env.local
.env.development
Security Best Practices
Non-root User
dockerfile
FROM node:16-alpine
# Create non-root user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
# Set ownership
WORKDIR /app
COPY --chown=appuser:appgroup . .
# Switch to non-root user
USER appuser
Minimal Permissions
dockerfile
# Use minimal base image
FROM alpine:3.14
# Install only necessary packages
RUN apk add --no-cache \
ca-certificates \
&& update-ca-certificates
# Remove package manager (a hardening step; done in a separate layer, it does not shrink the image)
RUN apk del --purge apk-tools
Secrets Management
dockerfile
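# Use secrets (requires BuildKit, available since Docker 18.09)
# docker build --secret id=mysecret,src=/local/secret .
FROM alpine
# The secret is mounted only for this RUN and is not stored in the image
# cat is for demonstration only; it prints the secret to the build output
RUN --mount=type=secret,id=mysecret \
cat /run/secrets/mysecret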
# Use build-time arguments for non-sensitive data
ARG APP_VERSION
ENV APP_VERSION=${APP_VERSION}
Build Optimization Commands
bash
# Build with no cache
docker build --no-cache -t myapp .
# Build with specific platform
docker build --platform linux/amd64 -t myapp .
# Build with build arguments
docker build --build-arg VERSION=1.0 -t myapp .
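# Use BuildKit (Docker 18.09+; the default builder since Docker 23.0)
DOCKER_BUILDKIT=1 docker build -t myapp .
# Parallel builds: docker build has no --parallel flag; BuildKit runs independent stages in parallel automatically
Common Patterns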
Node.js Application
dockerfile
FROM node:16-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production && npm cache clean --force
FROM node:16-alpine AS builder
WORKDIR /app
# Install all dependencies here, since the build usually needs devDependencies
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
FROM node:16-alpine AS runtime
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
WORKDIR /app
COPY --from=builder --chown=appuser:appgroup /app/dist ./dist
COPY --from=deps --chown=appuser:appgroup /app/node_modules ./node_modules
USER appuser
EXPOSE 3000
CMD ["node", "dist/app.js"]
Python Application
dockerfile
FROM python:3.9-slim AS base
WORKDIR /app
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
FROM base AS deps
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
FROM base AS runtime
COPY --from=deps /usr/local/lib/python3.9/site-packages /usr/local/lib/python3.9/site-packages
COPY --from=deps /usr/local/bin /usr/local/bin
COPY . .
EXPOSE 8000
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "app:app"]
Chapter Summary
This chapter covered how to write Dockerfiles:
Key Points:
- Basic Instructions: FROM, RUN, COPY, WORKDIR, etc.
- Build Optimization: Layer caching, multi-stage builds
- Security Best Practices: Non-root users, minimal base images
- Advanced Features: Build arguments, health checks
- Common Patterns: Application-specific Dockerfiles
Best Practices:
- Use specific image tags
- Implement multi-stage builds
- Optimize layer caching
- Use .dockerignore
- Run as non-root user
- Keep images small and secure
In the next chapter, we will learn about image management best practices and optimization techniques.