Docker Basics: Getting Started with Containerization

Docker is a platform for developing, shipping, and running applications in containers. Containers allow developers to package an application with all its dependencies, libraries, and configuration files, ensuring it runs consistently across any environment. Unlike virtual machines that virtualize hardware, containers virtualize the operating system, sharing the host kernel while maintaining isolation. Docker has become the industry standard for containerization, enabling microservices architecture, CI/CD pipelines, and cloud-native development.

To understand Docker basics properly, it helps to be familiar with containerization concepts, Linux fundamentals, and virtualization.

Docker architecture overview:
┌─────────────────────────────────────────────────────────────────────────┐
│                         Docker Architecture                               │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│   Docker Client                    Docker Host                          │
│   ┌─────────────┐                 ┌─────────────────────────────────┐   │
│   │ docker build│                 │        Docker Daemon             │   │
│   │ docker pull │                 │      (dockerd)                   │   │
│   │ docker run  │                 └───────────────┬─────────────────┘   │
│   │ docker push │                                 │                     │
│   └──────┬──────┘                                 │                     │
│          │ REST API                               │                     │
│          └───────────────────────────────────────►│                     │
│                                                    ▼                     │
│                                          ┌─────────────────┐            │
│                                          │   Container     │            │
│                                          │   Container     │            │
│                                          │   Container     │            │
│                                          └─────────────────┘            │
│                                                    │                     │
│                                                    ▼                     │
│   Docker Registry                  ┌─────────────────────────────────┐  │
│   ┌─────────────────┐              │         Shared Kernel            │  │
│   │ Docker Hub      │◄────────────►│         (Linux)                  │  │
│   │ Private Registry│              └─────────────────────────────────┘  │
│   └─────────────────┘                                                   │
│                                                                          │
│   Key Components:                                                        │
│   • Docker Daemon – Manages containers, images, networks, volumes       │
│   • Docker Client – CLI tool for interacting with daemon                │
│   • Docker Registry – Stores and distributes images                     │
│   • Docker Objects – Images, containers, networks, volumes              │
│                                                                          │
└─────────────────────────────────────────────────────────────────────────┘

What Is Docker?

Docker is an open-source platform that automates the deployment of applications inside lightweight, portable containers. A container packages an application with all its dependencies, libraries, and configuration files, isolating it from the host system and other containers. Docker uses operating system-level virtualization (namespaces and cgroups) to provide isolation, making containers more lightweight than virtual machines.

  • Images: Read-only templates containing application code, runtime, libraries, and dependencies. Images are built from Dockerfiles and stored in registries.
  • Containers: Runnable instances of images. Containers are isolated processes with their own filesystem, network, and process space.
  • Dockerfile: Text file with instructions to build a Docker image. Defines base image, commands, files to copy, and entry point.
  • Registry: Storage and distribution system for Docker images. Docker Hub is the default public registry.
  • Volume: Persistent storage mechanism for containers, surviving container restarts and removal.

Why Docker Matters

Docker solves the classic "it works on my machine" problem by standardizing the environment across development, testing, and production.

  • Environment Consistency: Same container runs identically on developer laptop, CI server, and production. Eliminates environment-specific bugs. Reduces "works on my machine" excuses.
  • Isolation and Security: Containers isolate applications from each other and host system. Reduced dependency conflicts (different Python versions, library versions). Security boundaries (process, network, filesystem).
  • Portability: A Linux container image runs on any host with a compatible container runtime; on Windows and macOS, Docker Desktop supplies the Linux kernel through a lightweight VM. Cloud providers support containers (AWS ECS, Azure ACI, Google Cloud Run), which reduces vendor lock-in.
  • Resource Efficiency: Containers share host kernel, no per-container OS overhead. Start faster (milliseconds to seconds vs seconds to minutes for VMs). Higher density (more containers per server).
  • CI/CD Integration: Build once, deploy anywhere. Consistent testing environment. Easy rollback (previous image version).
  • Microservices Ready: Each service runs in own container. Independent deployment, scaling, and updates. Easy to mix languages and versions per service.

Container vs Virtual Machine comparison:
Aspect                  Container                         Virtual Machine
─────────────────────────────────────────────────────────────────────────────
Isolation Level         Process-level                     Hardware-level
OS                      Shared host kernel                Full OS per VM
Startup Time            Milliseconds                      Seconds to minutes
Disk Size               MB (image)                        GB (OS image)
Memory Overhead         Low (MB)                          High (GB)
Performance             Near-native                       Some overhead
Security                Moderate (shared kernel)          Strong (hardware isolation)
Use Cases               Microservices, CI/CD              Legacy apps, full isolation

Essential Docker Components

Images

Images are read-only templates for creating containers. Images are built from Dockerfiles or pulled from registries. They are layered (Union File System), so layers shared between images are stored once and reused. Images are immutable; to change one, build a new image.

Containers

Containers are runnable instances of images. They are isolated processes with their own filesystem, network, and PID space. Containers can be started, stopped, restarted, and removed. Data in a container is ephemeral unless stored in a volume.

Dockerfile

A Dockerfile is a text file with instructions to build an image. Instructions include FROM (base image), RUN (execute commands at build time), COPY/ADD (copy files into the image), EXPOSE (document the port the app listens on), CMD/ENTRYPOINT (default command).

Sample Dockerfile (Node.js app):
# Base image
FROM node:18-alpine

# Set working directory
WORKDIR /app

# Copy package files
COPY package*.json ./

# Install production dependencies only (--omit=dev replaces the deprecated --only=production)
RUN npm ci --omit=dev

# Copy application code
COPY . .

# Expose port
EXPOSE 3000

# Set user (non-root)
USER node

# Start application
CMD ["node", "server.js"]

Volumes

Volumes persist data beyond the container lifecycle, share data between containers, and provide storage for databases, uploads, and logs. Docker offers two main options: named volumes (created and managed by Docker) and bind mounts (a host directory mapped into the container).

Dockerfile vs Docker Compose:
Dockerfile (single container)       Docker Compose (multi-container)
─────────────────────────────────────────────────────────────────────────────
FROM node:18                        services:
WORKDIR /app                          web:
COPY . .                                build: .
RUN npm install                         ports:
CMD ["node", "app.js"]                    - "3000:3000"
                                        depends_on:
                                          - db
                                      db:
                                        image: postgres
                                        environment:
                                          POSTGRES_PASSWORD: secret

Essential Docker Commands

Image management commands:
# List images
docker images

# Build image from Dockerfile
docker build -t myapp:latest .

# Pull image from registry
docker pull nginx:alpine

# Push image to registry
docker push myusername/myapp:latest

# Remove image
docker rmi myapp:latest

# Tag image
docker tag myapp:latest myapp:1.0.0

# Save image to tar file
docker save -o myapp.tar myapp:latest

# Load image from tar file
docker load -i myapp.tar

Container management commands:
# Run container
docker run -d -p 8080:80 --name webserver nginx

# List running containers
docker ps

# List all containers (including stopped)
docker ps -a

# Stop container
docker stop webserver

# Start stopped container
docker start webserver

# Restart container
docker restart webserver

# Remove container
docker rm webserver

# Remove container forcefully
docker rm -f webserver

# Execute command in running container
docker exec -it webserver /bin/bash

# View logs
docker logs webserver

# Follow logs (real-time)
docker logs -f webserver

Volume and network commands:
# Create volume
docker volume create mydata

# List volumes
docker volume ls

# Remove volume
docker volume rm mydata

# Run container with volume
docker run -v mydata:/data -d nginx

# Run with bind mount (host directory)
docker run -v /host/path:/container/path -d nginx

# Create network
docker network create mynet

# List networks
docker network ls

# Run container in network
docker run --network mynet --name app1 -d nginx

# Connect an existing container (here assumed to be named app2) to a network
docker network connect mynet app2

Docker Anti-Patterns

  • Large Images (GB size): Using heavy base images (ubuntu instead of alpine). Installing unnecessary packages (build tools in production). Not cleaning package manager cache. Use minimal base images (alpine, slim), multi-stage builds, clean caches in same RUN layer.
  • Running as Root: Containers running with root user inside container. Security risk (a container breakout could lead to host compromise). Create and switch to non-root user in Dockerfile.
  • Hardcoding Secrets in Images: API keys, passwords baked into image. Anyone with image access can extract secrets. Supply secrets at runtime instead, preferably through secrets management (Docker secrets, Kubernetes secrets) rather than plain environment variables.
  • No Health Checks: Container considered running even if the app inside is hung or unresponsive. Orchestrator cannot detect or replace the unhealthy container. Add HEALTHCHECK instruction in Dockerfile.
  • Storing Data in Container (Ephemeral): Data in the container's writable layer is lost when the container is removed. Use volumes for persistent data. Use bind mounts for development (hot reload).
  • Using Latest Tag in Production: "latest" tag changes meaning over time. Non-reproducible deployments, difficult rollbacks. Use explicit version tags (1.0.0, git commit hash).
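
Several of the fixes above can be combined in one Dockerfile. The following is only a sketch under stated assumptions: a hypothetical Python service whose app.py listens on port 8000 and serves a /health endpoint, with a requirements.txt next to it. None of these names come from a real project.

```dockerfile
# Pinned, slim base image instead of a full distribution or "latest"
FROM python:3.12-slim

WORKDIR /app

# Install curl (used by the health check) and clean the apt cache
# in the same RUN layer, so the cache never lands in any image layer
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Dependencies first, so this layer is cached until requirements.txt changes
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

# Application code (hypothetical)
COPY . .

# Create an unprivileged user and switch to it
RUN useradd --create-home --shell /usr/sbin/nologin appuser
USER appuser

EXPOSE 8000

# Mark the container unhealthy when the assumed /health endpoint stops answering
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD curl -fsS http://localhost:8000/health || exit 1

CMD ["python", "app.py"]
```

No secret appears anywhere in this file; in this sketch any credentials would be supplied when the container starts.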

Docker security checklist:
Image Security:
□ Use minimal base image (alpine, distroless)
□ Run as non-root user (USER directive)
□ Keep base images updated (regular rebuilds)
□ Scan images for vulnerabilities (docker scout cves, Trivy)

Configuration:
□ No secrets in Dockerfile (inject at runtime via a secrets manager)
□ Set resource limits (--memory, --cpus)
□ Use read-only root filesystem (--read-only)
□ Add HEALTHCHECK instruction

Runtime:
□ Avoid privileged containers (--privileged)
□ Drop unnecessary capabilities (--cap-drop=ALL)
□ Use user namespace remapping
□ Enable seccomp / AppArmor profiles

Docker Best Practices

  • Use Small Base Images: Alpine Linux (~5MB) vs Ubuntu (~70MB). Distroless (Google, no package manager) for production. Specific language variants (node:alpine, python:alpine).
  • Multi-Stage Builds: Build dependencies in first stage, copy only artifacts to final stage. Reduces image size (no build tools, source code). Example: build Go binary in stage1, copy to alpine in stage2.
  • Leverage Layer Caching: Order Dockerfile layers from least to most frequently changing. Copy package.json before source code (caching npm install). Combine RUN commands that don't change often.
  • Resource Limits: Prevent containers from consuming all host resources. Set memory limit (--memory=512m). Set CPU limit (--cpus=0.5). Strongly recommended in production, where the orchestrator enforces the configured limits.
  • Health Checks: Define HEALTHCHECK in Dockerfile (curl or custom script). Docker itself only marks a failing container as unhealthy; Docker Swarm replaces unhealthy tasks, while Kubernetes ignores HEALTHCHECK and relies on its own liveness/readiness probes.
  • Manage Secrets Properly: Never set secrets with the ENV instruction (baked into the image and visible in docker inspect and docker history). Use Docker secrets in Swarm mode or Kubernetes secrets. Use external secrets manager (Vault, AWS Secrets Manager).
  • Volume Management: Use named volumes for production (managed by Docker). Use bind mounts only for development (hot reload). Backup volumes regularly.

Multi-stage build example (Go):
# Stage 1: Build
FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o myapp .

# Stage 2: Production (pinned base tag, non-root user)
FROM alpine:3.19
RUN apk --no-cache add ca-certificates \
    && adduser -D -u 10001 appuser
WORKDIR /app
COPY --from=builder /app/myapp .
USER appuser
EXPOSE 8080
CMD ["./myapp"]

# Result: small image (no Go compiler, source code)

Layer ordering best practices:
# Optimized Dockerfile (caches dependencies)
FROM node:18-alpine

WORKDIR /app

# Copy package files first (rarely change)
COPY package*.json ./
RUN npm ci --omit=dev

# Copy source code last (frequently changes)
COPY . .

# Run as non-root
USER node

CMD ["node", "index.js"]

Frequently Asked Questions

  1. What is the difference between Docker image and container?
    Image is read-only template (like a class), container is running instance of image (like an object). Image is stored, container is executed. You can run many containers from same image.
  2. Can Docker run on Windows or macOS?
    Yes. Docker Desktop runs Linux containers on Windows (using WSL 2 or Hyper-V) and on macOS (using a lightweight Linux VM). Native Windows containers are also available on Windows. On Linux, Docker runs natively.
  3. What is the difference between CMD and ENTRYPOINT?
    CMD provides the default command or arguments, which are replaced by any arguments given to docker run. ENTRYPOINT sets the base executable: docker run arguments are appended to it, and it can only be replaced with the --entrypoint flag. Often combined: ENTRYPOINT for executable, CMD for default arguments.
  4. How do I reduce Docker image size?
    Use Alpine base image, multi-stage builds, clean package manager cache, combine RUN commands, avoid installing build tools in final image, use .dockerignore to exclude unnecessary files.
  5. What is the difference between Docker and Kubernetes?
    Docker builds and runs containers, typically on a single host. Kubernetes orchestrates containers across clusters of hosts. They are complementary, not competing: images built with Docker run on Kubernetes, which uses runtimes such as containerd (direct Docker Engine support, dockershim, was removed in Kubernetes 1.24).
  6. What should I learn next after Docker basics?
    After mastering Docker basics, explore Docker Compose for multi-container apps, advanced containerization concepts, Kubernetes for orchestration, Dockerfile optimization techniques, and container security best practices.
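
The CMD/ENTRYPOINT combination from question 3 can be illustrated with a very small image. This is a minimal sketch, assuming it is built under the hypothetical tag pinger; the choice of ping and its arguments is purely for demonstration.

```dockerfile
# Small pinned base image; BusyBox on Alpine provides ping
FROM alpine:3.19

# ENTRYPOINT fixes the executable that always runs
ENTRYPOINT ["ping"]

# CMD supplies default arguments, replaced by anything passed to docker run
CMD ["-c", "3", "localhost"]
```

With this image, docker run pinger executes ping -c 3 localhost, while docker run pinger -c 1 example.com keeps ping but swaps in the new arguments. Replacing the executable itself needs the explicit flag, for example docker run -it --entrypoint sh pinger.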