Docker Data Management
This chapter provides an in-depth look at Docker data management, covering data volumes, bind mounts, tmpfs mounts, and data persistence strategies, so that you can manage container data effectively.
Data Management Overview
Characteristics of Container Data
Docker containers are ephemeral by default: data written to a container's writable layer is lost when the container is deleted. To persist and share data, Docker provides several data management mechanisms:
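A quick way to see this in practice (a minimal sketch, assuming Docker and the alpine image are available locally):

```shell
# Data written to the container's writable layer disappears with the container.
docker run --name demo alpine sh -c 'echo hello > /data.txt'
docker rm demo   # /data.txt is gone along with the container

# With a named volume, the data outlives any individual container.
docker volume create demo-vol
docker run --rm -v demo-vol:/data alpine sh -c 'echo hello > /data/hello.txt'
docker run --rm -v demo-vol:/data alpine cat /data/hello.txt   # prints: hello
docker volume rm demo-vol
```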
┌─────────────────────────────────────────────────────────┐
│                     Host Filesystem                     │
├─────────────────────────────────────────────────────────┤
│   Data Volumes    │   Bind Mounts    │  tmpfs Mounts    │
│                   │                  │                  │
│  Docker managed   │  Host path       │  Memory storage  │
│  Good portability │  Good performance│  Temporary data  │
│  High security    │  Host dependent  │  Deleted on stop │
└─────────────────────────────────────────────────────────┘

Data Management Method Comparison
| Feature | Data Volumes | Bind Mounts | tmpfs Mounts |
|---|---|---|---|
| Management | Docker managed | User managed | System managed |
| Storage location | Docker directory | Any host path | Memory |
| Performance | Good | Best | Best |
| Portability | High | Low | N/A |
| Security | High | Medium | High |
| Persistence | Yes | Yes | No |
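The three methods in the table map directly onto the unified `--mount` flag, which names the mount type explicitly (a sketch; the volume name and host path are placeholders):

```shell
# Volume: Docker manages the storage location.
docker run -d --mount type=volume,source=app-data,target=/data nginx

# Bind mount: you manage the host path, which must already exist.
docker run -d --mount type=bind,source=/srv/app,target=/data nginx

# tmpfs: kept in memory only, discarded when the container stops.
docker run -d --mount type=tmpfs,target=/data,tmpfs-size=64m nginx
```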
Data Volumes
Basic Volume Operations
bash
# Create data volume
docker volume create my-volume
# List all data volumes
docker volume ls
# View data volume details
docker volume inspect my-volume
# Delete data volume
docker volume rm my-volume
# Delete all unused data volumes
docker volume prune
# Prune unused data volumes without the confirmation prompt
docker volume prune -f

Using Data Volumes
bash
# Use data volume in container
docker run -d --name web-server -v my-volume:/usr/share/nginx/html nginx
# Use anonymous data volume
docker run -d --name app -v /app/data nginx
# Multiple containers sharing data volume
docker run -d --name app1 -v shared-data:/data nginx
docker run -d --name app2 -v shared-data:/data nginx
# Read-only data volume
docker run -d --name app -v my-volume:/data:ro nginx
# Use data volume container pattern
docker create --name data-container -v /data busybox
docker run -d --volumes-from data-container --name app1 nginx
docker run -d --volumes-from data-container --name app2 nginx

Data Volume Drivers
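Whichever driver is in use, `docker volume inspect` shows where a volume's data actually lives; for the default local driver this is a directory under /var/lib/docker/volumes (a sketch):

```shell
# The Mountpoint field is the host path backing the volume.
docker volume create demo-vol
docker volume inspect demo-vol --format '{{ .Mountpoint }}'
# Typically: /var/lib/docker/volumes/demo-vol/_data
docker volume rm demo-vol
```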
bash
# Use local driver (default)
docker volume create --driver local my-local-volume
# Use NFS driver
docker volume create --driver local \
--opt type=nfs \
--opt o=addr=192.168.1.100,rw \
--opt device=:/path/to/nfs/share \
nfs-volume
# Use CIFS/SMB driver
docker volume create --driver local \
--opt type=cifs \
--opt o=username=user,password=pass,uid=1000,gid=1000 \
--opt device=//192.168.1.100/share \
cifs-volume
# View available drivers
docker info | grep "Volume:"

Data Volume Configuration Options
bash
# Create labeled data volume
docker volume create --label environment=production --label team=backend my-volume
# Create data volume with driver options
docker volume create \
--driver local \
--opt type=none \
--opt o=bind \
--opt device=/host/path \
my-bind-volume
# Set a size limit (this uses a tmpfs-backed volume, so the data is not persistent)
docker volume create \
--driver local \
--opt type=tmpfs \
--opt device=tmpfs \
--opt o=size=100m \
tmp-volume

Bind Mounts
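A bind mount is a two-way window onto the host filesystem: changes on either side are immediately visible on the other (a sketch using a throwaway host directory):

```shell
# Create a host directory with a file in it.
mkdir -p /tmp/bind-demo
echo "from host" > /tmp/bind-demo/note.txt

# The container sees the host file...
docker run --rm -v /tmp/bind-demo:/mnt alpine cat /mnt/note.txt

# ...and files written by the container appear on the host.
docker run --rm -v /tmp/bind-demo:/mnt alpine \
sh -c 'echo "from container" > /mnt/reply.txt'
cat /tmp/bind-demo/reply.txt
```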
Basic Bind Mounts
bash
# Bind mount host directory
docker run -d --name web -v /host/path:/container/path nginx
# Use absolute path
docker run -d --name web -v $(pwd)/html:/usr/share/nginx/html nginx
# Read-only bind mount
docker run -d --name web -v /host/path:/container/path:ro nginx
# Bind mount single file
docker run -d --name app -v /host/config.json:/app/config.json nginx
# Use --mount syntax (recommended)
docker run -d --name web \
--mount type=bind,source=/host/path,target=/container/path \
nginx

Bind Mount Options
bash
# Read-only mount
docker run -d --name app \
--mount type=bind,source=/host/path,target=/container/path,readonly \
nginx
# Bind propagation settings
docker run -d --name app \
--mount type=bind,source=/host/path,target=/container/path,bind-propagation=shared \
nginx
# Consistency settings (macOS)
docker run -d --name app \
--mount type=bind,source=/host/path,target=/container/path,consistency=cached \
nginx

Development Environment Examples
bash
# Node.js development environment
docker run -it --rm \
--name node-dev \
-v $(pwd):/workspace \
-v node_modules:/workspace/node_modules \
-w /workspace \
-p 3000:3000 \
node:16 \
bash
# Python development environment
docker run -it --rm \
--name python-dev \
-v $(pwd):/app \
-w /app \
-p 8000:8000 \
python:3.9 \
bash
# Database development environment
docker run -d \
--name postgres-dev \
-v $(pwd)/data:/var/lib/postgresql/data \
-v $(pwd)/init.sql:/docker-entrypoint-initdb.d/init.sql \
-e POSTGRES_PASSWORD=password \
-p 5432:5432 \
postgres:13

tmpfs Mounts
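Since tmpfs lives in memory, its contents do not survive a container restart, which is easy to verify directly (a sketch):

```shell
# Write a file into a tmpfs mount, restart the container, and look again.
docker run -d --name tmpfs-demo --tmpfs /scratch alpine sleep 3600
docker exec tmpfs-demo sh -c 'echo temp > /scratch/file.txt'
docker exec tmpfs-demo ls /scratch   # file.txt is present
docker restart tmpfs-demo
docker exec tmpfs-demo ls /scratch   # empty: the tmpfs was discarded on stop
docker rm -f tmpfs-demo
```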
Basic tmpfs Usage
bash
# Create tmpfs mount
docker run -d --name app --tmpfs /tmp nginx
# Specify tmpfs options
docker run -d --name app \
--tmpfs /tmp:rw,size=100m,mode=1777 \
nginx
# Use --mount syntax
docker run -d --name app \
--mount type=tmpfs,destination=/tmp,tmpfs-size=100m \
nginx
# Multiple tmpfs mounts
docker run -d --name app \
--tmpfs /tmp \
--tmpfs /var/run \
nginx

tmpfs Use Cases
bash
# Temporary file processing
docker run -d --name processor \
--tmpfs /tmp:size=1g \
--tmpfs /var/tmp:size=500m \
my-data-processor
# Cache directories
docker run -d --name web-app \
--tmpfs /app/cache:size=200m \
--tmpfs /app/sessions:size=100m \
my-web-app
# Sensitive data processing
docker run -d --name secure-app \
--tmpfs /secure:noexec,nosuid,size=50m \
my-secure-app

Data Management in Docker Compose
Data Volume Configuration
yaml
version: '3.8'

services:
  web:
    image: nginx
    volumes:
      # Named data volume
      - web-content:/usr/share/nginx/html
      # Bind mount
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      # Anonymous data volume
      - /var/log/nginx

  db:
    image: postgres:13
    volumes:
      # Named data volume
      - postgres-data:/var/lib/postgresql/data
      # Initialization script
      - ./init.sql:/docker-entrypoint-initdb.d/init.sql:ro
    environment:
      POSTGRES_PASSWORD: password

  app:
    image: myapp
    volumes:
      # Development code mount
      - .:/app
      # Prevent node_modules from being overwritten
      - /app/node_modules
    tmpfs:
      # Temporary files
      - /tmp
      - /app/cache

# Define named data volumes
volumes:
  web-content:
    driver: local
  postgres-data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /host/postgres/data

External Data Volumes
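An external volume must already exist before Compose starts the service, otherwise `docker compose up` fails with an error; creating it is a one-time manual step (a sketch, assuming the name matches the external volume declared in your Compose file):

```shell
# Create the volume once, outside of Compose.
docker volume create existing-volume

# Compose now attaches the existing volume instead of
# creating its own project-prefixed one.
docker compose up -d
```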
yaml
version: '3.8'

services:
  app:
    image: myapp
    volumes:
      - existing-volume:/data

volumes:
  existing-volume:
    external: true
    # Or specify the external data volume name:
    # external:
    #   name: my-existing-volume

Data Volume Configuration Options
yaml
version: '3.8'

services:
  app:
    image: myapp
    volumes:
      # Long format configuration
      - type: volume
        source: app-data
        target: /data
        read_only: false
        volume:
          nocopy: true
      # Bind mount long format
      - type: bind
        source: ./config
        target: /app/config
        read_only: true
        bind:
          propagation: shared
      # tmpfs long format
      - type: tmpfs
        target: /tmp
        tmpfs:
          size: 100M
          mode: 1777

volumes:
  app-data:
    driver: local
    driver_opts:
      type: nfs
      o: addr=nfs-server,rw
      device: ":/path/to/nfs/share"
    labels:
      - "environment=production"
      - "backup=daily"

Data Backup and Recovery
Data Volume Backup
bash
# Backup data volume to tar file
docker run --rm \
-v my-volume:/data \
-v $(pwd):/backup \
ubuntu \
tar czf /backup/backup-$(date +%Y%m%d-%H%M%S).tar.gz -C /data .
# Use dedicated backup container
docker run --rm \
-v my-volume:/source:ro \
-v $(pwd):/backup \
--name backup-container \
alpine \
sh -c "cd /source && tar czf /backup/volume-backup.tar.gz ."
# Backup to remote storage
docker run --rm \
-v my-volume:/data:ro \
-e AWS_ACCESS_KEY_ID=your-key \
-e AWS_SECRET_ACCESS_KEY=your-secret \
amazon/aws-cli \
s3 sync /data s3://your-bucket/backup/

Data Volume Recovery
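Before restoring anything, it is worth checking that the archive is intact; `tar tzf` lists an archive's contents without extracting them (a sketch; the filename is a placeholder):

```shell
# List the archive contents; a corrupt file fails here
# instead of halfway through a restore.
tar tzf backup.tar.gz > /dev/null && echo "archive OK"
```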
bash
# Restore from tar file
docker run --rm \
-v my-volume:/data \
-v $(pwd):/backup \
ubuntu \
tar xzf /backup/backup.tar.gz -C /data
# Copy from another data volume
docker run --rm \
-v source-volume:/source:ro \
-v target-volume:/target \
ubuntu \
cp -a /source/. /target/
# Restore from remote storage
docker run --rm \
-v my-volume:/data \
-e AWS_ACCESS_KEY_ID=your-key \
-e AWS_SECRET_ACCESS_KEY=your-secret \
amazon/aws-cli \
s3 sync s3://your-bucket/backup/ /data

Automated Backup Scripts
bash
#!/bin/bash
# backup-volumes.sh
BACKUP_DIR="/backup"
DATE=$(date +%Y%m%d-%H%M%S)
# Get all data volumes
VOLUMES=$(docker volume ls -q)
for volume in $VOLUMES; do
echo "Backing up data volume: $volume"
# Create backup directory
mkdir -p "$BACKUP_DIR/$volume"
# Backup data volume
docker run --rm \
-v "$volume":/source:ro \
-v "$BACKUP_DIR/$volume":/backup \
alpine \
tar czf "/backup/$volume-$DATE.tar.gz" -C /source .
# Keep backups from last 7 days
find "$BACKUP_DIR/$volume" -name "*.tar.gz" -mtime +7 -delete
echo "Backup complete: $volume-$DATE.tar.gz"
done
echo "All data volume backups complete"

Data Synchronization and Migration
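For one-off copies, `docker cp` transfers files between a container and the host without mounting anything; the container does not even need to be running (a sketch; the container and path names are placeholders):

```shell
# Copy a directory out of a running or stopped container...
docker cp my-container:/var/lib/app/data ./data-copy

# ...and push a file back in.
docker cp ./config.json my-container:/app/config.json
```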
Container-to-Container Data Synchronization
bash
# Use rsync to synchronize data
docker run --rm \
-v source-volume:/source:ro \
-v target-volume:/target \
instrumentisto/rsync \
rsync -av --delete /source/ /target/
# Real-time synchronization (using inotify)
docker run -d \
-v source-volume:/source:ro \
-v target-volume:/target \
--name sync-container \
alpine \
sh -c "
apk add --no-cache inotify-tools rsync
while inotifywait -r -e modify,create,delete /source; do
rsync -av --delete /source/ /target/
done
"

Cross-Host Data Migration
bash
# Export data volume
docker run --rm \
-v my-volume:/data:ro \
alpine \
tar czf - -C /data . > volume-export.tar.gz
# Transfer to target host
scp volume-export.tar.gz user@target-host:/tmp/
# Import on target host
docker volume create my-volume
docker run --rm \
-v my-volume:/data \
-i alpine \
tar xzf - -C /data < /tmp/volume-export.tar.gz

Performance Optimization
Storage Driver Optimization
json
// /etc/docker/daemon.json
{
"storage-driver": "overlay2",
"storage-opts": [
"overlay2.override_kernel_check=true",
"overlay2.size=20G"
]
}

Data Volume Performance Tuning
bash
# Use local SSD storage
docker volume create \
--driver local \
--opt type=none \
--opt o=bind \
--opt device=/ssd/path \
fast-volume
# Memory file system
docker volume create \
--driver local \
--opt type=tmpfs \
--opt device=tmpfs \
--opt o=size=1G \
memory-volume
# Network storage optimization
docker volume create \
--driver local \
--opt type=nfs \
--opt o=addr=nfs-server,rw,tcp,hard,intr,timeo=600 \
--opt device=:/fast/nfs/share \
nfs-volume

Monitor Storage Usage
bash
# View Docker storage usage
docker system df
# Detailed view
docker system df -v
# View data volume usage
docker volume ls --format "table {{.Name}}\t{{.Driver}}\t{{.Scope}}"
# Monitoring script
#!/bin/bash
while true; do
echo "=== Docker Storage Usage ==="
docker system df
echo ""
echo "=== Data Volume List ==="
docker volume ls
echo ""
sleep 60
done

Security Considerations
Data Volume Security
bash
# Set data volume permissions
docker run --rm \
-v my-volume:/data \
alpine \
chown -R 1000:1000 /data
# Encrypted data volume: Docker itself does not encrypt local volumes;
# provide encryption on the host, e.g. a LUKS device opened with
# cryptsetup, then expose the mounted filesystem as a bind volume
cryptsetup open /dev/sdb1 secure-disk
mount /dev/mapper/secure-disk /mnt/secure
docker volume create \
--driver local \
--opt type=none \
--opt o=bind \
--opt device=/mnt/secure \
encrypted-volume
# Read-only mount sensitive data
docker run -d \
-v /host/secrets:/secrets:ro \
--security-opt no-new-privileges \
myapp

Access Control
bash
# Limit container user permissions
docker run -d \
--user 1000:1000 \
-v app-data:/data \
myapp
# Use SELinux labels
docker run -d \
--security-opt label=type:container_file_t \
-v /host/data:/data:Z \
myapp
# AppArmor configuration
docker run -d \
--security-opt apparmor=docker-default \
-v app-data:/data \
myapp

Troubleshooting
Common Issue Diagnosis
bash
# Check data volume mount
docker inspect container_name | grep -A 10 "Mounts"
# View data volume content
docker run --rm -v my-volume:/data alpine ls -la /data
# Check permission issues
docker run --rm -v my-volume:/data alpine \
sh -c "ls -la /data && id"
# Test read/write permissions
docker run --rm -v my-volume:/data alpine \
sh -c "echo 'test' > /data/test.txt && cat /data/test.txt"
# View storage driver information
docker info | grep -A 20 "Storage Driver"

Performance Issue Troubleshooting
bash
# Monitor I/O performance
# (BusyBox dd in alpine may not support oflag=direct; conv=fsync forces a flush)
docker run --rm -v my-volume:/data alpine \
sh -c "dd if=/dev/zero of=/data/test bs=1M count=100 conv=fsync && rm -f /data/test"
# Check disk space
docker run --rm -v my-volume:/data alpine df -h /data
# View inode usage
docker run --rm -v my-volume:/data alpine df -i /data

Chapter Summary
This chapter comprehensively introduced various aspects of Docker data management:
Key Points:
- Data volumes: Docker-managed persistent storage, recommended for production
- Bind mounts: Direct host path mounting, suitable for development
- tmpfs mounts: Memory storage, suitable for temporary data
- Backup and recovery: Regularly backup important data, establish recovery strategies
- Performance optimization: Choose appropriate storage drivers and configurations
- Security considerations: Permission control and access restrictions
Best Practices:
- Prioritize data volumes in production environments
- Regularly backup important data
- Monitor storage usage
- Set reasonable permissions and security policies
- Choose appropriate storage drivers
In the next chapter, we will learn about Docker networking configuration, including network modes, custom networks, and service discovery.