Data backup strategy

Ransomware hit on Friday night. By Monday morning, the file server was encrypted, and so was the external hard drive plugged into it. The Dropbox "backup" had helpfully synced the encrypted versions. The company paid the ransom. The decryption tool was buggy. Half the files came back corrupted anyway.

This scenario keeps repeating across small companies. Backups existed on paper but failed in practice.

This chapter makes sure that doesn't happen to you. You'll set up backups that actually work, test them before you need them, and create a policy that keeps everything protected.

Why backups fail when you need them

Most companies have some form of backup. Most of those backups would fail in a real disaster. Here's why:

Never tested — The backup runs every night. Nobody has ever tried to restore from it. When you actually need it, you discover the backups have been corrupted for six months.

Same failure domain — Backup drive attached to the server. Both get encrypted by ransomware. Backup in the same cloud region. Region goes down, backup is inaccessible too.

Incomplete scope — Server files are backed up. But nobody thought about SaaS data, databases, or configuration files. After recovery, half the systems don't work.

No documentation — The person who set up backups left the company two years ago. Nobody knows how to restore, what the passwords are, or where the backups actually live.

Too slow to recover — You have backups, but restoring 5TB takes 3 days. The business can't survive 3 days of downtime.

A backup that can't be restored isn't a backup. It's a false sense of security.

The 3-2-1 backup rule

The 3-2-1 rule has been the standard for decades because it works:

  • 3 copies of your data (the original + 2 backups)
  • 2 different types of storage media
  • 1 copy offsite (geographically separate)

Why each number matters

3 copies — Redundancy. If one backup fails, you have another. Drives fail, cloud services have outages, files get corrupted. Two backups means you can survive any single failure.

2 different media types — Protection against common-cause failures. If all your data is on the same type of drive from the same manufacturer, a firmware bug could kill them all simultaneously. Mix local drives with cloud storage, or SSDs with tape, or NAS with object storage.

1 offsite — Protection against physical disasters. Fire, flood, theft, or ransomware that spreads through your network. If your office burns down, your data survives because a copy exists elsewhere.

The modern 3-2-1-1-0 rule

Some organizations extend this to 3-2-1-1-0:

  • 1 copy that's air-gapped or immutable (can't be modified or deleted, even by admins)
  • 0 errors — backups are verified and tested

The extra "1" protects against ransomware that specifically targets backups. If attackers get admin access, they often delete backups before encrypting production data. An immutable or air-gapped copy survives even that.

Backup types: full, incremental, differential

Not every backup needs to copy everything every time.

Full backup — Complete copy of all data. Takes longest, uses most storage, but simplest to restore. Run weekly or monthly.

Incremental backup — Only data changed since the last backup (any type). Fast and small, but restoration requires the last full backup plus all incremental backups in sequence. If one incremental backup is corrupted, everything after it is lost.

Differential backup — Only data changed since the last full backup. Larger than incremental, but restoration only requires the last full plus the latest differential.
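The mechanics are easy to see with GNU tar's built-in incremental mode — a minimal sketch (paths under /tmp/demo are illustrative): a level-0 (full) archive records state in a snapshot file, each later run captures only the delta, and a restore replays the full archive plus each increment in order.

```shell
#!/bin/sh
# Sketch: full + incremental backups with GNU tar's --listed-incremental.
# Paths under /tmp/demo are illustrative.
set -e
rm -rf /tmp/demo
mkdir -p /tmp/demo/data /tmp/demo/backups /tmp/demo/restore
echo "v1" > /tmp/demo/data/file.txt

# Level 0 (full): the snapshot file records what was backed up
tar --listed-incremental=/tmp/demo/backups/snap.snar \
    -czf /tmp/demo/backups/full.tar.gz -C /tmp/demo data

# Change data, then capture only the delta
echo "v2" > /tmp/demo/data/file.txt
tar --listed-incremental=/tmp/demo/backups/snap.snar \
    -czf /tmp/demo/backups/incr1.tar.gz -C /tmp/demo data

# Restore = replay the full archive, then each increment in order
tar --listed-incremental=/dev/null -xzf /tmp/demo/backups/full.tar.gz -C /tmp/demo/restore
tar --listed-incremental=/dev/null -xzf /tmp/demo/backups/incr1.tar.gz -C /tmp/demo/restore
cat /tmp/demo/restore/data/file.txt   # v2
```

This is also why a broken increment breaks the chain: skip incr1.tar.gz during the restore and you silently get stale data back.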

Which to use when

| Strategy | Backup time | Storage use | Restore complexity | Best for |
|---|---|---|---|---|
| Full only | Slow | High | Simple | Small datasets, weekly archives |
| Full + Incremental | Fast daily | Low | Complex (chain) | Large datasets, daily backups |
| Full + Differential | Medium | Medium | Simple | Balance of speed and reliability |

Practical recommendation for small companies:

  • Full backup weekly (Sunday night)
  • Incremental or differential daily
  • Keep at least 4 weeks of full backups

Most modern tools (Restic, Borg, cloud backup services) handle this automatically with deduplication — you get the storage efficiency of incremental backups with the simplicity of full backups.

What to back up

Before configuring anything, inventory what actually needs backup.

Critical data categories

Business data

  • Customer databases
  • Financial records
  • Contracts and legal documents
  • Employee records
  • Intellectual property (source code, designs, documentation)

System configurations

  • Server configurations
  • Network device configs
  • Application settings
  • Infrastructure as Code files
  • SSL certificates and keys

SaaS data

  • Google Workspace (emails, docs, drive)
  • Microsoft 365
  • Slack/Teams history
  • CRM data (Salesforce, HubSpot)
  • Project management (Asana, Jira, Notion)

Databases

  • Production databases
  • Application databases
  • Analytics data

Secrets and credentials

  • Password manager exports (Passwork supports encrypted exports — schedule them monthly)
  • API keys (stored securely in password manager like Passwork)
  • Encryption keys
  • SSH keys
  • Backup encryption keys themselves (store separately — in Passwork or offline)

What doesn't need backing up

  • Temporary files and caches
  • Easily re-downloadable software
  • Data that's already stored elsewhere (mirrors, replicas)
  • Log files older than retention requirements

Focus backup resources on data that's irreplaceable or expensive to recreate.
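The same scoping applies at the tool level — a minimal tar sketch (paths illustrative); Restic and Borg provide equivalent --exclude and --exclude-file options.

```shell
#!/bin/sh
# Sketch: keep throwaway data out of a tar backup (paths illustrative)
set -e
rm -rf /tmp/scope /tmp/scope.tar.gz
mkdir -p /tmp/scope/src /tmp/scope/cache
echo "code" > /tmp/scope/src/app.py
echo "junk" > /tmp/scope/cache/tmp.dat

# Exclude any directory named "cache"; tar doesn't descend into excluded dirs
tar -czf /tmp/scope.tar.gz --exclude='cache' -C /tmp scope

tar -tzf /tmp/scope.tar.gz
```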

Data classification for backup priority

| Category | Examples | Backup frequency | Retention | Recovery priority |
|---|---|---|---|---|
| Critical | Customer data, financial, production DB | Continuous/hourly | 1+ year | Immediate |
| Important | Source code, configs, documents | Daily | 90 days | Within hours |
| Standard | Internal docs, project files | Daily/weekly | 30 days | Within 24h |
| Low | Archives, old projects | Weekly/monthly | 30 days | Best effort |

Backup methods and tools

Different data needs different backup approaches.

Local/on-premise servers

For Linux servers:

Restic — Fast, encrypted, deduplicated backups. Supports local, S3, SFTP, and many other backends.

# Install
apt install restic

# Initialize repository (to S3)
restic init -r s3:s3.amazonaws.com/your-backup-bucket

# Backup a directory
restic backup /var/www /etc /home

# List snapshots
restic snapshots

# Restore
restic restore latest --target /restore/path

BorgBackup — Similar to Restic, excellent deduplication, slightly more complex.

# Initialize
borg init --encryption=repokey /path/to/backup/repo

# Backup
borg create /path/to/repo::backup-{now} /home /etc

# Restore
borg extract /path/to/repo::backup-name

For databases:

PostgreSQL:

# Automated daily backup
pg_dump -h localhost -U postgres mydatabase | gzip > /backup/mydb_$(date +%Y%m%d).sql.gz

# With pg_basebackup for point-in-time recovery
pg_basebackup -h localhost -D /backup/base -Ft -z -P

MySQL/MariaDB:

# Dump all databases
mysqldump --all-databases --single-transaction | gzip > /backup/all_$(date +%Y%m%d).sql.gz

# Or use Percona XtraBackup for hot backups
xtrabackup --backup --target-dir=/backup/xtrabackup/

MongoDB:

mongodump --out /backup/mongo/$(date +%Y%m%d)

Cloud infrastructure

AWS:

  • S3 versioning — Enable versioning on critical buckets. Protects against accidental deletion and ransomware.
  • AWS Backup — Managed backup for EC2, RDS, EFS, DynamoDB. Centralized policies and retention.
  • RDS automated backups — Enable with appropriate retention (default 7 days, can extend to 35).
  • EBS snapshots — Schedule regular snapshots for EC2 volumes.

# Enable S3 versioning
aws s3api put-bucket-versioning --bucket my-bucket --versioning-configuration Status=Enabled

# Create EBS snapshot
aws ec2 create-snapshot --volume-id vol-1234567890 --description "Daily backup"

GCP:

  • Cloud Storage versioning — Similar to S3
  • Persistent Disk snapshots — Schedule via snapshot schedules
  • Cloud SQL automated backups — Enabled by default, configure retention

Azure:

  • Azure Backup — Managed service for VMs, SQL, file shares
  • Blob Storage versioning
  • Azure Site Recovery — For disaster recovery

SaaS data backup

Your cloud apps have your data, but they don't guarantee backups the way you need.

Google Workspace:

Google's built-in retention is limited. Use a third-party backup service such as Backupify, or use Google Vault for retention (included in some plans, but not a true backup).

Microsoft 365:

Similar situation — Microsoft's retention isn't backup. Use a dedicated third-party backup service here as well.

Slack:

Slack's data export has limitations (no DMs on free/pro plans). Consider:

  • Enterprise Grid includes eDiscovery and export
  • Third-party: Rewind, Backupify

General SaaS:

Many smaller SaaS apps don't have backup integrations. For these:

  • Export data manually on a schedule
  • Use Zapier/Make to automate exports where possible
  • Document what data lives where and how to export it

Container and Kubernetes backups

Velero — The standard for Kubernetes backup:

# Install Velero
velero install --provider aws --bucket my-backup-bucket --secret-file ./credentials

# Backup namespace
velero backup create my-backup --include-namespaces my-namespace

# Schedule regular backups
velero schedule create daily-backup --schedule="0 2 * * *" --include-namespaces production

# Restore
velero restore create --from-backup my-backup

Docker volumes:

# Backup a volume
docker run --rm -v my_volume:/source -v /backup:/backup alpine tar czf /backup/my_volume.tar.gz -C /source .

# Restore
docker run --rm -v my_volume:/target -v /backup:/backup alpine tar xzf /backup/my_volume.tar.gz -C /target

Immutable and air-gapped backups

Ransomware operators specifically target backups. If they can delete your backups, you have to pay. Immutable and air-gapped backups protect against this.

Immutable storage

Immutable storage prevents modification or deletion for a set period, even by administrators.

AWS S3 Object Lock:

# Enable Object Lock when creating bucket
aws s3api create-bucket --bucket my-backup-bucket --object-lock-enabled-for-bucket

# Set default retention
aws s3api put-object-lock-configuration --bucket my-backup-bucket --object-lock-configuration '{
    "ObjectLockEnabled": "Enabled",
    "Rule": {
        "DefaultRetention": {
            "Mode": "GOVERNANCE",
            "Days": 30
        }
    }
}'

Governance mode allows override with special permissions. Compliance mode cannot be overridden by anyone, including root.

Azure Immutable Blob Storage: Time-based retention or legal hold policies.

Backblaze B2: Object Lock support, very cost-effective.

Air-gapped backups

Air-gapped means physically or logically disconnected from your network.

Physical air gap:

  • External drives that are only connected during backup
  • Tape backups stored offsite
  • Offline storage rotated weekly/monthly

Logical air gap:

  • Separate cloud account with no network path from production
  • Backups pulled (not pushed) by a system with different credentials
  • Write-only access — backup service can write but not delete
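A sketch of the write-only pattern on AWS, as an IAM policy for the backup principal (bucket name illustrative): uploads are allowed, while delete and lifecycle actions are explicitly denied even if another policy grants them.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowWriteOnlyBackupUploads",
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-backup-bucket/*"
    },
    {
      "Sid": "DenyDeletesAndLifecycleChanges",
      "Effect": "Deny",
      "Action": [
        "s3:DeleteObject",
        "s3:DeleteObjectVersion",
        "s3:PutLifecycleConfiguration",
        "s3:DeleteBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-backup-bucket",
        "arn:aws:s3:::my-backup-bucket/*"
      ]
    }
  ]
}
```

Pair this with versioning or Object Lock on the bucket, so an overwritten object's previous version survives.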
Production environment
│
├── [Primary backup] ──────► AWS S3 (same account)
│                            Daily, 30-day retention
│
├── [Secondary backup] ────► AWS S3 (different account)
│                            Object Lock enabled
│                            Cross-account, no delete permissions
│
└── [Tertiary backup] ─────► Offline/air-gapped
                             Weekly to external drive
                             Stored offsite

Backup automation

Manual backups don't happen. Automate everything.

Scheduling with cron

# /etc/cron.d/backup

# Daily database backup at 2 AM
0 2 * * * root /usr/local/bin/backup-database.sh >> /var/log/backup.log 2>&1

# Weekly full system backup at 3 AM Sunday
0 3 * * 0 root /usr/local/bin/backup-full.sh >> /var/log/backup.log 2>&1

# Monthly offsite sync at 4 AM on the 1st
0 4 1 * * root /usr/local/bin/backup-offsite.sh >> /var/log/backup.log 2>&1

Sample backup script

#!/bin/bash
# /usr/local/bin/backup-database.sh

set -e

BACKUP_DIR="/backup/database"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=30

# Create backup
pg_dump -h localhost -U postgres mydb | gzip > "$BACKUP_DIR/mydb_$DATE.sql.gz"

# Verify backup is not empty
if [ ! -s "$BACKUP_DIR/mydb_$DATE.sql.gz" ]; then
    echo "ERROR: Backup file is empty" | mail -s "Backup Failed" admin@example.com
    exit 1
fi

# Upload to S3
aws s3 cp "$BACKUP_DIR/mydb_$DATE.sql.gz" s3://my-backup-bucket/database/

# Clean old local backups
find "$BACKUP_DIR" -name "*.sql.gz" -mtime +$RETENTION_DAYS -delete

# Log success
echo "$(date): Backup completed successfully"

Monitoring backups

Backups fail silently. Monitor them:

Check for completion:

# Alert if no backup file was created in the last 24 hours
# ("alert" is a placeholder for your notification command)
find /backup -name "*.gz" -mtime -1 | grep -q . || alert "Backup missing"

Cloud monitoring:

  • AWS CloudWatch for AWS Backup
  • Third-party: Healthchecks.io — Free for simple monitoring
  • Prometheus/Grafana for metrics

Simple monitoring with healthchecks.io:

# Add to end of backup script
curl -fsS -m 10 --retry 5 https://hc-ping.com/your-uuid-here

If the ping doesn't arrive on schedule, you get alerted.

Testing recovery

The backup isn't complete until you've restored from it.

Why testing matters

  • Backup files can be corrupted
  • Restore process might have changed
  • You might be missing critical files
  • The person who knows how to restore might not be available

Recovery testing schedule

| Test type | Frequency | What to test |
|---|---|---|
| File restore | Monthly | Restore random files from backup |
| Database restore | Quarterly | Restore to test environment, verify data |
| Full system restore | Annually | Restore entire server/service from scratch |
| Disaster recovery drill | Annually | Simulate complete outage, time full recovery |

How to test a database restore

  1. Create test environment — Don't restore to production!

  2. Restore the backup:

# PostgreSQL
gunzip -c backup.sql.gz | psql -h test-server -U postgres -d testdb

# MySQL
gunzip -c backup.sql.gz | mysql -h test-server -u root -p testdb

  3. Verify data integrity:

-- Check row counts
SELECT COUNT(*) FROM important_table;

-- Check for recent data
SELECT MAX(created_at) FROM transactions;

-- Run application smoke tests

  4. Document results:
  • Time taken to restore
  • Any errors encountered
  • Data verification results
  • Improvements needed

Full disaster recovery test

Once a year, simulate a complete disaster:

  1. Pick a non-critical system or use a test environment
  2. Pretend the primary is destroyed — Don't access it at all
  3. Restore from backups only
  4. Measure:
    • Recovery Time Objective (RTO): How long until service is restored?
    • Recovery Point Objective (RPO): How much data was lost?
  5. Document everything: What worked, what was hard, what was missing

This reveals gaps you won't find any other way.

Compliance and regulatory requirements

If you handle personal data or work with enterprise customers, backups aren't just good practice — they're often legally required.

GDPR

GDPR doesn't say "you must have backups" explicitly, but Article 32 requires:

"the ability to restore the availability and access to personal data in a timely manner in the event of a physical or technical incident"

This means:

  • You need backups of any system containing personal data
  • Backups must be encrypted (Article 32 requires "encryption of personal data")
  • Backup locations matter — storing backups outside EU/EEA requires appropriate transfer mechanisms (Standard Contractual Clauses, adequacy decisions)
  • Retention periods apply to backups too — you can't keep backup data forever
  • Access controls — backup access should be limited and logged

The right to erasure problem: Article 17 gives people the right to have their data deleted. But deleting specific records from backups is technically difficult. The accepted approach:

  1. Delete from production systems immediately
  2. Document that the data exists in backups with a specific expiration date
  3. When backup expires or is restored, apply deletion again
  4. Don't restore deleted data to production
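One way to make the "apply deletion again" step workable is a deletion ledger, recorded at erasure time and replayed against any restored database — a minimal sketch with a hypothetical table:id file format:

```shell
#!/bin/sh
# Sketch: an erasure ledger replayed after restores.
# Hypothetical format — one "table:id" per line, appended whenever
# a GDPR deletion is applied to production.
set -e
rm -rf /tmp/gdpr
mkdir -p /tmp/gdpr
printf 'customers:1042\norders:88107\n' > /tmp/gdpr/erasure_ledger.txt

# Generate SQL that re-applies every recorded deletion to a restored database
while IFS=: read -r table id; do
    echo "DELETE FROM $table WHERE id = $id;"
done < /tmp/gdpr/erasure_ledger.txt > /tmp/gdpr/reapply_deletions.sql

cat /tmp/gdpr/reapply_deletions.sql
```

Run the generated SQL against the restored database before returning it to service.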

Documentation required:

  • Backup retention periods
  • Encryption methods used
  • Backup location (country/region)
  • Access log retention

ISO 27001

ISO 27001 Annex A has specific controls for backup:

A.12.3 Information backup:

  • A.12.3.1 requires backup copies of information, software, and system images to be taken and tested regularly

What "tested regularly" means:

  • Define testing frequency in your policy (monthly minimum recommended)
  • Document test results
  • Include backup verification in your ISMS scope

Additional controls that affect backups:

  • A.8.2 (Information classification) — backup retention should match data classification
  • A.11.1 (Secure areas) — physical protection of backup media and storage
  • A.12.4 (Logging and monitoring) — backup access should be logged
  • A.18.1.3 (Protection of records) — some records have legally mandated retention

SOC 2

SOC 2 Trust Services Criteria include:

Availability (A1.2):

  • Recovery objectives (RTO/RPO) must be documented and tested
  • Backup procedures must be documented
  • Recovery testing must occur regularly

Common Controls:

  • CC6.1 — Logical access to backup systems must be controlled
  • CC7.2 — System changes (including backup configuration) must be managed
  • CC7.3 — Backup integrity must be verified

For SOC 2 audits, you'll need to demonstrate:

  • Documented backup policy
  • Evidence of backup execution (logs)
  • Evidence of recovery testing (test results with dates)
  • Access controls on backup systems

PCI-DSS (if you handle payment data)

PCI-DSS 4.0 requirements:

  • Requirement 9.4.1: Backups containing cardholder data must be stored securely
  • Requirement 3.1: Storage of cardholder data should be minimized — includes backup retention
  • Requirement 12.10.1: Incident response plan must address backup procedures

Practical compliance checklist

| Requirement | GDPR | ISO 27001 | SOC 2 | Your status |
|---|---|---|---|---|
| Documented backup policy | ✓ | ✓ | ✓ | |
| Encryption at rest | ✓ | ✓ | ✓ | |
| Encryption in transit | ✓ | ✓ | ✓ | |
| Defined retention periods | ✓ | ✓ | ✓ | |
| Regular testing documented | Implied | ✓ | ✓ | |
| Access controls and logging | ✓ | ✓ | ✓ | |
| Backup location documented | ✓ | ✓ | ✓ | |
| RTO/RPO defined | Implied | ✓ | ✓ | |

The backup policy document

Create a written policy so everyone knows the plan.

Backup policy template


[Company Name] Data Backup Policy

Purpose: Ensure critical data can be recovered in case of hardware failure, ransomware, accidental deletion, or disaster.

Scope: All company data including servers, databases, SaaS applications, and employee workstations.

Backup Schedule:

| Data type | Frequency | Retention | Location |
|---|---|---|---|
| Production databases | Hourly | 30 days | AWS S3 + cross-account |
| File servers | Daily | 90 days | Backblaze B2 |
| SaaS (Google Workspace) | Daily | 1 year | Backupify |
| System configurations | On change + weekly | 90 days | Git + S3 |
| Employee laptops | Continuous | 30 days | [Backup service] |

3-2-1 Implementation:

  • Copy 1: Production environment
  • Copy 2: Primary cloud backup (AWS S3)
  • Copy 3: Secondary cloud backup (Backblaze B2, different provider)
  • Offsite: All cloud backups are geographically distributed
  • Immutable: S3 Object Lock enabled on secondary backup

Testing:

  • Monthly: Random file restore verification
  • Quarterly: Full database restore to test environment
  • Annually: Complete disaster recovery drill

Recovery Objectives:

  • RTO (Recovery Time): 4 hours for critical systems, 24 hours for standard
  • RPO (Recovery Point): 1 hour for databases, 24 hours for files

Responsibilities:

  • Backup configuration and monitoring: [Name/Role]
  • Recovery testing: [Name/Role]
  • Policy review: [Name/Role], annually
  • Private key holder (restore authority): [Name/Role]
  • Emergency access holder: [Name/Role]

Access Control:

  • Backup access register location: [Link/location]
  • Access review frequency: Quarterly
  • Authorized restore personnel: [List names]

Recovery Procedure:

  1. Assess what needs to be recovered
  2. Access backup storage credentials from Passwork
  3. Follow runbook in [location]
  4. Verify recovery completeness
  5. Document incident

Compliance:

  • Personal data included: Yes/No
  • Backup encryption: AES-256 (Restic)
  • Encryption key location: Passwork vault "Backup Keys"
  • Storage regions: [EU/US/other — document for GDPR]
  • Right to erasure handling: Apply deletion on restore, backups expire after [X] days

Last Updated: [Date]
Next Review: [Date + 1 year]


Common backup mistakes

Mistake 1: Backing up to the same system

If ransomware encrypts your server and the attached USB drive, both are gone. Cloud sync folders (Dropbox, OneDrive, Google Drive) are not backup — they sync the encrypted files too.

Fix: Backups must be on separate systems with separate credentials.

Mistake 2: No offsite copy

Fire, flood, theft, or a disgruntled employee with admin access can destroy on-premise backups.

Fix: At least one copy in a different physical location or cloud region.

Mistake 3: Never testing restores

You find out the backup doesn't work when you desperately need it.

Fix: Scheduled restore testing, documented results.

Mistake 4: Backing up the wrong things

You back up the file server but forget about the database, the SaaS apps, or the configuration files that make everything work.

Fix: Complete inventory of what needs backup before configuring anything.

Mistake 5: No monitoring

Backup runs fail silently. Nobody notices for months.

Fix: Automated monitoring and alerting for backup jobs.

Mistake 6: Credentials in one place

The backup encryption password is on the server that got encrypted.

Fix: Store backup credentials in password manager (Passwork), accessible by multiple authorized people.

Mistake 7: Losing the encryption key

Encrypted backups with a lost key are worthless. The person who set up backups left, and nobody knows the passphrase.

Fix: Store encryption keys in Passwork, create a paper backup in a secure location, and test recovery with those keys annually.

Mistake 8: Not accounting for GDPR deletion requests

Someone requests data deletion under GDPR. You delete from production, but their data lives on in 30+ backup copies.

Fix: Document your approach: data deleted from production immediately, deletion applied to any restored backups, backups expire within [X] days.

Backup access control

Who can access your backups?

Backups often contain more sensitive data than production systems — they're a complete snapshot of everything. Yet backup access is often an afterthought.

Maintain a backup access register:

| Person | Role | Access level | Granted date | Last reviewed |
|---|---|---|---|---|
| [Name] | Security Champion | Full (encrypt/decrypt/restore) | 2024-01-15 | 2024-06-01 |
| [Name] | DevOps Lead | Read/restore only | 2024-02-01 | 2024-06-01 |
| [Name] | CTO | Emergency access (sealed credentials) | 2024-01-15 | 2024-06-01 |

Rules for backup access:

  1. Minimum necessary — Most people don't need backup access. Developers rarely need it.
  2. Separate from production — Backup credentials should be different from production credentials
  3. Review quarterly — Remove access for people who left or changed roles
  4. Log access — Know who accessed backups and when
  5. Two-person rule for critical data — Consider requiring two people to restore the most sensitive backups

What counts as critical data in backups?

Some data requires extra protection:

  • Customer personal data (GDPR scope)
  • Financial records and payment data
  • Authentication credentials and secrets
  • Encryption keys
  • Source code (intellectual property)
  • Health records (HIPAA scope)
  • Employee personal information

If your backups contain any of these, consider the asymmetric encryption approach below.

Hybrid encryption: Security Champion controls restoration

A practical setup where only the Security Champion (or designated backup admin) can decrypt backups. Uses hybrid encryption: fast AES-256 for the data, RSA for key protection.

Backup creation (anyone with backup script access can run this):

  1. Generate a random AES-256 key
  2. Encrypt backup data with that AES key
  3. Encrypt the AES key with the RSA public key
  4. Store both: encrypted_backup + encrypted_key

Restoration (only the Security Champion):

  1. Decrypt the AES key using the RSA private key
  2. Decrypt the backup data with the AES key
  3. Restore

Why this pattern:

  • Backup scripts can run automatically (only need public key)
  • Only the private key holder can restore
  • If backup storage is compromised, data is still encrypted
  • AES-256 is fast for large files; RSA handles key exchange

Step-by-step with OpenSSL

1. Generate RSA key pair (Security Champion does this once):

# Generate 4096-bit RSA private key
openssl genrsa -aes256 -out backup_private.pem 4096

# Extract public key
openssl rsa -in backup_private.pem -pubout -out backup_public.pem

# Private key: store securely (Passwork, offline, safety deposit box)
# Public key: distribute to backup servers

2. Backup script (runs automatically, uses only public key):

#!/bin/bash
# backup-encrypt.sh

BACKUP_FILE=$1
PUBLIC_KEY="/etc/backup/backup_public.pem"
OUTPUT_DIR="/backup/encrypted"
DATE=$(date +%Y%m%d_%H%M%S)

# Generate random 256-bit AES key
AES_KEY=$(openssl rand -hex 32)

# Encrypt backup with AES-256-CBC
openssl enc -aes-256-cbc -salt -pbkdf2 \
    -in "$BACKUP_FILE" \
    -out "${OUTPUT_DIR}/backup_${DATE}.enc" \
    -pass pass:"$AES_KEY"

# Encrypt AES key with RSA public key
# (rsautl is deprecated in OpenSSL 3.x; openssl pkeyutl -encrypt works the same way)
echo "$AES_KEY" | openssl rsautl -encrypt -pubin \
    -inkey "$PUBLIC_KEY" \
    -out "${OUTPUT_DIR}/backup_${DATE}.key.enc"

# Clean up
unset AES_KEY
rm "$BACKUP_FILE" # Remove unencrypted backup

echo "Backup encrypted: backup_${DATE}.enc + backup_${DATE}.key.enc"

3. Restore script (Security Champion only, requires private key):

#!/bin/bash
# backup-decrypt.sh

ENCRYPTED_BACKUP=$1
ENCRYPTED_KEY="${ENCRYPTED_BACKUP%.enc}.key.enc"
PRIVATE_KEY="/secure/backup_private.pem"
OUTPUT_FILE="${ENCRYPTED_BACKUP%.enc}.restored"

# Decrypt AES key using RSA private key
AES_KEY=$(openssl rsautl -decrypt \
    -inkey "$PRIVATE_KEY" \
    -in "$ENCRYPTED_KEY")

# Decrypt backup with AES key
openssl enc -aes-256-cbc -d -salt -pbkdf2 \
    -in "$ENCRYPTED_BACKUP" \
    -out "$OUTPUT_FILE" \
    -pass pass:"$AES_KEY"

# Clean up
unset AES_KEY

echo "Backup decrypted: $OUTPUT_FILE"

4. Complete example — database backup with encryption:

#!/bin/bash
# full-db-backup.sh

DB_NAME="production"
BACKUP_DIR="/backup"
PUBLIC_KEY="/etc/backup/backup_public.pem"
DATE=$(date +%Y%m%d_%H%M%S)

# Dump database
pg_dump -h localhost -U postgres "$DB_NAME" | gzip > "${BACKUP_DIR}/db_${DATE}.sql.gz"

# Generate AES key and encrypt
AES_KEY=$(openssl rand -hex 32)

openssl enc -aes-256-cbc -salt -pbkdf2 \
    -in "${BACKUP_DIR}/db_${DATE}.sql.gz" \
    -out "${BACKUP_DIR}/db_${DATE}.sql.gz.enc" \
    -pass pass:"$AES_KEY"

echo "$AES_KEY" | openssl rsautl -encrypt -pubin \
    -inkey "$PUBLIC_KEY" \
    -out "${BACKUP_DIR}/db_${DATE}.key.enc"

# Remove unencrypted file
rm "${BACKUP_DIR}/db_${DATE}.sql.gz"

# Upload to S3 (encrypted files only)
aws s3 cp "${BACKUP_DIR}/db_${DATE}.sql.gz.enc" s3://backup-bucket/db/
aws s3 cp "${BACKUP_DIR}/db_${DATE}.key.enc" s3://backup-bucket/db/

# Cleanup and notify
unset AES_KEY
echo "$(date): Database backup encrypted and uploaded" >> /var/log/backup.log

Key management for this setup

Private key storage:

  • Primary: Passwork (encrypted, in a vault only Security Champion can access)
  • Backup: Printed on paper, stored in company safe or safety deposit box
  • Never on backup servers or in backup storage

Public key distribution:

  • Can be stored on backup servers (it's public)
  • Include in backup scripts
  • Version control is fine (it's not secret)

Key rotation:

  • Generate new key pair annually
  • Keep old private keys to decrypt old backups
  • Document which key encrypts which backup range

Emergency access:

  • CTO or second person should have sealed copy of private key
  • "Break glass" procedure documented
  • Test emergency access annually
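The "test emergency access" step can be partly automated at sealing time — a minimal OpenSSL sketch (the 2048-bit demo key and "emergency-passphrase" are placeholders; in practice, seal your real private key with a strong passphrase):

```shell
#!/bin/sh
# Sketch: seal and verify an emergency copy of the backup private key.
# The demo key and passphrase below are placeholders.
set -e
rm -rf /tmp/seal
mkdir -p /tmp/seal
openssl genrsa -out /tmp/seal/backup_private.pem 2048

# Encrypt the key file under the emergency passphrase
openssl enc -aes-256-cbc -salt -pbkdf2 \
    -in /tmp/seal/backup_private.pem \
    -out /tmp/seal/backup_private.pem.enc \
    -pass pass:emergency-passphrase

# Prove the sealed copy round-trips before filing it away
openssl enc -aes-256-cbc -d -salt -pbkdf2 \
    -in /tmp/seal/backup_private.pem.enc \
    -pass pass:emergency-passphrase \
    | cmp -s - /tmp/seal/backup_private.pem && echo "sealed copy OK"
```

The sealed .enc file plus the passphrase (stored separately) is what the second person keeps.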

Practical tips that save money and pain

Verify backup integrity without full restore

Full restores take time. But you can verify backups are readable:

# Restic: check backup integrity
restic check

# Restic: check and read all data (slower, but catches more problems)
restic check --read-data

# Borg: verify repository
borg check /path/to/repo

# Check backup file isn't empty or corrupted
gzip -t backup.sql.gz && echo "OK" || echo "CORRUPTED"

Schedule integrity checks weekly. They're faster than full restores but catch most corruption.

Encryption key management

If you lose your backup encryption key, your backups are useless. This happens more often than you'd think.

Do this:

  1. Store encryption keys in Passwork (separate from the backup itself)
  2. Create a paper copy stored in a safe or safety deposit box
  3. Make sure at least two people know where keys are stored
  4. Test that keys work by restoring to a fresh system

Don't:

  • Store the encryption key on the same server that's being backed up
  • Have only one person who knows the key
  • Use a password you might forget

Backup window optimization

Backups impact production systems. Minimize the pain:

Schedule smart:

# Run backup during low-traffic hours
0 3 * * * /usr/local/bin/backup.sh # 3 AM

# For global teams, find the quietest hour (check your analytics)

Throttle bandwidth:

# Restic: limit bandwidth to 50 MB/s
restic backup --limit-upload 50000 /data

# rsync: limit bandwidth
rsync --bwlimit=50000 /source /destination

# AWS S3: configure transfer acceleration or use multipart uploads

Use database-native tools:

  • PostgreSQL: pg_dump takes a consistent MVCC snapshot, so it doesn't block normal reads and writes
  • MySQL: --single-transaction for InnoDB tables
  • MongoDB: replica set backups from secondary

Deduplication to save storage costs

Deduplication stores each unique chunk of data once, even if it appears in multiple backups.

Tools with built-in deduplication:

  • Restic, BorgBackup — excellent dedup
  • AWS S3 — no native dedup, but Glacier saves money for archives
  • Backblaze B2 — no dedup, but cheap enough it doesn't matter

Real savings: A 100GB dataset with daily backups for 30 days:

  • Without dedup: 3TB storage
  • With dedup (typical 90% reduction): 300GB storage

Backup your password manager

Your password manager (Passwork) contains keys to everything. If you lose it, recovery becomes much harder.

For Passwork:

  • Use built-in encrypted export feature
  • Schedule monthly exports
  • Store exports in your backup system (encrypted separately)
  • Self-hosted Passwork: backup the database and file storage as part of your infrastructure backup

Recovery scenario: If your infrastructure is encrypted and you can't access Passwork, you need the Passwork backup to get credentials for your other backups. Store at least one Passwork export in a completely separate location.

Grandfather-father-son rotation

For long-term retention without endless storage:

  • Daily (son): Keep 7 daily backups
  • Weekly (father): Keep 4 weekly backups (every Sunday)
  • Monthly (grandfather): Keep 12 monthly backups (first of month)

This gives you a year of monthly restore points, a month of weekly points, and a week of daily points — with only 23 backup copies instead of 365.

Most backup tools support this with retention policies:

# Restic: keep 7 daily, 4 weekly, 12 monthly
restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 12 --prune

# BorgBackup
borg prune --keep-daily=7 --keep-weekly=4 --keep-monthly=12

Cross-region and cross-provider resilience

One cloud provider can have outages. For critical data:

Cross-region (same provider):

# AWS: replicate S3 bucket to another region
aws s3api put-bucket-replication --bucket source-bucket --replication-configuration file://replication.json

Cross-provider:

  • Primary backup: AWS S3
  • Secondary backup: Backblaze B2 or Google Cloud Storage
  • Use rclone to sync between providers

# rclone: sync S3 to Backblaze B2
rclone sync s3:my-backup-bucket b2:my-backup-bucket --transfers 8

Why both: If AWS has a major outage, you can still access backups. If a provider has a security breach, your data is elsewhere.

Quick size sanity check

Add this to backup scripts to catch empty or suspiciously small backups:

#!/bin/bash
set -euo pipefail

BACKUP_FILE="${1:?Usage: $0 <backup-file>}"
MIN_SIZE_KB=1000  # Minimum expected size in KB

if [ ! -e "$BACKUP_FILE" ]; then
    echo "ERROR: Backup file not found: $BACKUP_FILE" | mail -s "Backup Size Alert" [email protected]
    exit 1
fi

# -s summarizes, so this works for both files and directories
SIZE=$(du -sk "$BACKUP_FILE" | cut -f1)

if [ "$SIZE" -lt "$MIN_SIZE_KB" ]; then
    echo "WARNING: Backup file suspiciously small ($SIZE KB)" | mail -s "Backup Size Alert" [email protected]
    exit 1
fi

Recovery procedures

Document these before you need them.

File recovery procedure

  1. Identify what needs to be recovered (file name, path, approximate date)
  2. Access backup storage (credentials in Passwork: "Backup - S3" entry)
  3. List available snapshots: restic snapshots
  4. Find the file: restic find filename
  5. Restore: restic restore snapshot_id --target /restore --include path/to/file
  6. Verify the restored file
  7. Move to production location if correct

Database recovery procedure

  1. Assess the damage (corrupted data, accidental deletion, complete loss)
  2. Determine recovery point (what time should we restore to?)
  3. Stop application access to prevent further writes
  4. Access backup credentials from Passwork
  5. Download backup file
  6. Restore to test environment first if time permits
  7. Restore to production
  8. Verify data integrity
  9. Resume application access
  10. Document what happened and what was lost
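
Steps 5 and 6 deserve automation: verify the dump before touching production. A sketch assuming the backup job wrote a .sha256 checksum file next to each gzipped dump (file names are examples):

```shell
#!/bin/bash
# Verify a downloaded database dump before restoring it.
# Assumes a "<dump>.sha256" file was recorded at backup time.

verify_dump() {
    local dump="$1"
    # A truncated or corrupted download fails gzip's built-in CRC check
    gzip -t "$dump" || { echo "FAIL: $dump is truncated or corrupted"; return 1; }
    # Compare against the checksum recorded when the backup was taken
    sha256sum -c --quiet "$dump.sha256" || { echo "FAIL: checksum mismatch"; return 1; }
    echo "OK: $dump verified"
}

# Example: verify_dump backup.sql.gz && gunzip -c backup.sql.gz | psql mydb
```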

Full server recovery procedure

  1. Provision new server (same specs as failed server)
  2. Install base OS
  3. Retrieve server configuration from backup
  4. Restore configuration files
  5. Restore application data
  6. Restore database
  7. Verify all services are running
  8. Update DNS if needed
  9. Test thoroughly before declaring recovery complete
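
For step 7, a quick reachability check catches services that restored but never started. A sketch using bash's /dev/tcp; the hosts and ports are assumptions to replace with your own:

```shell
#!/bin/bash
# Check that a service answers on its TCP port after a restore.

check_port() {
    local host="$1" port="$2"
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
        echo "OK: $host:$port is accepting connections"
    else
        echo "FAIL: $host:$port not reachable"
        return 1
    fi
}

# Example: check web, TLS, and PostgreSQL ports on the rebuilt server
# for p in 80 443 5432; do check_port 127.0.0.1 "$p"; done
```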

Tools and services

Backup software

Open source:

  • Restic — Fast, encrypted, deduplicated
  • BorgBackup — Excellent deduplication
  • Duplicati — GUI-based, good for workstations
  • Velero — Kubernetes backups

Commercial:

  • Veeam — Industry leader, expensive
  • Acronis — Good all-in-one
  • Druva — Cloud-native

Backup storage

Cloud object storage:

  • AWS S3 — Standard choice, Glacier for archives
  • Backblaze B2 — Much cheaper than S3, S3-compatible API
  • Wasabi — Cheap, no egress fees
  • Google Cloud Storage — Good if you're on GCP

Monitoring

  • Healthchecks.io — Dead man's switch for cron jobs
  • Cronitor — Similar, more features
  • Cloud-native: AWS CloudWatch, GCP Monitoring, Azure Monitor
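
These services work as a dead man's switch: the backup job pings a URL on success, and a missed ping raises the alert. A wrapper sketch; the hc-ping URL is a placeholder from your Healthchecks.io dashboard:

```shell
#!/bin/bash
# Run a backup command and ping the monitoring URL only on success.
# A missing ping (failed backup, dead server, broken cron) triggers the alert.

run_with_heartbeat() {
    local ping_url="$1"; shift
    if "$@"; then
        curl -fsS --retry 3 "$ping_url" > /dev/null
    else
        echo "backup command failed; heartbeat withheld" >&2
        return 1
    fi
}

# Cron example:
# 0 2 * * * run_with_heartbeat https://hc-ping.com/your-check-uuid restic backup /srv/data
```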

Workshop: implement your backup strategy

Block 3-4 hours for initial setup, then ongoing maintenance.

Part 1: Inventory and classification (45 minutes)

  1. List all data that needs backup
  2. Classify by priority (critical, important, standard, low)
  3. Identify current backup gaps
  4. Document where each type of data lives

Part 2: Set up primary backup (60 minutes)

  1. Choose backup tool (Restic, Borg, or cloud-native)
  2. Configure for your critical data
  3. Set up automated schedule
  4. Verify first backup completes successfully
  5. Document the configuration

Part 3: Set up offsite/immutable backup (45 minutes)

  1. Create secondary backup destination (different provider/account)
  2. Enable immutability if available
  3. Configure replication from primary
  4. Verify backup reaches secondary location

Part 4: Test recovery (30 minutes)

  1. Pick a random file from backup
  2. Restore it to a test location
  3. Verify contents match original
  4. Document the restore process
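
Step 3 can be scripted so the comparison is byte-for-byte rather than eyeballed (the paths in the usage line are examples):

```shell
#!/bin/bash
# Compare a restored file against the original, byte for byte.

verify_restore() {
    local original="$1" restored="$2"
    if cmp -s "$original" "$restored"; then
        echo "PASS: restored file matches original"
    else
        echo "FAIL: restored file differs from original"
        return 1
    fi
}

# Example: verify_restore /srv/data/report.pdf /restore/srv/data/report.pdf
```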

Part 5: Documentation (30 minutes)

  1. Write your backup policy (use template above)
  2. Document recovery procedures
  3. Store credentials in password manager
  4. Schedule recurring restore tests

Deliverables:

  • Complete data inventory
  • Automated backups running for critical data
  • Offsite/immutable backup configured
  • First recovery test completed
  • Backup policy documented
  • Recovery procedures documented
  • Monitoring configured

Talking to leadership

If someone asks why you spent time on this:

"I implemented a comprehensive backup strategy following the 3-2-1 rule — three copies of our data, two different storage types, one offsite. I also enabled immutable storage so ransomware can't delete our backups even if attackers get admin access. And I tested recovery to make sure it actually works. If our servers were encrypted tomorrow, we could recover critical systems within 4 hours."

Short version: "I set up proper backups and tested that we can actually recover from them. Ransomware can't take us down."

Self-check: did you actually do it?

Backup coverage

  • Inventoried all data that needs backup
  • Classified data by priority
  • Production databases are backed up
  • File servers/storage are backed up
  • Critical SaaS data has backup solution
  • System configurations are backed up

3-2-1 implementation

  • At least 3 copies of critical data exist
  • Backups are on different storage types/providers
  • At least one copy is offsite/cloud
  • At least one copy is immutable or air-gapped

Automation and monitoring

  • Backups run automatically on schedule
  • Monitoring alerts if backup fails
  • Backup logs are retained

Testing and documentation

  • Completed at least one test restore
  • Backup policy documented
  • Recovery procedures documented
  • Credentials stored in password manager
  • Recovery test schedule created

Security and compliance

  • Backups are encrypted (at rest and in transit)
  • Encryption keys stored separately from backups
  • Backup location documented (for GDPR)
  • Retention periods defined
  • At least two people can access backup credentials

Access control

  • Backup access register maintained (who can access what)
  • Backup credentials separate from production credentials
  • Critical data backups use asymmetric encryption (RSA + AES)
  • Private key stored securely (Passwork + offline backup)
  • Emergency access procedure documented

If you can check off at least 18 of these 28 items, you're ready to move on.

What's next

Backups are solid. One more quick win before moving on to development security.

Next chapter: website protection with Cloudflare — DDoS protection, WAF, and bot mitigation for your public-facing sites.