Data backup strategy
Ransomware hit on Friday night. By Monday morning, the file server was encrypted, and so was the external hard drive plugged into it. The Dropbox "backup" had helpfully synced the encrypted versions. The company paid the ransom. The decryption tool was buggy. Half the files came back corrupted anyway.
This scenario keeps repeating across small companies. Backups existed on paper but failed in practice.
This chapter makes sure that doesn't happen to you. You'll set up backups that actually work, test them before you need them, and create a policy that keeps everything protected.
Why backups fail when you need them
Most companies have some form of backup. Most of those backups would fail in a real disaster. Here's why:
Never tested — The backup runs every night. Nobody has ever tried to restore from it. When you actually need it, you discover the backups have been corrupted for six months.
Same failure domain — Backup drive attached to the server. Both get encrypted by ransomware. Backup in the same cloud region. Region goes down, backup is inaccessible too.
Incomplete scope — Server files are backed up. But nobody thought about SaaS data, databases, or configuration files. After recovery, half the systems don't work.
No documentation — The person who set up backups left the company two years ago. Nobody knows how to restore, what the passwords are, or where the backups actually live.
Too slow to recover — You have backups, but restoring 5TB takes 3 days. The business can't survive 3 days of downtime.
A backup that can't be restored isn't a backup. It's a false sense of security.
The 3-2-1 backup rule
The 3-2-1 rule has been the standard for decades because it works:
- 3 copies of your data (the original + 2 backups)
- 2 different types of storage media
- 1 copy offsite (geographically separate)
Why each number matters
3 copies — Redundancy. If one backup fails, you have another. Drives fail, cloud services have outages, files get corrupted. Two backups means you can survive any single failure.
2 different media types — Protection against common-cause failures. If all your data is on the same type of drive from the same manufacturer, a firmware bug could kill them all simultaneously. Mix local drives with cloud storage, or SSDs with tape, or NAS with object storage.
1 offsite — Protection against physical disasters. Fire, flood, theft, or ransomware that spreads through your network. If your office burns down, your data survives because a copy exists elsewhere.
The modern 3-2-1-1-0 rule
Some organizations extend this to 3-2-1-1-0:
- 1 copy that's air-gapped or immutable (can't be modified or deleted, even by admins)
- 0 errors — backups are verified and tested
The extra "1" protects against ransomware that specifically targets backups. If attackers get admin access, they often delete backups before encrypting production data. An immutable or air-gapped copy survives even that.
Backup types: full, incremental, differential
Not every backup needs to copy everything every time.
Full backup — Complete copy of all data. Takes longest, uses most storage, but simplest to restore. Run weekly or monthly.
Incremental backup — Only data changed since the last backup (of any type). Fast and small, but restoration requires the last full backup plus every incremental since it, applied in sequence. If one incremental in the chain is corrupted, everything after it is unrestorable.
Differential backup — Only data changed since the last full backup. Larger than incremental, but restoration only requires the last full plus the latest differential.
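To make the chain concrete, here is a minimal sketch using GNU tar's --listed-incremental mode. All paths are temporary placeholders created just for the demo; real backups would target real data and offsite storage.

```shell
#!/bin/sh
# Demo of full + incremental backups with GNU tar's --listed-incremental.
# All paths are temporary placeholders created just for the demo.
set -eu
WORK=$(mktemp -d)
mkdir -p "$WORK/data" "$WORK/backups"
echo "original" > "$WORK/data/a.txt"

# Full backup: the snapshot file records what was backed up and when
tar -C "$WORK" --listed-incremental="$WORK/backups/snapshot" \
    -czf "$WORK/backups/full.tar.gz" data

# Add a file; the next run captures only the change
echo "changed" > "$WORK/data/b.txt"
tar -C "$WORK" --listed-incremental="$WORK/backups/snapshot" \
    -czf "$WORK/backups/incr1.tar.gz" data

# Restoring means extracting the full backup, then each incremental in order
mkdir "$WORK/restore"
tar -C "$WORK/restore" --listed-incremental=/dev/null -xzf "$WORK/backups/full.tar.gz"
tar -C "$WORK/restore" --listed-incremental=/dev/null -xzf "$WORK/backups/incr1.tar.gz"
ls "$WORK/restore/data"   # both a.txt and b.txt are present
```

Lose incr1.tar.gz and b.txt is gone; corrupt it and every later incremental in the chain becomes useless, which is exactly the trade-off in the table below.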
Which to use when
| Strategy | Backup time | Storage use | Restore complexity | Best for |
|---|---|---|---|---|
| Full only | Slow | High | Simple | Small datasets, weekly archives |
| Full + Incremental | Fast daily | Low | Complex (chain) | Large datasets, daily backups |
| Full + Differential | Medium | Medium | Simple | Balance of speed and reliability |
Practical recommendation for small companies:
- Full backup weekly (Sunday night)
- Incremental or differential daily
- Keep at least 4 weeks of full backups
Most modern tools (Restic, Borg, cloud backup services) handle this automatically with deduplication — you get the storage efficiency of incremental backups with the simplicity of full backups.
What to back up
Before configuring anything, inventory what actually needs to be backed up.
Critical data categories
Business data
- Customer databases
- Financial records
- Contracts and legal documents
- Employee records
- Intellectual property (source code, designs, documentation)
System configurations
- Server configurations
- Network device configs
- Application settings
- Infrastructure as Code files
- SSL certificates and keys
SaaS data
- Google Workspace (emails, docs, drive)
- Microsoft 365
- Slack/Teams history
- CRM data (Salesforce, HubSpot)
- Project management (Asana, Jira, Notion)
Databases
- Production databases
- Application databases
- Analytics data
Secrets and credentials
- Password manager exports (Passwork supports encrypted exports — schedule them monthly)
- API keys (stored securely in password manager like Passwork)
- Encryption keys
- SSH keys
- Backup encryption keys themselves (store separately — in Passwork or offline)
What doesn't need backing up
- Temporary files and caches
- Easily re-downloadable software
- Data that's already stored elsewhere (mirrors, replicas)
- Log files older than retention requirements
Focus backup resources on data that's irreplaceable or expensive to recreate.
Data classification for backup priority
| Category | Examples | Backup frequency | Retention | Recovery priority |
|---|---|---|---|---|
| Critical | Customer data, financial, production DB | Continuous/hourly | 1+ year | Immediate |
| Important | Source code, configs, documents | Daily | 90 days | Within hours |
| Standard | Internal docs, project files | Daily/weekly | 30 days | Within 24h |
| Low | Archives, old projects | Weekly/monthly | 30 days | Best effort |
Backup methods and tools
Different data needs different backup approaches.
Local/on-premise servers
For Linux servers:
Restic — Fast, encrypted, deduplicated backups. Supports local, S3, SFTP, and many other backends.
# Install
apt install restic
# Initialize repository (to S3; expects AWS credentials and RESTIC_PASSWORD in the environment)
restic init -r s3:s3.amazonaws.com/your-backup-bucket
# Backup a directory
restic backup /var/www /etc /home
# List snapshots
restic snapshots
# Restore
restic restore latest --target /restore/path
BorgBackup — Similar to Restic, excellent deduplication, slightly more complex.
# Initialize
borg init --encryption=repokey /path/to/backup/repo
# Backup
borg create /path/to/repo::backup-{now} /home /etc
# Restore
borg extract /path/to/repo::backup-name
For databases:
PostgreSQL:
# Automated daily backup
pg_dump -h localhost -U postgres mydatabase | gzip > /backup/mydb_$(date +%Y%m%d).sql.gz
# With pg_basebackup for point-in-time recovery
pg_basebackup -h localhost -D /backup/base -Ft -z -P
MySQL/MariaDB:
# Dump all databases
mysqldump --all-databases --single-transaction | gzip > /backup/all_$(date +%Y%m%d).sql.gz
# Or use Percona XtraBackup for hot backups
xtrabackup --backup --target-dir=/backup/xtrabackup/
MongoDB:
mongodump --out /backup/mongo/$(date +%Y%m%d)
Cloud infrastructure
AWS:
- S3 versioning — Enable versioning on critical buckets. Protects against accidental deletion and ransomware.
- AWS Backup — Managed backup for EC2, RDS, EFS, DynamoDB. Centralized policies and retention.
- RDS automated backups — Enable with appropriate retention (default 7 days, can extend to 35).
- EBS snapshots — Schedule regular snapshots for EC2 volumes.
# Enable S3 versioning
aws s3api put-bucket-versioning --bucket my-bucket --versioning-configuration Status=Enabled
# Create EBS snapshot
aws ec2 create-snapshot --volume-id vol-1234567890 --description "Daily backup"
GCP:
- Cloud Storage versioning — Similar to S3
- Persistent Disk snapshots — Schedule via snapshot schedules
- Cloud SQL automated backups — Enabled by default, configure retention
Azure:
- Azure Backup — Managed service for VMs, SQL, file shares
- Blob Storage versioning
- Azure Site Recovery — For disaster recovery
SaaS data backup
Your cloud apps have your data, but they don't guarantee backups the way you need.
Google Workspace:
Google's built-in retention is limited. Use third-party backup:
- Backupify — Part of Datto, comprehensive
- Spanning — Also Kaseya, good for SMB
- AFI Backup — AI-powered, competitive pricing
Or use Google Vault for retention (included in some plans, but not a true backup).
Microsoft 365:
Similar situation — Microsoft's retention isn't backup:
- Veeam Backup for Microsoft 365 — Self-hosted or cloud
- Druva — Cloud-native
- AvePoint — Comprehensive
Slack:
Slack's standard data export covers public channels only; exporting DMs and private channels requires a Business+ or Enterprise plan and an approved export request. If chat history is business-critical, consider a dedicated archiving tool.
General SaaS:
Many smaller SaaS apps don't have backup integrations. For these:
- Export data manually on a schedule
- Use Zapier/Make to automate exports where possible
- Document what data lives where and how to export it
Container and Kubernetes backups
Velero — The standard for Kubernetes backup:
# Install Velero
velero install --provider aws --bucket my-backup-bucket --secret-file ./credentials
# Backup namespace
velero backup create my-backup --include-namespaces my-namespace
# Schedule regular backups
velero schedule create daily-backup --schedule="0 2 * * *" --include-namespaces production
# Restore
velero restore create --from-backup my-backup
Docker volumes:
# Backup a volume
docker run --rm -v my_volume:/source -v /backup:/backup alpine tar czf /backup/my_volume.tar.gz -C /source .
# Restore
docker run --rm -v my_volume:/target -v /backup:/backup alpine tar xzf /backup/my_volume.tar.gz -C /target
Immutable and air-gapped backups
Ransomware operators specifically target backups. If they can delete your backups, you have to pay. Immutable and air-gapped backups protect against this.
Immutable storage
Immutable storage prevents modification or deletion for a set period, even by administrators.
AWS S3 Object Lock:
# Enable Object Lock when creating bucket
aws s3api create-bucket --bucket my-backup-bucket --object-lock-enabled-for-bucket
# Set default retention
aws s3api put-object-lock-configuration --bucket my-backup-bucket --object-lock-configuration '{
"ObjectLockEnabled": "Enabled",
"Rule": {
"DefaultRetention": {
"Mode": "GOVERNANCE",
"Days": 30
}
}
}'
Governance mode allows override with special permissions. Compliance mode cannot be overridden by anyone, including root.
Azure Immutable Blob Storage: Time-based retention or legal hold policies.
Backblaze B2: Object Lock support, very cost-effective.
Air-gapped backups
Air-gapped means physically or logically disconnected from your network.
Physical air gap:
- External drives that are only connected during backup
- Tape backups stored offsite
- Offline storage rotated weekly/monthly
Logical air gap:
- Separate cloud account with no network path from production
- Backups pulled (not pushed) by a system with different credentials
- Write-only access — backup service can write but not delete
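As an illustration of the write-only pattern, here is a minimal IAM policy sketch (the bucket name is a placeholder) that lets a backup job upload objects but grants no delete or read permissions:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BackupWriteOnly",
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-backup-bucket/*"
    }
  ]
}
```

Attach this as the backup job's only S3 permission. With no s3:DeleteObject and versioning enabled on the bucket, stolen backup credentials can neither erase old backups nor read them; expiry is handled by lifecycle rules or Object Lock in the backup account, not by the writer.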
Recommended architecture for ransomware protection
Production Environment
│
▼
[Primary Backup] ──────► AWS S3 (same account)
│ Daily, 30-day retention
│
▼
[Secondary Backup] ────► AWS S3 (different account)
│ Object Lock enabled
│ Cross-account, no delete permissions
│
▼
[Tertiary Backup] ─────► Offline/Air-gapped
Weekly to external drive
Stored offsite
Backup automation
Manual backups don't happen. Automate everything.
Scheduling with cron
# /etc/cron.d/backup
# Daily database backup at 2 AM
0 2 * * * root /usr/local/bin/backup-database.sh >> /var/log/backup.log 2>&1
# Weekly full system backup at 3 AM Sunday
0 3 * * 0 root /usr/local/bin/backup-full.sh >> /var/log/backup.log 2>&1
# Monthly offsite sync at 4 AM on the 1st
0 4 1 * * root /usr/local/bin/backup-offsite.sh >> /var/log/backup.log 2>&1
Sample backup script
#!/bin/bash
# /usr/local/bin/backup-database.sh
set -euo pipefail  # fail on errors, unset variables, and pipeline failures (pg_dump | gzip)
BACKUP_DIR="/backup/database"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=30
# Create backup
pg_dump -h localhost -U postgres mydb | gzip > "$BACKUP_DIR/mydb_$DATE.sql.gz"
# Verify backup is not empty
if [ ! -s "$BACKUP_DIR/mydb_$DATE.sql.gz" ]; then
echo "ERROR: Backup file is empty" | mail -s "Backup Failed" [email protected]
exit 1
fi
# Upload to S3
aws s3 cp "$BACKUP_DIR/mydb_$DATE.sql.gz" s3://my-backup-bucket/database/
# Clean old local backups
find "$BACKUP_DIR" -name "*.sql.gz" -mtime +$RETENTION_DAYS -delete
# Log success
echo "$(date): Backup completed successfully"
Monitoring backups
Backups fail silently. Monitor them:
Check for completion:
# Alert if no backup file created in last 25 hours
find /backup -name "*.gz" -mtime -1 | grep -q . || echo "No backup file in 25 hours" | mail -s "Backup Missing" [email protected]
Cloud monitoring:
- AWS CloudWatch for AWS Backup
- Third-party: Healthchecks.io — Free for simple monitoring
- Prometheus/Grafana for metrics
Simple monitoring with healthchecks.io:
# Add to end of backup script
curl -fsS -m 10 --retry 5 https://hc-ping.com/your-uuid-here
If the ping doesn't arrive on schedule, you get alerted.
Testing recovery
The backup isn't complete until you've restored from it.
Why testing matters
- Backup files can be corrupted
- Restore process might have changed
- You might be missing critical files
- The person who knows how to restore might not be available
Recovery testing schedule
| Test type | Frequency | What to test |
|---|---|---|
| File restore | Monthly | Restore random files from backup |
| Database restore | Quarterly | Restore to test environment, verify data |
| Full system restore | Annually | Restore entire server/service from scratch |
| Disaster recovery drill | Annually | Simulate complete outage, time full recovery |
How to test a database restore
- Create test environment — Don't restore to production!
- Restore the backup:
# PostgreSQL
gunzip -c backup.sql.gz | psql -h test-server -U postgres -d testdb
# MySQL
gunzip -c backup.sql.gz | mysql -h test-server -u root -p testdb
- Verify data integrity:
-- Check row counts
SELECT COUNT(*) FROM important_table;
-- Check for recent data
SELECT MAX(created_at) FROM transactions;
-- Run application smoke tests
- Document results:
- Time taken to restore
- Any errors encountered
- Data verification results
- Improvements needed
Full disaster recovery test
Once a year, simulate a complete disaster:
- Pick a non-critical system or use a test environment
- Pretend the primary is destroyed — Don't access it at all
- Restore from backups only
- Measure:
- Recovery Time Objective (RTO): How long until service is restored?
- Recovery Point Objective (RPO): How much data was lost?
- Document everything: What worked, what was hard, what was missing
This reveals gaps you won't find any other way.
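The two measurements reduce to simple timestamp arithmetic. A sketch with placeholder times from a hypothetical drill log (GNU date assumed):

```shell
#!/bin/sh
# Compute RTO and RPO for a drill from three timestamps.
# The example times are placeholders from a hypothetical drill log.
set -eu
OUTAGE_START=$(date -d "2024-06-01 09:00" +%s)     # primary declared destroyed
LAST_BACKUP=$(date -d "2024-06-01 08:15" +%s)      # newest usable backup
SERVICE_RESTORED=$(date -d "2024-06-01 12:30" +%s) # service back online

RTO_MIN=$(( (SERVICE_RESTORED - OUTAGE_START) / 60 ))
RPO_MIN=$(( (OUTAGE_START - LAST_BACKUP) / 60 ))
echo "RTO: ${RTO_MIN} minutes"   # 210 minutes = 3.5 hours of downtime
echo "RPO: ${RPO_MIN} minutes"   # 45 minutes of data lost
```

Compare the measured numbers against the RTO/RPO targets in your backup policy; a drill that misses them is a finding, not a failure.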
Compliance and regulatory requirements
If you handle personal data or work with enterprise customers, backups aren't just good practice — they're often legally required.
GDPR
GDPR doesn't say "you must have backups" explicitly, but Article 32 requires:
"the ability to restore the availability and access to personal data in a timely manner in the event of a physical or technical incident"
This means:
- You need backups of any system containing personal data
- Backups must be encrypted (Article 32 requires "encryption of personal data")
- Backup locations matter — storing backups outside EU/EEA requires appropriate transfer mechanisms (Standard Contractual Clauses, adequacy decisions)
- Retention periods apply to backups too — you can't keep backup data forever
- Access controls — backup access should be limited and logged
The right to erasure problem: Article 17 gives people the right to have their data deleted. But deleting specific records from backups is technically difficult. The accepted approach:
- Delete from production systems immediately
- Document that the data exists in backups with a specific expiration date
- When backup expires or is restored, apply deletion again
- Don't restore deleted data to production
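One lightweight way to keep the "documented with an expiration date" promise is a deletion ledger that every restore is checked against. A sketch (the file location and field layout are assumptions, not a standard):

```shell
#!/bin/sh
# Sketch: a deletion ledger recording GDPR erasure requests and when the
# last backup containing the data expires. Fields are an assumption;
# adapt to your process.
set -eu
LEDGER=$(mktemp)   # in practice e.g. a file in your compliance records

# record_id, deleted_from_production, last_backup_expires
cat > "$LEDGER" <<'EOF'
user-1042,2024-03-01,2024-03-31
user-2913,2024-05-10,2024-06-09
EOF

# On any restore: re-apply every deletion whose backups may still hold data
TODAY=2024-05-20
awk -F, -v today="$TODAY" '$3 >= today { print $1 }' "$LEDGER"
# prints user-2913: its backups have not expired, so delete it again after restore
```

ISO dates compare correctly as strings, which is why the awk comparison works without date parsing.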
Documentation required:
- Backup retention periods
- Encryption methods used
- Backup location (country/region)
- Access log retention
ISO 27001
ISO 27001 Annex A has specific controls for backup:
A.12.3 Information backup:
- A.12.3.1 requires backup copies of information, software, and system images to be taken and tested regularly
What "tested regularly" means:
- Define testing frequency in your policy (monthly minimum recommended)
- Document test results
- Include backup verification in your ISMS scope
Additional controls that affect backups:
- A.8.2 (Information classification) — backup retention should match data classification
- A.11.1 (Secure areas) — physical protection of backup media and storage
- A.12.4 (Logging and monitoring) — backup access should be logged
- A.18.1.3 (Protection of records) — some records have legally mandated retention
SOC 2
SOC 2 Trust Services Criteria include:
Availability (A1.2):
- Recovery objectives (RTO/RPO) must be documented and tested
- Backup procedures must be documented
- Recovery testing must occur regularly
Common Controls:
- CC6.1 — Logical access to backup systems must be controlled
- CC7.2 — System changes (including backup configuration) must be managed
- CC7.3 — Backup integrity must be verified
For SOC 2 audits, you'll need to demonstrate:
- Documented backup policy
- Evidence of backup execution (logs)
- Evidence of recovery testing (test results with dates)
- Access controls on backup systems
PCI-DSS (if you handle payment data)
PCI-DSS 4.0 requirements:
- Requirement 9.4.1: Backups containing cardholder data must be stored securely
- Requirement 3.1: Storage of cardholder data should be minimized — includes backup retention
- Requirement 12.10.1: Incident response plan must address backup procedures
Practical compliance checklist
| Requirement | GDPR | ISO 27001 | SOC 2 | Your status |
|---|---|---|---|---|
| Documented backup policy | ✓ | ✓ | ✓ | ☐ |
| Encryption at rest | ✓ | ✓ | ✓ | ☐ |
| Encryption in transit | ✓ | ✓ | ✓ | ☐ |
| Defined retention periods | ✓ | ✓ | ✓ | ☐ |
| Regular testing documented | Implied | ✓ | ✓ | ☐ |
| Access controls and logging | ✓ | ✓ | ✓ | ☐ |
| Backup location documented | ✓ | ✓ | ✓ | ☐ |
| RTO/RPO defined | Implied | ✓ | ✓ | ☐ |
The backup policy document
Create a written policy so everyone knows the plan.
Backup policy template
[Company Name] Data Backup Policy
Purpose: Ensure critical data can be recovered in case of hardware failure, ransomware, accidental deletion, or disaster.
Scope: All company data including servers, databases, SaaS applications, and employee workstations.
Backup Schedule:
| Data Type | Frequency | Retention | Location |
|---|---|---|---|
| Production databases | Hourly | 30 days | AWS S3 + cross-account |
| File servers | Daily | 90 days | Backblaze B2 |
| SaaS (Google Workspace) | Daily | 1 year | Backupify |
| System configurations | On change + weekly | 90 days | Git + S3 |
| Employee laptops | Continuous | 30 days | [Backup service] |
3-2-1 Implementation:
- Copy 1: Production environment
- Copy 2: Primary cloud backup (AWS S3)
- Copy 3: Secondary cloud backup (Backblaze B2, different provider)
- Offsite: All cloud backups are geographically distributed
- Immutable: S3 Object Lock enabled on secondary backup
Testing:
- Monthly: Random file restore verification
- Quarterly: Full database restore to test environment
- Annually: Complete disaster recovery drill
Recovery Objectives:
- RTO (Recovery Time): 4 hours for critical systems, 24 hours for standard
- RPO (Recovery Point): 1 hour for databases, 24 hours for files
Responsibilities:
- Backup configuration and monitoring: [Name/Role]
- Recovery testing: [Name/Role]
- Policy review: [Name/Role], annually
- Private key holder (restore authority): [Name/Role]
- Emergency access holder: [Name/Role]
Access Control:
- Backup access register location: [Link/location]
- Access review frequency: Quarterly
- Authorized restore personnel: [List names]
Recovery Procedure:
- Assess what needs to be recovered
- Access backup storage credentials from Passwork
- Follow runbook in [location]
- Verify recovery completeness
- Document incident
Compliance:
- Personal data included: Yes/No
- Backup encryption: AES-256 (Restic)
- Encryption key location: Passwork vault "Backup Keys"
- Storage regions: [EU/US/other — document for GDPR]
- Right to erasure handling: Apply deletion on restore, backups expire after [X] days
Last Updated: [Date] Next Review: [Date + 1 year]
Common backup mistakes
Mistake 1: Backing up to the same system
If ransomware encrypts your server and the attached USB drive, both are gone. Cloud sync folders (Dropbox, OneDrive, Google Drive) are not backup — they sync the encrypted files too.
Fix: Backups must be on separate systems with separate credentials.
Mistake 2: No offsite copy
Fire, flood, theft, or a disgruntled employee with admin access can destroy on-premise backups.
Fix: At least one copy in a different physical location or cloud region.
Mistake 3: Never testing restores
You find out the backup doesn't work when you desperately need it.
Fix: Scheduled restore testing, documented results.
Mistake 4: Backing up the wrong things
You back up the file server but forget about the database, the SaaS apps, or the configuration files that make everything work.
Fix: Complete inventory of what needs backup before configuring anything.
Mistake 5: No monitoring
Backup runs fail silently. Nobody notices for months.
Fix: Automated monitoring and alerting for backup jobs.
Mistake 6: Credentials in one place
The backup encryption password is on the server that got encrypted.
Fix: Store backup credentials in password manager (Passwork), accessible by multiple authorized people.
Mistake 7: Losing the encryption key
Encrypted backups with a lost key are worthless. The person who set up backups left, and nobody knows the passphrase.
Fix: Store encryption keys in Passwork, create a paper backup in a secure location, and test recovery with those keys annually.
Mistake 8: Not accounting for GDPR deletion requests
Someone requests data deletion under GDPR. You delete from production, but their data lives on in 30+ backup copies.
Fix: Document your approach: data deleted from production immediately, deletion applied to any restored backups, backups expire within [X] days.
Backup access control
Who can access your backups?
Backups often contain more sensitive data than production systems — they're a complete snapshot of everything. Yet backup access is often an afterthought.
Maintain a backup access register:
| Person | Role | Access level | Granted date | Last reviewed |
|---|---|---|---|---|
| [Name] | Security Champion | Full (encrypt/decrypt/restore) | 2024-01-15 | 2024-06-01 |
| [Name] | DevOps Lead | Read/restore only | 2024-02-01 | 2024-06-01 |
| [Name] | CTO | Emergency access (sealed credentials) | 2024-01-15 | 2024-06-01 |
Rules for backup access:
- Minimum necessary — Most people don't need backup access. Developers rarely need it.
- Separate from production — Backup credentials should be different from production credentials
- Review quarterly — Remove access for people who left or changed roles
- Log access — Know who accessed backups and when
- Two-person rule for critical data — Consider requiring two people to restore the most sensitive backups
What counts as critical data in backups?
Some data requires extra protection:
- Customer personal data (GDPR scope)
- Financial records and payment data
- Authentication credentials and secrets
- Encryption keys
- Source code (intellectual property)
- Health records (HIPAA scope)
- Employee personal information
If your backups contain any of these, consider the asymmetric encryption approach below.
Hybrid encryption: Security Champion controls restoration
A practical setup where only the Security Champion (or designated backup admin) can decrypt backups. Uses hybrid encryption: fast AES-256 for the data, RSA for key protection.
Backup creation (anyone with backup script access can run this):
- Generate a random AES-256 key
- Encrypt backup data with that AES key
- Encrypt the AES key with the RSA public key
- Store both files: the encrypted backup and the encrypted AES key
Restoration (only the Security Champion):
- Decrypt the AES key using the RSA private key
- Decrypt the backup data with the AES key
- Restore
Why this pattern:
- Backup scripts can run automatically (only need public key)
- Only the private key holder can restore
- If backup storage is compromised, data is still encrypted
- AES-256 is fast for large files; RSA handles key exchange
Step-by-step with OpenSSL
1. Generate RSA key pair (Security Champion does this once):
# Generate 4096-bit RSA private key
openssl genrsa -aes256 -out backup_private.pem 4096
# Extract public key
openssl rsa -in backup_private.pem -pubout -out backup_public.pem
# Private key: store securely (Passwork, offline, safety deposit box)
# Public key: distribute to backup servers
2. Backup script (runs automatically, uses only public key):
#!/bin/bash
# backup-encrypt.sh
BACKUP_FILE=$1
PUBLIC_KEY="/etc/backup/backup_public.pem"
OUTPUT_DIR="/backup/encrypted"
DATE=$(date +%Y%m%d_%H%M%S)
# Generate random 256-bit AES key
AES_KEY=$(openssl rand -hex 32)
# Encrypt backup with AES-256-CBC
openssl enc -aes-256-cbc -salt -pbkdf2 \
-in "$BACKUP_FILE" \
-out "${OUTPUT_DIR}/backup_${DATE}.enc" \
-pass pass:"$AES_KEY"
# Encrypt AES key with RSA public key
# (rsautl is deprecated in OpenSSL 3.x; "openssl pkeyutl -encrypt" is the modern equivalent)
echo "$AES_KEY" | openssl rsautl -encrypt -pubin \
-inkey "$PUBLIC_KEY" \
-out "${OUTPUT_DIR}/backup_${DATE}.key.enc"
# Clean up
unset AES_KEY
rm "$BACKUP_FILE" # Remove unencrypted backup
echo "Backup encrypted: backup_${DATE}.enc + backup_${DATE}.key.enc"
3. Restore script (Security Champion only, requires private key):
#!/bin/bash
# backup-decrypt.sh
ENCRYPTED_BACKUP=$1
ENCRYPTED_KEY="${ENCRYPTED_BACKUP%.enc}.key.enc"
PRIVATE_KEY="/secure/backup_private.pem"
OUTPUT_FILE="${ENCRYPTED_BACKUP%.enc}.restored"
# Decrypt AES key using RSA private key
AES_KEY=$(openssl rsautl -decrypt \
-inkey "$PRIVATE_KEY" \
-in "$ENCRYPTED_KEY")
# Decrypt backup with AES key
openssl enc -aes-256-cbc -d -salt -pbkdf2 \
-in "$ENCRYPTED_BACKUP" \
-out "$OUTPUT_FILE" \
-pass pass:"$AES_KEY"
# Clean up
unset AES_KEY
echo "Backup decrypted: $OUTPUT_FILE"
4. Complete example — database backup with encryption:
#!/bin/bash
# full-db-backup.sh
DB_NAME="production"
BACKUP_DIR="/backup"
PUBLIC_KEY="/etc/backup/backup_public.pem"
DATE=$(date +%Y%m%d_%H%M%S)
# Dump database
pg_dump -h localhost -U postgres "$DB_NAME" | gzip > "${BACKUP_DIR}/db_${DATE}.sql.gz"
# Generate AES key and encrypt
AES_KEY=$(openssl rand -hex 32)
openssl enc -aes-256-cbc -salt -pbkdf2 \
-in "${BACKUP_DIR}/db_${DATE}.sql.gz" \
-out "${BACKUP_DIR}/db_${DATE}.sql.gz.enc" \
-pass pass:"$AES_KEY"
echo "$AES_KEY" | openssl rsautl -encrypt -pubin \
-inkey "$PUBLIC_KEY" \
-out "${BACKUP_DIR}/db_${DATE}.key.enc"
# Remove unencrypted file
rm "${BACKUP_DIR}/db_${DATE}.sql.gz"
# Upload to S3 (encrypted files only)
aws s3 cp "${BACKUP_DIR}/db_${DATE}.sql.gz.enc" s3://backup-bucket/db/
aws s3 cp "${BACKUP_DIR}/db_${DATE}.key.enc" s3://backup-bucket/db/
# Cleanup and notify
unset AES_KEY
echo "$(date): Database backup encrypted and uploaded" >> /var/log/backup.log
Key management for this setup
Private key storage:
- Primary: Passwork (encrypted, in a vault only Security Champion can access)
- Backup: Printed on paper, stored in company safe or safety deposit box
- Never on backup servers or in backup storage
Public key distribution:
- Can be stored on backup servers (it's public)
- Include in backup scripts
- Version control is fine (it's not secret)
Key rotation:
- Generate new key pair annually
- Keep old private keys to decrypt old backups
- Document which key encrypts which backup range
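A minimal sketch of that rotation bookkeeping (the directory, year, and passphrase are placeholders; the real passphrase belongs in your password manager, never in a script):

```shell
#!/bin/sh
# Sketch: annual key rotation with a dated key pair and a mapping file.
# Passphrase and paths are placeholders for the demo.
set -eu
KEYDIR=$(mktemp -d)    # in practice a secured key directory
YEAR=2025
PASS="example-passphrase"

# Generate the new year's key pair, named by year so old keys stay distinct
openssl genrsa -aes256 -passout pass:"$PASS" \
    -out "$KEYDIR/backup_private_$YEAR.pem" 4096
openssl rsa -in "$KEYDIR/backup_private_$YEAR.pem" -passin pass:"$PASS" \
    -pubout -out "$KEYDIR/backup_public_$YEAR.pem"

# Record which key covers which backup date range
echo "$YEAR-01-01..$YEAR-12-31 backup_private_$YEAR.pem" >> "$KEYDIR/key-map.txt"
cat "$KEYDIR/key-map.txt"
```

The key-map file is what lets you answer "which private key decrypts the March backups?" three years from now; keep it alongside the keys in the password manager.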
Emergency access:
- CTO or second person should have sealed copy of private key
- "Break glass" procedure documented
- Test emergency access annually
Practical tips that save money and pain
Verify backup integrity without full restore
Full restores take time. But you can verify backups are readable:
# Restic: check backup integrity
restic check
# Restic: check and read all data (slower, but catches more problems)
restic check --read-data
# Borg: verify repository
borg check /path/to/repo
# Check backup file isn't empty or corrupted
gzip -t backup.sql.gz && echo "OK" || echo "CORRUPTED"
Schedule integrity checks weekly. They're faster than full restores but catch most corruption.
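Independent of any backup tool, a plain checksum manifest catches silent corruption on any storage. A sketch with a placeholder backup file:

```shell
#!/bin/sh
# Sketch: checksum manifest for backup files. Catches silent corruption
# regardless of which backup tool produced the files. Paths are placeholders.
set -eu
DIR=$(mktemp -d)       # in practice your backup directory
echo "backup payload" > "$DIR/db_20240101.sql.gz"

# After each backup run: record checksums of everything in the set
( cd "$DIR" && sha256sum *.gz > MANIFEST.sha256 )

# Weekly check: recompute and compare; non-zero exit on any mismatch
( cd "$DIR" && sha256sum -c MANIFEST.sha256 )
```

Ship the manifest offsite with the backups; a manifest stored next to a corrupted file can be corrupted too.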
Encryption key management
If you lose your backup encryption key, your backups are useless. This happens more often than you'd think.
Do this:
- Store encryption keys in Passwork (separate from the backup itself)
- Create a paper copy stored in a safe or safety deposit box
- Make sure at least two people know where keys are stored
- Test that keys work by restoring to a fresh system
Don't:
- Store the encryption key on the same server that's being backed up
- Have only one person who knows the key
- Use a password you might forget
Backup window optimization
Backups impact production systems. Minimize the pain:
Schedule smart:
# Run backup during low-traffic hours
0 3 * * * /usr/local/bin/backup.sh # 3 AM
# For global teams, find the quietest hour (check your analytics)
Throttle bandwidth:
# Restic: limit bandwidth to 50 MB/s
restic backup --limit-upload 50000 /data
# rsync: limit bandwidth
rsync --bwlimit=50000 /source /destination
# AWS S3: configure transfer acceleration or use multipart uploads
Use database-native tools:
- PostgreSQL: pg_dump with --no-synchronized-snapshots for less locking
- MySQL: mysqldump with --single-transaction for InnoDB tables
- MongoDB: replica set backups from a secondary
Deduplication to save storage costs
Deduplication stores each unique chunk of data once, even if it appears in multiple backups.
Tools with built-in deduplication:
- Restic, BorgBackup — excellent dedup
- AWS S3 — no native dedup, but Glacier saves money for archives
- Backblaze B2 — no dedup, but cheap enough it doesn't matter
Real savings: A 100GB dataset with daily backups for 30 days:
- Without dedup: 3TB storage
- With dedup (typical 90% reduction): 300GB storage
Backup your password manager
Your password manager (Passwork) contains keys to everything. If you lose it, recovery becomes much harder.
For Passwork:
- Use built-in encrypted export feature
- Schedule monthly exports
- Store exports in your backup system (encrypted separately)
- Self-hosted Passwork: backup the database and file storage as part of your infrastructure backup
Recovery scenario: If your infrastructure is encrypted and you can't access Passwork, you need the Passwork backup to get credentials for your other backups. Store at least one Passwork export in a completely separate location.
Grandfather-father-son rotation
For long-term retention without endless storage:
- Daily (son): Keep 7 daily backups
- Weekly (father): Keep 4 weekly backups (every Sunday)
- Monthly (grandfather): Keep 12 monthly backups (first of month)
This gives you a year of monthly restore points, a month of weekly points, and a week of daily points — with only 23 backup copies instead of 365.
Most backup tools support this with retention policies:
# Restic: keep 7 daily, 4 weekly, 12 monthly
restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 12 --prune
# BorgBackup
borg prune --keep-daily=7 --keep-weekly=4 --keep-monthly=12
Cross-region and cross-provider resilience
One cloud provider can have outages. For critical data:
Cross-region (same provider):
# AWS: replicate S3 bucket to another region
aws s3api put-bucket-replication --bucket source-bucket --replication-configuration file://replication.json
Cross-provider:
- Primary backup: AWS S3
- Secondary backup: Backblaze B2 or Google Cloud Storage
- Use rclone to sync between providers
# rclone: sync S3 to Backblaze B2
rclone sync s3:my-backup-bucket b2:my-backup-bucket --transfers 8
Why both: If AWS has a major outage, you can still access backups. If a provider has a security breach, your data is elsewhere.
Quick size sanity check
Add this to backup scripts to catch empty or suspiciously small backups:
#!/bin/bash
BACKUP_FILE="$1"
MIN_SIZE_KB=1000  # Minimum expected size in KB

# A missing backup file is as bad as a tiny one
if [ ! -f "$BACKUP_FILE" ]; then
  echo "WARNING: Backup file not found: $BACKUP_FILE" | mail -s "Backup Size Alert" [email protected]
  exit 1
fi

SIZE=$(du -k "$BACKUP_FILE" | cut -f1)
if [ "$SIZE" -lt "$MIN_SIZE_KB" ]; then
  echo "WARNING: Backup file suspiciously small ($SIZE KB)" | mail -s "Backup Size Alert" [email protected]
  exit 1
fi
Recovery procedures
Document these before you need them.
File recovery procedure
- Identify what needs to be recovered (file name, path, approximate date)
- Access backup storage (credentials in Passwork: "Backup - S3" entry)
- List available snapshots: restic snapshots
- Find the file: restic find filename
- Restore: restic restore snapshot_id --target /restore --include path/to/file
- Verify the restored file
- Move to production location if correct
Database recovery procedure
- Assess the damage (corrupted data, accidental deletion, complete loss)
- Determine recovery point (what time should we restore to?)
- Stop application access to prevent further writes
- Access backup credentials from Passwork
- Download backup file
- Restore to test environment first if time permits
- Restore to production
- Verify data integrity
- Resume application access
- Document what happened and what was lost
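Steps 5 through 7 are worth sketching in advance so nobody improvises them at 3 a.m. A minimal example, assuming PostgreSQL and a custom-format dump made with `pg_dump -Fc`; the database name, dump path, and table are placeholders:

```shell
#!/bin/bash
# Restore a PostgreSQL dump into a scratch database for verification
# before touching production. Assumes a custom-format dump (pg_dump -Fc);
# database, file, and table names are placeholders.
set -euo pipefail

DUMP="/restore/app-db.dump"

# Restore into a throwaway database first
createdb app_restore_test
pg_restore --dbname=app_restore_test --no-owner "$DUMP"

# Spot-check the data before promoting the restore to production
psql -d app_restore_test -c "SELECT count(*) FROM users;"
```

Only after the spot-checks pass do you repeat the restore against the production database.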
Full server recovery procedure
- Provision new server (same specs as failed server)
- Install base OS
- Retrieve server configuration from backup
- Restore configuration files
- Restore application data
- Restore database
- Verify all services are running
- Update DNS if needed
- Test thoroughly before declaring recovery complete
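Steps 3 through 6 collapse into a couple of commands if you use Restic. A sketch, assuming an S3-hosted repository; the repository URL, password file, and include paths are placeholders:

```shell
#!/bin/bash
# Pull configuration and application data from the latest snapshot onto
# a freshly provisioned server. Repository URL, password file location,
# and paths are placeholders for your environment.
set -euo pipefail
export RESTIC_REPOSITORY="s3:s3.amazonaws.com/my-backup-bucket"
export RESTIC_PASSWORD_FILE="/root/.restic-password"

# Restore to a staging directory first; move files into place only
# after inspecting them, so a bad restore can't clobber a clean OS
restic restore latest --target /restore --include /etc --include /var/www
```

Restoring to `/restore` rather than `/` is deliberate: it lets you diff restored configs against the fresh OS defaults before committing to them.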
Tools and services
Backup software
Open source:
- Restic — Fast, encrypted, deduplicated
- BorgBackup — Excellent deduplication
- Duplicati — GUI-based, good for workstations
- Velero — Kubernetes backups
Commercial:
Backup storage
Cloud object storage:
- AWS S3 — Standard choice, Glacier for archives
- Backblaze B2 — Much cheaper than S3, S3-compatible API
- Wasabi — Cheap, no egress fees
- Google Cloud Storage — Good if you're on GCP
Backup-specific services:
- Backblaze Personal/Business — Simple workstation backup
- Tarsnap — Secure, pay-per-use, Unix-focused
Monitoring
- Healthchecks.io — Dead man's switch for cron jobs
- Cronitor — Similar, more features
- Cloud-native: AWS CloudWatch, GCP Monitoring, Azure Monitor
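Wiring a dead man's switch into a backup job is one extra line at the end of the script: ping the monitoring service only on success, and let its absence trigger the alert. A sketch using Healthchecks.io's ping endpoint; the backup script path and check UUID are placeholders:

```shell
#!/bin/bash
# Run the backup, then ping Healthchecks.io only if it succeeded.
# If pings stop arriving on schedule, the service emails you -- so a
# silently dead cron job gets caught. Script path and UUID are placeholders.
set -euo pipefail

/usr/local/bin/run-backup.sh

# Signal success; Healthchecks.io also offers a /fail endpoint you can
# call from an error trap for explicit failure alerts
curl -fsS --retry 3 "https://hc-ping.com/your-check-uuid-here"
```

This inverts the usual monitoring logic: instead of trusting the backup to report failure, you treat silence itself as failure.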
Workshop: implement your backup strategy
Block 3-4 hours for initial setup, then ongoing maintenance.
Part 1: Inventory and classification (45 minutes)
- List all data that needs backup
- Classify by priority (critical, important, standard, low)
- Identify current backup gaps
- Document where each type of data lives
Part 2: Set up primary backup (60 minutes)
- Choose backup tool (Restic, Borg, or cloud-native)
- Configure for your critical data
- Set up automated schedule
- Verify first backup completes successfully
- Document the configuration
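With Restic, the steps above reduce to a handful of commands. A sketch assuming an S3 bucket as the repository; the bucket name, password file, and backup paths are placeholders:

```shell
# Initialize the encrypted repository (one time).
# Bucket name, password file, and paths are placeholders.
export RESTIC_REPOSITORY="s3:s3.amazonaws.com/my-backup-bucket"
export RESTIC_PASSWORD_FILE="/root/.restic-password"
restic init

# First backup of the critical data
restic backup /var/www /etc /home/shared

# Automate: run nightly at 02:00 via cron, e.g.
# 0 2 * * * restic backup /var/www /etc /home/shared
```

Keep a copy of the repository password outside this infrastructure; without it the encrypted repository is unrecoverable.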
Part 3: Set up offsite/immutable backup (45 minutes)
- Create secondary backup destination (different provider/account)
- Enable immutability if available
- Configure replication from primary
- Verify backup reaches secondary location
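On AWS, "immutability" means S3 Object Lock, and the catch is that it can only be enabled when the bucket is created. A sketch; the bucket name, region, and retention period are placeholders:

```shell
# Object Lock must be enabled at bucket creation time.
# Bucket name, region, and retention days are placeholders.
aws s3api create-bucket --bucket my-immutable-backups \
  --object-lock-enabled-for-bucket \
  --create-bucket-configuration LocationConstraint=eu-west-1 \
  --region eu-west-1

# Default retention in COMPLIANCE mode: nobody, including the root
# account, can delete or overwrite objects for the retention period
aws s3api put-object-lock-configuration --bucket my-immutable-backups \
  --object-lock-configuration \
  '{"ObjectLockEnabled":"Enabled","Rule":{"DefaultRetention":{"Mode":"COMPLIANCE","Days":30}}}'
```

COMPLIANCE mode is the ransomware-relevant setting: GOVERNANCE mode can be bypassed by privileged users, which is exactly what a compromised admin account is.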
Part 4: Test recovery (30 minutes)
- Pick a random file from backup
- Restore it to a test location
- Verify contents match original
- Document the restore process
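The "verify contents match" step is easy to automate: compare checksums rather than eyeballing file sizes. A minimal sketch; the function name and paths are illustrative:

```shell
#!/bin/bash
# Compare a restored file against the production original by checksum.
# Function name and example paths are illustrative placeholders.
verify_restore() {
  local original="$1" restored="$2"
  if [ "$(sha256sum < "$original")" = "$(sha256sum < "$restored")" ]; then
    echo "OK: restored copy matches original"
    return 0
  else
    echo "MISMATCH: restored copy differs from original"
    return 1
  fi
}
```

Usage: `verify_restore /data/report.xlsx /restore/data/report.xlsx` — a nonzero exit code means the restore failed verification.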
Part 5: Documentation (30 minutes)
- Write your backup policy (use template above)
- Document recovery procedures
- Store credentials in password manager
- Schedule recurring restore tests
Deliverables:
- Complete data inventory
- Automated backups running for critical data
- Offsite/immutable backup configured
- First recovery test completed
- Backup policy documented
- Recovery procedures documented
- Monitoring configured
Talking to leadership
If someone asks why you spent time on this:
"I implemented a comprehensive backup strategy following the 3-2-1 rule — three copies of our data, two different storage types, one offsite. I also enabled immutable storage so ransomware can't delete our backups even if attackers get admin access. And I tested recovery to make sure it actually works. If our servers were encrypted tomorrow, we could recover critical systems within 4 hours."
Short version: "I set up proper backups and tested that we can actually recover from them. Ransomware can't take us down."
Self-check: did you actually do it?
Backup coverage
- Inventoried all data that needs backup
- Classified data by priority
- Production databases are backed up
- File servers/storage are backed up
- Critical SaaS data has backup solution
- System configurations are backed up
3-2-1 implementation
- At least 3 copies of critical data exist
- Backups are on different storage types/providers
- At least one copy is offsite/cloud
- At least one copy is immutable or air-gapped
Automation and monitoring
- Backups run automatically on schedule
- Monitoring alerts if backup fails
- Backup logs are retained
Testing and documentation
- Completed at least one test restore
- Backup policy documented
- Recovery procedures documented
- Credentials stored in password manager
- Recovery test schedule created
Security and compliance
- Backups are encrypted (at rest and in transit)
- Encryption keys stored separately from backups
- Backup location documented (for GDPR)
- Retention periods defined
- At least two people can access backup credentials
Access control
- Backup access register maintained (who can access what)
- Backup credentials separate from production credentials
- Critical data backups use asymmetric encryption (RSA + AES)
- Private key stored securely (Passwork + offline backup)
- Emergency access procedure documented
If you can check off at least 18 of these 28 items, you're ready to move on.
What's next
Backups are solid. One more quick win before moving on to development security.
Next chapter: website protection with Cloudflare — DDoS protection, WAF, and bot mitigation for your public-facing sites.