Project 4: Disaster Recovery Solution ​

Overview ​

Implement a comprehensive disaster recovery solution using Azure Backup and Azure Site Recovery. This project covers VM backup, file-level recovery, and cross-region replication for business continuity.

Difficulty: Intermediate
Duration: 3-4 hours
Cost: ~$50-100/month (ASR, backup storage)

Architecture Diagram ​

┌──────────────────────────────────────────────────────────────────────────────────┐
│                             PRIMARY REGION (East US)                             │
│  ┌────────────────────────────────────────────────────────────────────────────┐  │
│  │                      VNet: vnet-primary (10.0.0.0/16)                      │  │
│  │  ┌──────────────────────────────────────────────────────────────────────┐  │  │
│  │  │                 Subnet: snet-workloads (10.0.1.0/24)                 │  │  │
│  │  │                                                                      │  │  │
│  │  │  ┌──────────────┐  ┌──────────────┐  ┌────────────────────────────┐  │  │  │
│  │  │  │  vm-web-01   │  │  vm-app-01   │  │  vm-sql-01                 │  │  │  │
│  │  │  │  Web Server  │  │  App Server  │  │  SQL Server                │  │  │  │
│  │  │  │              │  │              │  │                            │  │  │  │
│  │  │  │ ┌──────────┐ │  │ ┌──────────┐ │  │ ┌────────────────────────┐ │  │  │  │
│  │  │  │ │ Azure    │ │  │ │ Azure    │ │  │ │ Azure Backup           │ │  │  │  │
│  │  │  │ │ Backup   │ │  │ │ Backup   │ │  │ │ + SQL Backup           │ │  │  │  │
│  │  │  │ │ Agent    │ │  │ │ Agent    │ │  │ │                        │ │  │  │  │
│  │  │  │ └────┬─────┘ │  │ └────┬─────┘ │  │ └───────────┬────────────┘ │  │  │  │
│  │  │  └──────┼───────┘  └──────┼───────┘  └─────────────┼──────────────┘  │  │  │
│  │  └─────────┼─────────────────┼────────────────────────┼─────────────────┘  │  │
│  └────────────┼─────────────────┼────────────────────────┼────────────────────┘  │
│               │                 │                        │                       │
│               └─────────────────┴────────────────────────┘                       │
│                                 │                                                │
│                                 ▼                                                │
│  ┌────────────────────────────────────────────────────────────────────────────┐  │
│  │                RECOVERY SERVICES VAULT (rsv-primary-eastus)                │  │
│  │  ┌──────────────────────────────────────────────────────────────────────┐  │  │
│  │  │  Backup Policies:                                                    │  │  │
│  │  │  - Daily backup at 2:00 AM UTC                                       │  │  │
│  │  │  - Retain daily backups: 30 days                                     │  │  │
│  │  │  - Retain weekly backups: 12 weeks                                   │  │  │
│  │  │  - Retain monthly backups: 12 months                                 │  │  │
│  │  │  - Retain yearly backups: 3 years                                    │  │  │
│  │  └──────────────────────────────────────────────────────────────────────┘  │  │
│  │                                                                            │  │
│  │  ┌──────────────────────────────────────────────────────────────────────┐  │  │
│  │  │  Azure Site Recovery (Replication to DR Region)                      │  │  │
│  │  │  - RPO: 15 minutes                                                   │  │  │
│  │  │  - RTO: < 2 hours                                                    │  │  │
│  │  └──────────────────────────────────────────────────────────────────────┘  │  │
│  └────────────────────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────────────────┘
                                        │
                                        │ Continuous Replication
                                        │ (Azure Site Recovery)
                                        ▼
┌──────────────────────────────────────────────────────────────────────────────────┐
│                           SECONDARY REGION (West US 2)                           │
│  ┌────────────────────────────────────────────────────────────────────────────┐  │
│  │                        VNet: vnet-dr (10.1.0.0/16)                         │  │
│  │  ┌──────────────────────────────────────────────────────────────────────┐  │  │
│  │  │                    Subnet: snet-dr (10.1.1.0/24)                     │  │  │
│  │  │                                                                      │  │  │
│  │  │  ┌──────────────┐  ┌──────────────┐  ┌────────────────────────────┐  │  │  │
│  │  │  │  vm-web-01   │  │  vm-app-01   │  │  vm-sql-01                 │  │  │  │
│  │  │  │  (Replica)   │  │  (Replica)   │  │  (Replica)                 │  │  │  │
│  │  │  │  STANDBY     │  │  STANDBY     │  │  STANDBY                   │  │  │  │
│  │  │  │              │  │              │  │                            │  │  │  │
│  │  │  │  Powered off │  │  Powered off │  │  Powered off               │  │  │  │
│  │  │  │  until       │  │  until       │  │  until                     │  │  │  │
│  │  │  │  failover    │  │  failover    │  │  failover                  │  │  │  │
│  │  │  └──────────────┘  └──────────────┘  └────────────────────────────┘  │  │  │
│  │  └──────────────────────────────────────────────────────────────────────┘  │  │
│  └────────────────────────────────────────────────────────────────────────────┘  │
│                                                                                  │
│  ┌────────────────────────────────────────────────────────────────────────────┐  │
│  │             GRS Storage (Backup Data Replication)                          │  │
│  │             - Automatic geo-replication of backup data                     │  │
│  │             - Cross-region restore capability                              │  │
│  └────────────────────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────────────────────┘

Disaster Recovery Flow:
┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────────┐
│ Disaster │───▶│  Detect  │───▶│ Failover │───▶│  VMs Active  │
│  Event   │    │  & Alert │    │  to DR   │    │  in DR Region│
└──────────┘    └──────────┘    └──────────┘    └──────────────┘

What You'll Learn ​

  • Configure Azure Backup for VMs
  • Create and manage backup policies
  • Perform file-level and full VM restore
  • Set up Azure Site Recovery (ASR)
  • Execute test failover and planned failover
  • Implement recovery plans with automation

Prerequisites ​

  • Azure subscription
  • Azure CLI installed
  • Two Azure regions available

Phase 1: Set Up Primary Infrastructure ​

Step 1.1: Create Resource Groups ​

bash
# Set variables
PRIMARY_LOCATION="eastus"
DR_LOCATION="westus2"
RG_PRIMARY="rg-dr-lab-eastus"
RG_DR="rg-dr-lab-westus2"

# Create primary resource group
az group create \
  --name $RG_PRIMARY \
  --location $PRIMARY_LOCATION \
  --tags Project=DisasterRecovery Environment=Lab Role=Primary

# Create DR resource group
az group create \
  --name $RG_DR \
  --location $DR_LOCATION \
  --tags Project=DisasterRecovery Environment=Lab Role=DR

echo "Resource groups created in both regions"

Step 1.2: Create Primary VNet ​

bash
# Create primary VNet
az network vnet create \
  --resource-group $RG_PRIMARY \
  --name vnet-primary \
  --address-prefix 10.0.0.0/16 \
  --subnet-name snet-workloads \
  --subnet-prefix 10.0.1.0/24 \
  --location $PRIMARY_LOCATION

# Add Azure Bastion subnet
az network vnet subnet create \
  --resource-group $RG_PRIMARY \
  --vnet-name vnet-primary \
  --name AzureBastionSubnet \
  --address-prefix 10.0.2.0/27

Step 1.3: Create DR VNet ​

bash
# Create DR VNet (non-overlapping address space; used as the failover target network)
az network vnet create \
  --resource-group $RG_DR \
  --name vnet-dr \
  --address-prefix 10.1.0.0/16 \
  --subnet-name snet-dr \
  --subnet-prefix 10.1.1.0/24 \
  --location $DR_LOCATION
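
The two VNets use different address spaces (10.0.0.0/16 and 10.1.0.0/16), which matters if they are ever peered or otherwise connected. A quick pre-deployment check could be sketched in plain bash as follows. The helper names `ip_to_int` and `cidr_overlap` are illustrative and not part of the Azure CLI, and the sketch assumes well-formed IPv4 CIDR input.

```bash
#!/usr/bin/env bash
# Illustrative helpers (not part of the Azure CLI).

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed (exit 0) when two CIDR blocks overlap, fail (exit 1) otherwise.
# Two blocks overlap exactly when they agree on the shorter of the two prefixes.
cidr_overlap() {
  local ip1=${1%/*} len1=${1#*/} ip2=${2%/*} len2=${2#*/}
  local len=$(( len1 < len2 ? len1 : len2 ))
  local mask=$(( len == 0 ? 0 : (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
  local n1 n2
  n1=$(ip_to_int "$ip1")
  n2=$(ip_to_int "$ip2")
  (( (n1 & mask) == (n2 & mask) ))
}

if cidr_overlap "10.0.0.0/16" "10.1.0.0/16"; then
  echo "Address spaces overlap"
else
  echo "Address spaces do not overlap"
fi
```

Comparing under the shorter prefix is enough because a CIDR block is either fully contained in another or disjoint from it.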

Step 1.4: Deploy Primary VMs ​

bash
ADMIN_USER="azureadmin"
# Lab-only example password: replace it, and never commit real credentials to scripts
ADMIN_PASSWORD="P@ssw0rd123!Complex"

# Create Web VM
az vm create \
  --resource-group $RG_PRIMARY \
  --name vm-web-01 \
  --vnet-name vnet-primary \
  --subnet snet-workloads \
  --image Win2022Datacenter \
  --size Standard_D2s_v3 \
  --admin-username $ADMIN_USER \
  --admin-password $ADMIN_PASSWORD \
  --public-ip-address "" \
  --no-wait

# Create App VM
az vm create \
  --resource-group $RG_PRIMARY \
  --name vm-app-01 \
  --vnet-name vnet-primary \
  --subnet snet-workloads \
  --image Ubuntu2204 \
  --size Standard_D2s_v3 \
  --admin-username $ADMIN_USER \
  --generate-ssh-keys \
  --public-ip-address "" \
  --no-wait

# Create SQL VM
az vm create \
  --resource-group $RG_PRIMARY \
  --name vm-sql-01 \
  --vnet-name vnet-primary \
  --subnet snet-workloads \
  --image MicrosoftSQLServer:sql2022-ws2022:standard-gen2:latest \
  --size Standard_D4s_v3 \
  --admin-username $ADMIN_USER \
  --admin-password $ADMIN_PASSWORD \
  --public-ip-address "" \
  --no-wait

# Wait for VMs
echo "Waiting for VMs to be created..."
az vm wait --resource-group $RG_PRIMARY --name vm-web-01 --created
az vm wait --resource-group $RG_PRIMARY --name vm-app-01 --created
az vm wait --resource-group $RG_PRIMARY --name vm-sql-01 --created

echo "All VMs created"

Phase 2: Configure Azure Backup ​

Step 2.1: Create Recovery Services Vault ​

bash
# Create Recovery Services Vault
az backup vault create \
  --resource-group $RG_PRIMARY \
  --name rsv-primary-eastus \
  --location $PRIMARY_LOCATION

echo "Recovery Services Vault created"

Step 2.2: Configure Vault Settings ​

bash
# Set vault storage redundancy to Geo-Redundant (GRS)
az backup vault backup-properties set \
  --resource-group $RG_PRIMARY \
  --name rsv-primary-eastus \
  --backup-storage-redundancy GeoRedundant

# Enable cross-region restore (requires GeoRedundant storage; the flag takes True or False)
az backup vault backup-properties set \
  --resource-group $RG_PRIMARY \
  --name rsv-primary-eastus \
  --cross-region-restore-flag True

echo "Vault configured with GRS and cross-region restore"

Step 2.3: Create Backup Policy ​

bash
# Create enhanced backup policy JSON
cat > backup-policy.json << 'EOF'
{
  "properties": {
    "backupManagementType": "AzureIaasVM",
    "schedulePolicy": {
      "schedulePolicyType": "SimpleSchedulePolicy",
      "scheduleRunFrequency": "Daily",
      "scheduleRunTimes": ["2024-01-01T02:00:00Z"],
      "scheduleWeeklyFrequency": 0
    },
    "retentionPolicy": {
      "retentionPolicyType": "LongTermRetentionPolicy",
      "dailySchedule": {
        "retentionTimes": ["2024-01-01T02:00:00Z"],
        "retentionDuration": {
          "count": 30,
          "durationType": "Days"
        }
      },
      "weeklySchedule": {
        "daysOfTheWeek": ["Sunday"],
        "retentionTimes": ["2024-01-01T02:00:00Z"],
        "retentionDuration": {
          "count": 12,
          "durationType": "Weeks"
        }
      },
      "monthlySchedule": {
        "retentionScheduleFormatType": "Weekly",
        "retentionScheduleWeekly": {
          "daysOfTheWeek": ["Sunday"],
          "weeksOfTheMonth": ["First"]
        },
        "retentionTimes": ["2024-01-01T02:00:00Z"],
        "retentionDuration": {
          "count": 12,
          "durationType": "Months"
        }
      },
      "yearlySchedule": {
        "retentionScheduleFormatType": "Weekly",
        "monthsOfYear": ["January"],
        "retentionScheduleWeekly": {
          "daysOfTheWeek": ["Sunday"],
          "weeksOfTheMonth": ["First"]
        },
        "retentionTimes": ["2024-01-01T02:00:00Z"],
        "retentionDuration": {
          "count": 3,
          "durationType": "Years"
        }
      }
    },
    "instantRpRetentionRangeInDays": 5,
    "timeZone": "UTC"
  }
}
EOF

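# Create the custom policy from the JSON definition.
# "az backup policy set" only updates an existing policy, so use "create" here.
# If the CLI rejects the definition, create the policy in the portal instead.
az backup policy create \
  --resource-group $RG_PRIMARY \
  --vault-name rsv-primary-eastus \
  --name EnhancedVMPolicy \
  --backup-management-type AzureIaasVM \
  --policy backup-policy.json || echo "Policy creation failed; use the portal to create the custom policy"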

Step 2.4: Enable Backup for VMs ​

bash
# Enable backup for Web VM
az backup protection enable-for-vm \
  --resource-group $RG_PRIMARY \
  --vault-name rsv-primary-eastus \
  --vm vm-web-01 \
  --policy-name DefaultPolicy

# Enable backup for App VM
az backup protection enable-for-vm \
  --resource-group $RG_PRIMARY \
  --vault-name rsv-primary-eastus \
  --vm vm-app-01 \
  --policy-name DefaultPolicy

# Enable backup for SQL VM
az backup protection enable-for-vm \
  --resource-group $RG_PRIMARY \
  --vault-name rsv-primary-eastus \
  --vm vm-sql-01 \
  --policy-name DefaultPolicy

echo "Backup enabled for all VMs"

Step 2.5: Trigger Initial Backup ​

bash
# Get container names
WEB_CONTAINER=$(az backup container list \
  --resource-group $RG_PRIMARY \
  --vault-name rsv-primary-eastus \
  --backup-management-type AzureIaasVM \
  --query "[?contains(name, 'vm-web-01')].name" -o tsv)

# Trigger backup for Web VM
az backup protection backup-now \
  --resource-group $RG_PRIMARY \
  --vault-name rsv-primary-eastus \
  --container-name $WEB_CONTAINER \
  --item-name vm-web-01 \
  --retain-until $(date -d "+30 days" +%d-%m-%Y) \
  --backup-management-type AzureIaasVM

echo "Initial backup triggered"
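
The `--retain-until` option of `az backup protection backup-now` is documented as a UTC date in day-month-year order. The helper below is a minimal sketch for producing that string; the name `retain_until` is hypothetical, and the fallback branch assumes BSD/macOS `date` when the GNU `-d` syntax is unavailable.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the UTC date N days from now as dd-mm-yyyy.
retain_until() {
  local days=${1:-30}
  # GNU date first; fall back to the BSD/macOS "-v" syntax.
  date -u -d "+${days} days" +%d-%m-%Y 2>/dev/null \
    || date -u -v+"${days}"d +%d-%m-%Y
}

echo "Retain until: $(retain_until 30)"
```

It can then be passed inline, for example `--retain-until "$(retain_until 30)"`.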

Phase 3: Configure Azure Site Recovery ​

Step 3.1: Create Cache Storage Account ​

bash
# Create cache storage account for ASR
CACHE_STORAGE="asrcache$(date +%s | tail -c 8)"
az storage account create \
  --resource-group $RG_PRIMARY \
  --name $CACHE_STORAGE \
  --location $PRIMARY_LOCATION \
  --sku Standard_LRS \
  --kind StorageV2

echo "Cache storage account created: $CACHE_STORAGE"
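
Storage account names must be 3 to 24 characters long and may contain only lowercase letters and digits, so a generated name is worth checking before it is passed to `az storage account create`. The following is a small sketch of such a check; `is_valid_storage_name` is an illustrative helper, not an Azure CLI command, and it does not test global uniqueness.

```bash
#!/usr/bin/env bash
# Illustrative helper: succeed only for syntactically valid storage account names.
is_valid_storage_name() {
  local LC_ALL=C   # keep [a-z] from matching uppercase letters in some locales
  [[ $1 =~ ^[a-z0-9]{3,24}$ ]]
}

# Same naming scheme as the step above: prefix plus the tail of the epoch time.
candidate="asrcache$(date +%s | tail -c 8)"
if is_valid_storage_name "$candidate"; then
  echo "OK: $candidate"
else
  echo "Invalid storage account name: $candidate" >&2
fi
```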

Step 3.2: Configure Site Recovery (Azure Portal) ​

Portal Configuration Required

Some ASR configurations are easier through the Azure Portal. Follow these steps:

  1. Navigate to Recovery Services Vault:

    • Go to rsv-primary-eastus
    • Click "Site Recovery" → "Prepare Infrastructure"
  2. Configure Protection Goal:

    yaml
    Where are your machines located: Azure
    Where do you want to replicate: To Azure
  3. Configure Source Settings:

    yaml
    Source: East US
    Subscription: Your subscription
    Resource group: rg-dr-lab-eastus
    Deployment model: Resource Manager
  4. Configure Target Settings:

    yaml
    Target region: West US 2
    Target subscription: Same
    Target resource group: rg-dr-lab-westus2
    Target virtual network: vnet-dr
    Cache storage account: asrcache[timestamp]

Step 3.3: Enable Replication via CLI ​

bash
# Get source VM details
WEB_VM_ID=$(az vm show -g $RG_PRIMARY -n vm-web-01 --query id -o tsv)
APP_VM_ID=$(az vm show -g $RG_PRIMARY -n vm-app-01 --query id -o tsv)

# Get target subnet ID
DR_SUBNET_ID=$(az network vnet subnet show \
  --resource-group $RG_DR \
  --vnet-name vnet-dr \
  --name snet-dr \
  --query id -o tsv)

# Enable replication for Web VM (using portal is recommended for first time)
echo "Enable replication through Portal:"
echo "1. Recovery Services Vault → Site Recovery → Replicated items"
echo "2. Click + Replicate"
echo "3. Select VMs: vm-web-01, vm-app-01, vm-sql-01"
echo "4. Configure target settings"
echo "5. Enable replication"

Step 3.4: Monitor Replication Status ​

bash
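# Requires the site-recovery extension: az extension add --name site-recovery
# Set these to the source fabric and protection container that ASR created
# (see "az site-recovery fabric list" and "az site-recovery protection-container list")
ASR_FABRIC="<source-fabric-name>"
ASR_CONTAINER="<source-protection-container-name>"

# List replicated items
az site-recovery protected-item list \
  --resource-group $RG_PRIMARY \
  --vault-name rsv-primary-eastus \
  --fabric-name "$ASR_FABRIC" \
  --protection-container "$ASR_CONTAINER" \
  --output table

# Check replication health (the item name and property path may differ in your vault)
az site-recovery protected-item show \
  --resource-group $RG_PRIMARY \
  --vault-name rsv-primary-eastus \
  --fabric-name "$ASR_FABRIC" \
  --protection-container "$ASR_CONTAINER" \
  --name vm-web-01 \
  --query "properties.replicationHealth" -o tsv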
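
Initial replication takes a while, so a status check is usually repeated until the item reports healthy. One way to sketch that is a generic polling helper that reruns any command until it succeeds. `poll_until` is a hypothetical name, and the `replication_is_healthy` wrapper mentioned in the comment is an assumed function you would write around your own status query.

```bash
#!/usr/bin/env bash
# Hypothetical helper: rerun a command until it succeeds or attempts run out.
# Usage: poll_until <max_attempts> <sleep_seconds> <command> [args...]
poll_until() {
  local max=$1 delay=$2 attempt
  shift 2
  for (( attempt = 1; attempt <= max; attempt++ )); do
    if "$@"; then
      echo "succeeded on attempt ${attempt}"
      return 0
    fi
    # Wait between attempts, but not after the final one.
    if (( attempt < max )); then
      sleep "$delay"
    fi
  done
  echo "gave up after ${max} attempts" >&2
  return 1
}

# Example (assumed wrapper that exits 0 once the health query prints "Normal"):
#   poll_until 60 30 replication_is_healthy vm-web-01
```

Keeping the helper independent of any one `az` command lets the same loop serve backup jobs, failover jobs, and VM power-state checks.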

Phase 4: Create Recovery Plan ​

Step 4.1: Create Recovery Plan (Portal) ​

  1. Navigate to Recovery Services Vault → Recovery Plans
  2. Click "+ Recovery plan"
  3. Configure:
yaml
Name: rp-webapp-dr
Source: East US
Target: West US 2
Allow items with deployment model: Resource Manager
Select items:
  Group 1: vm-sql-01 (Database - start first)
  Group 2: vm-app-01 (Application tier)
  Group 3: vm-web-01 (Web tier - start last)

Step 4.2: Add Pre/Post Actions ​

Add automation runbooks to recovery plan:

Pre-failover script (sample):

powershell
# pre-failover.ps1
param(
    [Parameter(Mandatory=$true)]
    [string]$RecoveryPlanContext
)

# Parse context
$context = ConvertFrom-Json $RecoveryPlanContext

# Send notification
$webhook = "https://your-webhook-url"
$body = @{
    text = "DR Failover initiated for $($context.RecoveryPlanName)"
} | ConvertTo-Json

Invoke-RestMethod -Uri $webhook -Method Post -Body $body -ContentType "application/json"

Post-failover script (sample):

powershell
# post-failover.ps1
param(
    [Parameter(Mandatory=$true)]
    [string]$RecoveryPlanContext
)

$context = ConvertFrom-Json $RecoveryPlanContext

# Update DNS records
# Connect to DNS provider API and update records

# Verify services are running
# Test application endpoints

Write-Output "Post-failover tasks completed"

Phase 5: Testing Disaster Recovery ​

Step 5.1: Test Failover (Non-disruptive) ​

bash
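# Perform test failover via CLI (requires the site-recovery extension;
# run the command with --help to confirm the flags for your extension version)
az site-recovery recovery-plan test-failover \
  --resource-group $RG_PRIMARY \
  --vault-name rsv-primary-eastus \
  --name rp-webapp-dr \
  --failover-direction PrimaryToRecovery \
  --network-type VmNetworkAsInput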

Or via Portal:

  1. Go to Recovery Plans → rp-webapp-dr
  2. Click "Test failover"
  3. Select recovery point (Latest processed recommended)
  4. Select test VNet (creates isolated environment)
  5. Click OK

Step 5.2: Validate Test Failover ​

bash
# List VMs in DR region (test VMs will have -test suffix)
az vm list --resource-group $RG_DR --output table

# Verify test VMs are running
az vm get-instance-view \
  --resource-group $RG_DR \
  --name vm-web-01-test \
  --query "instanceView.statuses[1].displayStatus" -o tsv

Validation Checklist:

  • [ ] VMs are powered on
  • [ ] Network connectivity works
  • [ ] Applications respond correctly
  • [ ] Data is consistent
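
The checklist above can also be driven from a script so that each drill leaves a record of what passed. The sketch below is one possible shape: `check` and `summary` are hypothetical helpers, and the probe commands named in the comment are assumptions you would replace with your own tests.

```bash
#!/usr/bin/env bash
# Hypothetical checklist runner: each check is a label plus a command.
PASS=0
FAIL=0

check() {
  local label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "[x] ${label}"
    PASS=$((PASS + 1))
  else
    echo "[ ] ${label}"
    FAIL=$((FAIL + 1))
  fi
}

summary() {
  echo "${PASS} passed, ${FAIL} failed"
  (( FAIL == 0 ))
}

# Example usage (vm_is_running and app_responds are assumed functions):
#   check "VMs are powered on"             vm_is_running vm-web-01-test
#   check "Applications respond correctly" app_responds
#   summary
```

Because `summary` exits non-zero when any check fails, it can gate the cleanup step in an automated drill.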

Step 5.3: Cleanup Test Failover ​

bash
# Cleanup test failover
az site-recovery recovery-plan test-failover-cleanup \
  --resource-group $RG_PRIMARY \
  --vault-name rsv-primary-eastus \
  --name rp-webapp-dr

echo "Test failover cleanup initiated"

Phase 6: File-Level Recovery ​

Step 6.1: Generate Recovery Script ​

bash
# Get latest recovery point
az backup recoverypoint list \
  --resource-group $RG_PRIMARY \
  --vault-name rsv-primary-eastus \
  --container-name $WEB_CONTAINER \
  --item-name vm-web-01 \
  --output table

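# Note the recovery point name (e.g., "12345678901234"), then generate the
# recovery script that mounts the disks from that recovery point
az backup restore files mount-rp \
  --resource-group $RG_PRIMARY \
  --vault-name rsv-primary-eastus \
  --container-name $WEB_CONTAINER \
  --item-name vm-web-01 \
  --rp-name "<recovery-point-name>"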

Step 6.2: Mount Recovery Volume (Portal) ​

  1. Go to Recovery Services Vault → Backup Items → Azure Virtual Machine
  2. Select vm-web-01
  3. Click "File Recovery"
  4. Select recovery point
  5. Download and run the executable on a Windows machine
  6. Mounted disks appear as additional drives
  7. Browse and copy needed files
  8. Click "Unmount Disks" when done

Phase 7: Full VM Restore ​

Step 7.1: Restore VM to New Location ​

bash
# Get recovery point
RECOVERY_POINT=$(az backup recoverypoint list \
  --resource-group $RG_PRIMARY \
  --vault-name rsv-primary-eastus \
  --container-name $WEB_CONTAINER \
  --item-name vm-web-01 \
  --query "[0].name" -o tsv)

# Restore VM
az backup restore restore-disks \
  --resource-group $RG_PRIMARY \
  --vault-name rsv-primary-eastus \
  --container-name $WEB_CONTAINER \
  --item-name vm-web-01 \
  --rp-name $RECOVERY_POINT \
  --storage-account $CACHE_STORAGE \
  --target-resource-group $RG_PRIMARY

echo "Disk restore initiated. Monitor progress in portal."

Step 7.2: Cross-Region Restore ​

bash
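# Restore to the secondary region (requires cross-region restore enabled).
# Cross-region restore targets the vault's Azure paired region (West US for an
# East US vault), so the storage account must already exist in that region.
# Storage account names allow only lowercase letters and digits.
DR_STORAGE="<storage-account-in-paired-region>"
az backup restore restore-disks \
  --resource-group $RG_PRIMARY \
  --vault-name rsv-primary-eastus \
  --container-name $WEB_CONTAINER \
  --item-name vm-web-01 \
  --rp-name $RECOVERY_POINT \
  --storage-account "$DR_STORAGE" \
  --target-resource-group $RG_DR \
  --use-secondary-region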

Phase 8: Monitoring and Alerts ​

Step 8.1: Configure Backup Alerts ​

bash
# Create action group
az monitor action-group create \
  --resource-group $RG_PRIMARY \
  --name ag-backup-alerts \
  --short-name BackupAlrt \
  --action email admin admin@contoso.com

# Configure backup alerts in vault
# This is done via Portal:
# Recovery Services Vault → Monitoring → Alerts → Manage alert rules

Step 8.2: View Backup Reports ​

bash
# Configure diagnostic settings
VAULT_ID=$(az backup vault show \
  --resource-group $RG_PRIMARY \
  --name rsv-primary-eastus \
  --query id -o tsv)

# Create Log Analytics workspace
az monitor log-analytics workspace create \
  --resource-group $RG_PRIMARY \
  --workspace-name law-backup

LAW_ID=$(az monitor log-analytics workspace show \
  --resource-group $RG_PRIMARY \
  --workspace-name law-backup \
  --query id -o tsv)

# Enable diagnostics
az monitor diagnostic-settings create \
  --resource $VAULT_ID \
  --name "backup-diagnostics" \
  --workspace $LAW_ID \
  --logs '[
    {"category": "AzureBackupReport", "enabled": true},
    {"category": "CoreAzureBackup", "enabled": true},
    {"category": "AddonAzureBackupJobs", "enabled": true},
    {"category": "AddonAzureBackupAlerts", "enabled": true},
    {"category": "AddonAzureBackupPolicy", "enabled": true},
    {"category": "AddonAzureBackupStorage", "enabled": true},
    {"category": "AddonAzureBackupProtectedInstance", "enabled": true}
  ]'

DR Metrics Summary ​

Metric                         | Target     | Configuration
-------------------------------|------------|---------------------------
RPO (Recovery Point Objective) | 15 minutes | ASR continuous replication
RTO (Recovery Time Objective)  | < 2 hours  | Recovery plan automation
Backup Frequency               | Daily      | 2:00 AM UTC
Retention - Daily              | 30 days    |
Retention - Weekly             | 12 weeks   |
Retention - Monthly            | 12 months  |
Retention - Yearly             | 3 years    |
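
These targets only mean something when a drill is measured against them. As a rough sketch, the comparison can be scripted as below. `meets_target` is a hypothetical helper, all values are in seconds, the 5400-second figure is a made-up sample rather than a measured result, and treating a value equal to the target as a pass is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: compare a measured duration with its target (seconds).
meets_target() {
  local name=$1 measured=$2 target=$3
  if (( measured <= target )); then
    echo "${name}: ${measured}s within ${target}s target"
  else
    echo "${name}: ${measured}s exceeds ${target}s target"
    return 1
  fi
}

RPO_TARGET=$(( 15 * 60 ))        # 15 minutes
RTO_TARGET=$(( 2 * 60 * 60 ))    # 2 hours

# Sample value only: a drill that took 90 minutes from failover start to validation.
meets_target "RTO" 5400 "$RTO_TARGET"
```

In a real drill the measured value could come from bash's built-in `SECONDS` counter, reset just before the failover command and read after validation completes.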

Cleanup ​

bash
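# Disable replication first
# Portal: Replicated items → Select VMs → Disable replication

# Stop backup protection and delete backup data
# (repeat for vm-app-01 and vm-sql-01 with their own container names).
# If soft delete is enabled, deleted backup data is kept for 14 days and the
# vault cannot be removed until that data is purged.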
az backup protection disable \
  --resource-group $RG_PRIMARY \
  --vault-name rsv-primary-eastus \
  --container-name $WEB_CONTAINER \
  --item-name vm-web-01 \
  --delete-backup-data true \
  --yes

# Delete resource groups
az group delete --name $RG_PRIMARY --yes --no-wait
az group delete --name $RG_DR --yes --no-wait

echo "Cleanup initiated"

Key Takeaways ​

  1. RPO vs RTO: RPO bounds acceptable data loss and RTO bounds acceptable downtime; set both from business requirements
  2. GRS Storage: Automatic geo-replication of backups
  3. ASR: Low RPO (minutes) through continuous replication
  4. Recovery Plans: Orchestrated failover with automation
  5. Test Failovers: Regular DR testing is essential
  6. Cross-Region Restore: Restore backups in the vault's paired secondary region, including while the primary region is down

Next Steps ​

  • Implement Azure Backup for SQL Server in Azure VMs
  • Configure backup for Azure Files
  • Set up Azure Automation runbooks for DR
  • Implement Azure Traffic Manager for DNS failover

Released under the MIT License.