Project 9: Azure Monitor & Log Analytics β
Overview β
Build a comprehensive monitoring solution using Azure Monitor, Log Analytics, alerts, and dashboards. Master KQL queries for the AZ-104 exam and real-world troubleshooting.
Difficulty: Intermediate
Duration: 4-5 hours
Cost: ~$20-50/month (Log Analytics data ingestion)
Exam Weight: Monitor & Maintain domain (10-15%)
Architecture Diagram β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β DATA SOURCES β
β β
β ββββββββββββββββ ββββββββββββββββ ββββββββββββββββ ββββββββββββββββββββββββ β
β β Azure VMs β β App β β Azure β β Custom β β
β β (Windows/ β β Services β β Resources β β Applications β β
β β Linux) β β β β (SQL, etc) β β β β
β β β β β β β β β β
β β Agents: β β Built-in β β Diagnostic β β Application β β
β β - AMA β β Telemetry β β Settings β β Insights SDK β β
β β - Legacy β β β β β β β β
β ββββββββ¬ββββββββ ββββββββ¬ββββββββ ββββββββ¬ββββββββ ββββββββββββ¬ββββββββββββ β
β β β β β β
β βββββββββββββββββββ΄ββββββββββββββββββ΄βββββββββββββββββββββββ β
β β β
βββββββββββββββββββββββββββββββββββββββΌβββββββββββββββββββββββββββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β AZURE MONITOR PLATFORM β
β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β LOG ANALYTICS WORKSPACE β β
β β (law-monitoring-lab) β β
β β β β
β β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β β
β β β DATA TABLES β β β
β β β β β β
β β β ββββββββββββββ ββββββββββββββ ββββββββββββββ βββββββββββββββββββββββ β β
β β β β Heartbeat β β Perf β β Event β β Syslog ββ β β
β β β β β β β β (Windows) β β (Linux) ββ β β
β β β β VM health β β CPU, Mem, β β Security, β β Auth, daemon ββ β β
β β β β Agent conn β β Disk, Net β β System β β logs ββ β β
β β β ββββββββββββββ ββββββββββββββ ββββββββββββββ βββββββββββββββββββββββ β β
β β β β β β
β β β ββββββββββββββ ββββββββββββββ ββββββββββββββ βββββββββββββββββββββββ β β
β β β β AzureActivityβ β AzureDiag β β AppTraces β β Custom Logs ββ β β
β β β β β β β β β β ββ β β
β β β β Control β β Resource β β App β β Your custom ββ β β
β β β β plane ops β β diagnosticsβ β telemetry β β data ββ β β
β β β ββββββββββββββ ββββββββββββββ ββββββββββββββ βββββββββββββββββββββββ β β
β β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β β
β β β β
β β KQL QUERY ENGINE β β
β β βββββββββββββββββββββββββββ β β
β β β Perf β β β
β β β | where ObjectName == β β β
β β β "Processor" β β β
β β β | summarize avg(...) β β β
β β β | render timechart β β β
β β βββββββββββββββββββββββββββ β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β METRICS β β
β β ββββββββββββββββββ ββββββββββββββββββ ββββββββββββββββββββββββββββββ β β
β β β Platform β β Guest OS β β Custom β β β
β β β Metrics β β Metrics β β Metrics β β β
β β β β β β β β β β
β β β VM: CPU %, β β Memory %, β β Application-specific β β β
β β β Network I/O, β β Disk Queue, β β metrics via β β β
β β β Disk IOPS β β Available RAM β β Application Insights β β β
β β ββββββββββββββββββ ββββββββββββββββββ ββββββββββββββββββββββββββββββ β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β
βΌ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β ALERTS & ACTIONS β
β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β ALERT RULES β β
β β β β
β β ββββββββββββββββββββββ ββββββββββββββββββββββ ββββββββββββββββββββββββ β β
β β β Metric Alert β β Log Alert β β Activity Log β β β
β β β β β β β Alert β β β
β β β "CPU > 80% for β β "Error count > β β "VM deleted" β β β
β β β 5 minutes" β β 10 in 5 min" β β β β β
β β β β β β β β β β
β β β Severity: 2 β β Severity: 1 β β Severity: 2 β β β
β β βββββββββββ¬βββββββββββ βββββββββββ¬βββββββββββ βββββββββββββ¬βββββββββββ β β
β β β β β β β
β ββββββββββββββΌββββββββββββββββββββββββΌββββββββββββββββββββββββββΌβββββββββββββ β
β β β β β
β βββββββββββββββββββββββββ΄ββββββββββββββββββββββββββ β
β β β
β βΌ β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β ACTION GROUPS β β
β β β β
β β ββββββββββββββββββ ββββββββββββββββββ ββββββββββββββββββ β β
β β β Email/SMS β β Azure β β Webhook β β β
β β β Notifications β β Function β β (Teams/Slack) β β β
β β ββββββββββββββββββ ββββββββββββββββββ ββββββββββββββββββ β β
β β β β
β β ββββββββββββββββββ ββββββββββββββββββ ββββββββββββββββββ β β
β β β Logic App β β ITSM β β Runbook β β β
β β β Workflow β β Integration β β (Automation) β β β
β β ββββββββββββββββββ ββββββββββββββββββ ββββββββββββββββββ β β
β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β VISUALIZATION β
β β
β βββββββββββββββββββββββ βββββββββββββββββββββββ βββββββββββββββββββββββββββ β
β β Azure Dashboards β β Workbooks β β Power BI β β
β β β β β β β β
β β Pin charts from β β Interactive β β Advanced analytics β β
β β Metrics Explorer β β reports with β β and reporting β β
β β and Log Analytics β β KQL queries β β β β
β βββββββββββββββββββββββ βββββββββββββββββββββββ βββββββββββββββββββββββββββ β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββWhat You'll Learn β
- Create Log Analytics workspace
- Configure data collection rules
- Install Azure Monitor Agent (AMA)
- Write KQL queries for troubleshooting
- Create metric and log alerts
- Build Azure dashboards
- Configure diagnostic settings
Phase 1: Create Monitoring Infrastructure β
Step 1.1: Create Resource Group and Workspace β
bash
# Set variables
LOCATION="eastus"
RG_NAME="rg-monitoring-lab"
WORKSPACE_NAME="law-monitoring-lab"
# Create resource group
az group create \
--name $RG_NAME \
--location $LOCATION \
--tags Project=Monitoring Environment=Lab
# Create Log Analytics Workspace
az monitor log-analytics workspace create \
--resource-group $RG_NAME \
--workspace-name $WORKSPACE_NAME \
--location $LOCATION \
--retention-time 30
# Get workspace ID and key
WORKSPACE_ID=$(az monitor log-analytics workspace show \
--resource-group $RG_NAME \
--workspace-name $WORKSPACE_NAME \
--query customerId -o tsv)
WORKSPACE_KEY=$(az monitor log-analytics workspace get-shared-keys \
--resource-group $RG_NAME \
--workspace-name $WORKSPACE_NAME \
--query primarySharedKey -o tsv)
echo "Workspace ID: $WORKSPACE_ID"
echo "Workspace created: $WORKSPACE_NAME"Step 1.2: Create Test VMs for Monitoring β
bash
# Create Windows VM
az vm create \
--resource-group $RG_NAME \
--name vm-windows-mon \
--image Win2022Datacenter \
--size Standard_D2s_v3 \
--admin-username azureadmin \
--admin-password 'P@ssw0rd123!Complex' \
--public-ip-sku Standard \
--no-wait
# Create Linux VM
az vm create \
--resource-group $RG_NAME \
--name vm-linux-mon \
--image Ubuntu2204 \
--size Standard_D2s_v3 \
--admin-username azureadmin \
--generate-ssh-keys \
--public-ip-sku Standard \
--no-wait
# Wait for VMs
az vm wait --resource-group $RG_NAME --name vm-windows-mon --created
az vm wait --resource-group $RG_NAME --name vm-linux-mon --created
echo "Test VMs created"Phase 2: Configure Data Collection β
Step 2.1: Create Data Collection Rule β
bash
# Create Data Collection Rule for Windows
az monitor data-collection rule create \
--resource-group $RG_NAME \
--name dcr-windows-perf \
--location $LOCATION \
--data-flows '[{
"destinations": ["la-workspace"],
"streams": ["Microsoft-Perf", "Microsoft-Event"]
}]' \
--destinations '{
"logAnalytics": [{
"name": "la-workspace",
"workspaceResourceId": "/subscriptions/'$(az account show --query id -o tsv)'/resourceGroups/'$RG_NAME'/providers/Microsoft.OperationalInsights/workspaces/'$WORKSPACE_NAME'"
}]
}' \
--data-sources '{
"performanceCounters": [{
"name": "perfCounterDataSource",
"streams": ["Microsoft-Perf"],
"samplingFrequencyInSeconds": 60,
"counterSpecifiers": [
"\\Processor(_Total)\\% Processor Time",
"\\Memory\\Available MBytes",
"\\Memory\\% Committed Bytes In Use",
"\\LogicalDisk(_Total)\\% Free Space",
"\\LogicalDisk(_Total)\\Disk Reads/sec",
"\\LogicalDisk(_Total)\\Disk Writes/sec"
]
}],
"windowsEventLogs": [{
"name": "eventLogsDataSource",
"streams": ["Microsoft-Event"],
"xPathQueries": [
"Application!*[System[(Level=1 or Level=2 or Level=3)]]",
"System!*[System[(Level=1 or Level=2 or Level=3)]]",
"Security!*[System[(band(Keywords,13510798882111488))]]"
]
}]
}'
echo "Data Collection Rule created"Step 2.2: Install Azure Monitor Agent β
bash
# Install AMA on Windows VM
az vm extension set \
--resource-group $RG_NAME \
--vm-name vm-windows-mon \
--name AzureMonitorWindowsAgent \
--publisher Microsoft.Azure.Monitor \
--enable-auto-upgrade true
# Install AMA on Linux VM
az vm extension set \
--resource-group $RG_NAME \
--vm-name vm-linux-mon \
--name AzureMonitorLinuxAgent \
--publisher Microsoft.Azure.Monitor \
--enable-auto-upgrade true
echo "Azure Monitor Agent installed on both VMs"Step 2.3: Associate DCR with VMs β
bash
# Get VM resource IDs
WINDOWS_VM_ID=$(az vm show -g $RG_NAME -n vm-windows-mon --query id -o tsv)
LINUX_VM_ID=$(az vm show -g $RG_NAME -n vm-linux-mon --query id -o tsv)
# Get DCR ID
DCR_ID=$(az monitor data-collection rule show \
--resource-group $RG_NAME \
--name dcr-windows-perf \
--query id -o tsv)
# Create association for Windows VM
az monitor data-collection rule association create \
--name "windows-dcr-association" \
--resource $WINDOWS_VM_ID \
--rule-id $DCR_ID
echo "DCR associated with VMs"Phase 3: KQL Queries for Monitoring β
Step 3.1: Basic KQL Queries β
Run these queries in Log Analytics (Azure Portal β Monitor β Logs):
kusto
// Query 1: Check connected VMs (Heartbeat)
Heartbeat
| where TimeGenerated > ago(1h)
| summarize LastHeartbeat = max(TimeGenerated) by Computer
| project Computer, LastHeartbeat,
Status = iff(LastHeartbeat < ago(5m), "Disconnected", "Connected")
// Query 2: CPU Performance over time
Perf
| where ObjectName == "Processor"
| where CounterName == "% Processor Time"
| where InstanceName == "_Total"
| where TimeGenerated > ago(1h)
| summarize AvgCPU = avg(CounterValue) by bin(TimeGenerated, 5m), Computer
| render timechart
// Query 3: Memory Usage
Perf
| where ObjectName == "Memory"
| where CounterName == "% Committed Bytes In Use"
| where TimeGenerated > ago(1h)
| summarize AvgMemory = avg(CounterValue) by bin(TimeGenerated, 5m), Computer
| render timechart
// Query 4: Disk Free Space
Perf
| where ObjectName == "LogicalDisk"
| where CounterName == "% Free Space"
| where InstanceName != "_Total"
| where TimeGenerated > ago(1h)
| summarize MinFreeSpace = min(CounterValue) by Computer, InstanceName
| where MinFreeSpace < 20
| order by MinFreeSpace ascStep 3.2: Security and Event Queries β
kusto
// Query 5: Failed Login Attempts (Windows)
Event
| where EventLog == "Security"
| where EventID == 4625
| where TimeGenerated > ago(24h)
| summarize FailedLogins = count() by Computer, Account = tostring(EventData)
| order by FailedLogins desc
// Query 6: Windows Service Crashes
Event
| where EventLog == "System"
| where EventLevelName == "Error"
| where TimeGenerated > ago(24h)
| summarize Count = count() by Source, EventID
| order by Count desc
| take 10
// Query 7: Azure Activity - Resource Changes
AzureActivity
| where TimeGenerated > ago(24h)
| where OperationNameValue contains "write" or OperationNameValue contains "delete"
| project TimeGenerated, OperationNameValue, ResourceGroup,
Resource = tostring(split(ResourceId, "/")[-1]),
Caller, ActivityStatusValue
| order by TimeGenerated desc
// Query 8: VM Start/Stop Events
AzureActivity
| where TimeGenerated > ago(7d)
| where OperationNameValue in (
"Microsoft.Compute/virtualMachines/start/action",
"Microsoft.Compute/virtualMachines/deallocate/action",
"Microsoft.Compute/virtualMachines/powerOff/action"
)
| project TimeGenerated,
VM = tostring(split(ResourceId, "/")[-1]),
Operation = tostring(split(OperationNameValue, "/")[-2]),
Caller
| order by TimeGenerated descStep 3.3: Advanced KQL Queries β
kusto
// Query 9: Top 5 Processes by CPU (requires VM Insights)
InsightsMetrics
| where Namespace == "Processor"
| where Name == "UtilizationPercentage"
| where TimeGenerated > ago(1h)
| summarize AvgCPU = avg(Val) by Computer, tostring(Tags.["vm.azm.ms/process"])
| top 5 by AvgCPU desc
// Query 10: Network Traffic Analysis
Perf
| where ObjectName == "Network Adapter"
| where CounterName in ("Bytes Received/sec", "Bytes Sent/sec")
| where TimeGenerated > ago(1h)
| summarize
TotalReceived = sum(iff(CounterName == "Bytes Received/sec", CounterValue, 0)),
TotalSent = sum(iff(CounterName == "Bytes Sent/sec", CounterValue, 0))
by Computer, bin(TimeGenerated, 5m)
| render timechart
// Query 11: Anomaly Detection - Sudden CPU Spike
let threshold = 80;
let spike_threshold = 30;
Perf
| where ObjectName == "Processor"
| where CounterName == "% Processor Time"
| where InstanceName == "_Total"
| where TimeGenerated > ago(4h)
| summarize AvgCPU = avg(CounterValue) by bin(TimeGenerated, 5m), Computer
| serialize
| extend PreviousCPU = prev(AvgCPU, 1)
| extend Spike = AvgCPU - PreviousCPU
| where Spike > spike_threshold and AvgCPU > threshold
| project TimeGenerated, Computer, AvgCPU, Spike
// Query 12: Create Summary Report
Heartbeat
| where TimeGenerated > ago(24h)
| summarize
LastSeen = max(TimeGenerated),
HeartbeatCount = count()
by Computer, OSType, Version
| join kind=leftouter (
Perf
| where TimeGenerated > ago(1h)
| where ObjectName == "Processor" and CounterName == "% Processor Time" and InstanceName == "_Total"
| summarize AvgCPU = round(avg(CounterValue), 2) by Computer
) on Computer
| project Computer, OSType, LastSeen, HeartbeatCount, AvgCPU
| order by ComputerPhase 4: Create Alerts β
Step 4.1: Create Action Group β
bash
# Create action group for email notifications
az monitor action-group create \
--resource-group $RG_NAME \
--name ag-email-alerts \
--short-name EmailAlerts \
--action email admin-email admin@contoso.com
# Create action group with webhook (for Teams/Slack)
az monitor action-group create \
--resource-group $RG_NAME \
--name ag-webhook-alerts \
--short-name WebhookAlrt \
--action webhook teams-webhook "https://your-teams-webhook-url"
echo "Action groups created"Step 4.2: Create Metric Alert - High CPU β
bash
# Get VM resource ID
WINDOWS_VM_ID=$(az vm show -g $RG_NAME -n vm-windows-mon --query id -o tsv)
# Create CPU alert
az monitor metrics alert create \
--resource-group $RG_NAME \
--name alert-high-cpu \
--scopes $WINDOWS_VM_ID \
--condition "avg Percentage CPU > 80" \
--window-size 5m \
--evaluation-frequency 1m \
--action ag-email-alerts \
--severity 2 \
--description "Alert when CPU exceeds 80% for 5 minutes"
echo "CPU alert created"Step 4.3: Create Metric Alert - Low Disk Space β
bash
# Create disk space alert
az monitor metrics alert create \
--resource-group $RG_NAME \
--name alert-low-disk \
--scopes $WINDOWS_VM_ID \
--condition "avg OS Disk Used Percentage > 85" \
--window-size 15m \
--evaluation-frequency 5m \
--action ag-email-alerts \
--severity 1 \
--description "Alert when disk usage exceeds 85%"
echo "Disk alert created"Step 4.4: Create Log Alert - Error Events β
bash
# Get workspace resource ID
WORKSPACE_RESOURCE_ID=$(az monitor log-analytics workspace show \
--resource-group $RG_NAME \
--workspace-name $WORKSPACE_NAME \
--query id -o tsv)
# Create log alert for errors
az monitor scheduled-query create \
--resource-group $RG_NAME \
--name alert-windows-errors \
--scopes $WORKSPACE_RESOURCE_ID \
--condition "count > 10" \
--condition-query "Event | where EventLevelName == 'Error' | where TimeGenerated > ago(5m)" \
--evaluation-frequency 5m \
--window-size 5m \
--action-groups ag-email-alerts \
--severity 2 \
--description "Alert when more than 10 error events in 5 minutes"
echo "Log alert created"Step 4.5: Create Activity Log Alert β
bash
# Create alert for VM deletion
az monitor activity-log alert create \
--resource-group $RG_NAME \
--name alert-vm-deleted \
--action-group ag-email-alerts \
--condition category=Administrative \
--condition operationName="Microsoft.Compute/virtualMachines/delete" \
--scope "/subscriptions/$(az account show --query id -o tsv)" \
--description "Alert when any VM is deleted"
echo "Activity log alert created"Phase 5: Configure Diagnostic Settings β
Step 5.1: Enable VM Diagnostics β
bash
# Enable boot diagnostics
az vm boot-diagnostics enable \
--resource-group $RG_NAME \
--name vm-windows-mon
az vm boot-diagnostics enable \
--resource-group $RG_NAME \
--name vm-linux-mon
echo "Boot diagnostics enabled"Step 5.2: Configure Resource Diagnostics β
bash
# Get workspace resource ID
WORKSPACE_RESOURCE_ID=$(az monitor log-analytics workspace show \
--resource-group $RG_NAME \
--workspace-name $WORKSPACE_NAME \
--query id -o tsv)
# Enable diagnostics for VM
az monitor diagnostic-settings create \
--resource $WINDOWS_VM_ID \
--name "vm-diagnostics" \
--workspace $WORKSPACE_RESOURCE_ID \
--metrics '[{"category": "AllMetrics", "enabled": true}]'
echo "Diagnostic settings configured"Phase 6: Create Azure Dashboard β
Step 6.1: Create Dashboard via CLI β
bash
# Create a dashboard JSON
cat > dashboard.json << 'EOF'
{
"properties": {
"lenses": [
{
"order": 0,
"parts": [
{
"position": {"x": 0, "y": 0, "rowSpan": 4, "colSpan": 6},
"metadata": {
"type": "Extension/HubsExtension/PartType/MarkdownPart",
"inputs": [],
"settings": {
"content": {
"settings": {
"content": "# Monitoring Dashboard\n\nReal-time monitoring for AZ-104 Lab",
"title": "Overview",
"subtitle": "Monitoring Lab"
}
}
}
}
}
]
}
],
"metadata": {
"model": {
"timeRange": {
"value": {"relative": {"duration": 24, "timeUnit": 1}},
"type": "MsPortalFx.Composition.Configuration.ValueTypes.TimeRange"
}
}
}
},
"name": "monitoring-dashboard",
"type": "Microsoft.Portal/dashboards",
"location": "eastus",
"tags": {"hidden-title": "Monitoring Dashboard"}
}
EOF
# Create dashboard
az portal dashboard create \
--resource-group $RG_NAME \
--name monitoring-dashboard \
--input-path dashboard.json
echo "Dashboard created - customize in Azure Portal"Phase 7: Workbooks β
Step 7.1: Create Workbook via Portal β
Navigate to Azure Portal β Monitor β Workbooks β New
Add the following KQL visualizations:
Section 1: VM Health Overview
kusto
Heartbeat
| where TimeGenerated > ago(1h)
| summarize LastHeartbeat = max(TimeGenerated) by Computer, OSType
| extend Status = iff(LastHeartbeat > ago(5m), "Healthy", "Unhealthy")
| project Computer, OSType, LastHeartbeat, StatusSection 2: CPU Trend Chart
kusto
Perf
| where ObjectName == "Processor"
| where CounterName == "% Processor Time"
| where InstanceName == "_Total"
| where TimeGenerated > ago(4h)
| summarize AvgCPU = avg(CounterValue) by bin(TimeGenerated, 5m), Computer
| render timechartSection 3: Top Errors
kusto
Event
| where EventLevelName == "Error"
| where TimeGenerated > ago(24h)
| summarize Count = count() by Source, EventID
| top 10 by Count
| render barchartExam Preparation: Key KQL Concepts β
| Operator | Purpose | Example |
|---|---|---|
where | Filter rows | where TimeGenerated > ago(1h) |
summarize | Aggregate data | summarize count() by Computer |
project | Select columns | project Computer, CPU |
extend | Add calculated column | extend Status = iff(CPU > 80, "High", "Normal") |
join | Combine tables | `Table1 |
render | Visualize | render timechart |
top | Limit results | top 10 by Count |
order by | Sort results | order by TimeGenerated desc |
Cleanup β
bash
# Delete resource group
az group delete --name $RG_NAME --yes --no-wait
echo "Cleanup initiated"Key Takeaways β
- Azure Monitor Agent (AMA): Replaces legacy Log Analytics agent
- Data Collection Rules: Define what data to collect and where to send
- KQL: Essential for log queries and creating alerts
- Alert Types: Metric, Log, and Activity Log alerts
- Action Groups: Reusable notification configurations
- Workbooks: Interactive reports with KQL queries
Next Steps β
- Enable VM Insights for detailed performance monitoring
- Configure Network Watcher for network troubleshooting
- Set up Application Insights for application monitoring
- Create custom KQL functions for complex queries