AIOps: Automatic Azure Alert Triage with Claude and Logic Apps

Azure Monitor can generate hundreds of alerts per day in a busy environment — many of them repetitive, low-priority, or duplicates of each other. Feeding those alerts into Claude via a Logic App workflow lets you automatically triage them, group related issues, and generate human-readable remediation suggestions before an engineer even looks at their phone.

Architecture Overview

The flow is straightforward: Azure Monitor fires an alert → Action Group calls a Logic App HTTP trigger → Logic App sends the alert payload to Claude via the Anthropic API → Claude returns a structured triage assessment → Logic App creates an enriched ticket in your ITSM and optionally sends a Teams message.

Setting Up the Logic App

Create a Consumption-tier Logic App with an HTTP trigger. Store your Anthropic API key in an Azure Key Vault and reference it via a managed identity so it never appears in the workflow definition.

# Create the Logic App and Key Vault secret via PowerShell
$rg = "rg-aiops"
$kvName = "kv-aiops"

New-AzLogicApp -ResourceGroupName $rg -Name "la-alert-triage" -Location "westeurope"

$secret = ConvertTo-SecureString $env:ANTHROPIC_API_KEY -AsPlainText -Force
Set-AzKeyVaultSecret -VaultName $kvName -Name "anthropic-key" -SecretValue $secret

The Triage Prompt

The quality of triage depends entirely on the system prompt. Give Claude the context it needs to make useful decisions:

system_prompt = """
You are an AIOps triage assistant for a Microsoft Azure environment.
When given an Azure Monitor alert, respond with JSON containing:
  severity: critical|high|medium|low
  likely_cause: one sentence explanation
  immediate_action: what to check first
  runbook: the most relevant runbook name from our library
  auto_resolvable: true if this commonly self-resolves within 10 minutes

Environment context:
- Production workloads run in westeurope and northeurope
- Business hours are 07:00-18:00 CET
- Critical = page on-call immediately; High = notify within 15 min
- Our runbook library: [DiskSpace-Cleanup, IIS-Restart, SQL-Failover, VM-Reboot]
"""

alert_message = f"""Alert Name: {alert["alertName"]}
Resource: {alert["resourceId"]}
Condition: {alert["condition"]["allOf"][0]["metricName"]} {alert["condition"]["allOf"][0]["operator"]} {alert["condition"]["allOf"][0]["threshold"]}
Fired At: {alert["firedDateTime"]}
Description: {alert.get("description", "N/A")}"""

Connecting to Your ITSM

Once Claude returns the structured JSON, the Logic App uses a switch action to route based on severity: Critical triggers a PagerDuty page, High creates a ServiceNow P2 incident with the triage notes pre-filled, and Medium/Low creates a ticket silently for morning review.

# Example: Parse Claude response and create ServiceNow incident
$triage = $claudeResponse | ConvertFrom-Json

if ($triage.severity -in @("critical","high")) {
    $incident = @{
        short_description = "$($alert.alertName) - $($triage.likely_cause)"
        description       = "Immediate action: $($triage.immediate_action)\nRunbook: $($triage.runbook)"
        urgency           = if ($triage.severity -eq "critical") { 1 } else { 2 }
        category          = "infrastructure"
    }
    Invoke-RestMethod -Uri $snowUrl -Method Post -Body ($incident|ConvertTo-Json) -Headers $snowHeaders
}

What This Saves in Practice

In a typical 200-VM environment this pattern reduces the number of alerts that require immediate human attention by 40-60%. The low-value noise gets silently ticketed; on-call engineers only get paged for events that genuinely need them. That is meaningful quality-of-life for whoever is carrying the pager on a Sunday night.

Summary

AI-powered alert triage is one of the highest-ROI applications of LLMs in IT operations. It requires almost no infrastructure change — just a Logic App between your existing alerting and ticketing systems — and starts delivering value the day you switch it on.

AIOps: Automatic Azure Alert Triage with Claude and Logic Apps

Architecture Overview

Setting Up the Logic App

The Triage Prompt

Connecting to Your ITSM

What This Saves in Practice

Summary

Submit a Comment Cancel reply

Search

Share this!

Articles

Topics