Create an APM anomaly rule
Get alerts when either the latency, throughput, or failed transaction rate of a service is abnormal.
Required role
The Editor role or higher is required to create anomaly rules. To learn more, refer to Assign user roles and privileges.
You can create an anomaly rule to alert you when either the latency, throughput, or failed transaction rate of a service is abnormal. Anomaly rules can be set at different levels: environment, service, and/or transaction type. Add actions to raise alerts via services or third-party integrations (for example, send an email or create a Jira issue).
Tip
These steps show how to use the Alerts UI. You can also create an anomaly rule directly from any page within Applications. Click the Alerts and rules button, and select Create anomaly rule. When you create a rule this way, the Name and Tags fields will be prepopulated but you can still change these.
To create your anomaly rule:
- In your Observability project, go to Alerts.
- Select Manage Rules from the Alerts page, and select Create rule.
- Enter a Name for your rule, and any optional Tags for more granular reporting (leave blank if unsure).
- Select the APM Anomaly rule type.
- Select the appropriate Service, Type, and Environment (or leave ALL to include all options).
- Select the desired severity (critical, major, minor, warning) from Has anomaly with severity.
- Define the interval to check the rule (for example, check every 1 minute).
- (Optional) Set up Actions.
- Save your rule.
Add actions
You can extend your rules with actions that interact with third-party systems, write to logs or indices, or send user notifications. You can add an action to a rule at any time. You can create rules without adding actions, and you can also define multiple actions for a single rule.
To add actions to rules, you must first create a connector for that service (for example, an email or external incident management system), which you can then use for different rules, each with their own action frequency.
Connectors provide a central place to store connection information for services and integrations with third party systems. The following connectors are available when defining actions for alerting rules:
- D3 Security
- IBM Resilient
- Index
- Jira
- Microsoft Teams
- Opsgenie
- PagerDuty
- Server log
- ServiceNow ITOM
- ServiceNow ITSM
- ServiceNow SecOps
- Slack
- Swimlane
- Torq
- Webhook
- xMatters
Note
Some connector types are paid commercial features, while others are free. For a comparison of the Elastic subscription levels, go to the subscription page.
For more information on creating connectors, refer to Connectors.
After you select a connector, you must set the action frequency. You can choose to create a Summary of alerts on each check interval or on a custom interval. For example, you can send email notifications that summarize the new, ongoing, and recovered alerts every twelve hours.
Alternatively, you can set the action frequency to For each alert and specify the conditions each alert must meet for the action to run. For example, you can send an email only when the alert status changes to critical.
With the Run when menu you can choose if an action runs when the threshold for an alert is reached, or when the alert is recovered. For example, you can add a corresponding action for each state to ensure you are alerted when the rule is triggered and also when it recovers.
Use the default notification message or customize it. You can add more context to the message by clicking the Add variable icon and selecting from a list of available variables.
The following variables are specific to this rule type. You can also specify variables common to all rules.
context.alertDetailsUrl
Link to the alert troubleshooting view for further context and details. This will be an empty string if the
server.publicBaseUrl
is not configured.context.environment
The transaction type the alert is created for.
context.reason
A concise description of the reason for the alert.
context.serviceName
The service the alert is created for.
context.threshold
Any trigger value above this value will cause the alert to fire.
context.transactionType
The transaction type the alert is created for.
context.triggerValue
The value that breached the threshold and triggered the alert.
context.viewInAppUrl
Link to the alert source.