Anomaly Detection
Anomaly detection monitors your data volumes for unexpected changes. Rather than requiring you to set fixed thresholds for every table and connector, anomaly detection learns normal patterns from historical data and alerts you when volumes deviate significantly from what is expected.
How it works
Rime maintains a rolling 30-day average of data volume for each monitored resource. On every evaluation cycle, the current value is compared against this average. If the deviation exceeds the configured threshold (50% by default), an anomaly alert is generated.
The calculation:
- Rime records the row count for each connector run and the data volume for each table load.
- The rolling average is computed over the previous 30 days of successful runs for the same resource.
- On each evaluation, the most recent value is compared to the rolling average.
- If |current - average| / average > threshold, an anomaly alert fires.
Both over-volume and under-volume anomalies are detected. A sudden spike (e.g., a table that normally loads 10,000 rows suddenly loads 50,000) and a sudden drop (e.g., a connector that usually extracts 5,000 rows returns 200) both trigger alerts.
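The deviation check above can be sketched in a few lines of Python. This is an illustrative sketch, not Rime's actual implementation; the helper name and the 0.5 default are assumptions mirroring the documented 50% threshold.

```python
def is_anomalous(current: float, average: float, threshold: float = 0.5) -> bool:
    """Return True when |current - average| / average exceeds the threshold."""
    if average == 0:
        return False  # no meaningful baseline to compare against
    return abs(current - average) / average > threshold

# A table that normally loads 10,000 rows suddenly loads 50,000 (spike):
is_anomalous(50_000, 10_000)  # deviation = 400% -> True

# A connector that usually extracts 5,000 rows returns 200 (drop):
is_anomalous(200, 5_000)      # deviation = 96% -> True
```

Because the deviation uses an absolute value, the same check catches both spikes and drops.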
What is monitored
Anomaly detection tracks two categories of metrics:
Row counts per connector
Each connector run records the number of rows extracted per table. The anomaly detector tracks the total row count across all tables for each connector. This catches problems like:
- A source system purging data, causing a dramatic drop in extracted rows
- A misconfigured filter returning far more data than expected
- A connector pulling duplicate data due to a cursor reset
Data volume per table
Each table load into Snowflake records the row count and file size. The anomaly detector tracks per-table volumes. This catches problems like:
- A source table growing unexpectedly due to a data generation bug
- A table receiving no new data when it normally receives steady inserts
- Schema changes causing rows to be split or merged
Seasonal trend awareness
Simple rolling averages can generate false positives for data that follows predictable patterns. For example, an e-commerce company might see 3x more orders on weekdays than weekends, or a finance team might see a spike on the last business day of each month.
Rime’s anomaly detection accounts for two common seasonal patterns:
Day-of-week patterns
The rolling average is calculated separately for each day of the week: Monday’s volume is compared to the average of previous Mondays, Tuesday’s to previous Tuesdays, and so on. This prevents a normal Monday spike from triggering an alert that would otherwise fire if the value were compared against a weekend baseline.
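A per-weekday baseline can be sketched as follows. The function name and the history shape (a date-to-volume mapping) are assumptions made for the example.

```python
from datetime import date
from statistics import mean

def weekday_baseline(history: dict[date, int], target: date) -> float:
    """Average of historical volumes that fall on the same weekday as target."""
    same_day = [v for d, v in history.items() if d.weekday() == target.weekday()]
    return mean(same_day)

history = {
    date(2024, 6, 3): 30_000,   # Monday
    date(2024, 6, 4): 10_000,   # Tuesday
    date(2024, 6, 10): 32_000,  # Monday
    date(2024, 6, 11): 11_000,  # Tuesday
}

# A Monday is compared only against previous Mondays:
weekday_baseline(history, date(2024, 6, 17))  # -> 31000.0
```

Comparing 30,000 Monday rows against the 31,000 Monday baseline yields a small deviation, whereas comparing it against the blended weekly average would look like a large spike.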
Month-end patterns
If the system detects that volumes on the last three business days of each month are consistently higher than mid-month volumes, it adjusts the baseline for those days. This is common in financial data, payroll processing, and reporting workloads.
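Identifying the last three business days of a month, so their baseline can be adjusted separately, might look like the sketch below. The helper is illustrative only.

```python
from datetime import date, timedelta

def last_three_business_days(year: int, month: int) -> list[date]:
    """Return the last three weekdays (Mon-Fri) of the given month, latest first."""
    # Start from the last calendar day of the month and walk backwards.
    next_month = date(year + (month == 12), month % 12 + 1, 1)
    d = next_month - timedelta(days=1)
    days = []
    while len(days) < 3:
        if d.weekday() < 5:  # Monday=0 .. Friday=4
            days.append(d)
        d -= timedelta(days=1)
    return days

# June 30, 2024 falls on a Sunday, so the month-end window is the 28th, 27th, 26th:
last_three_business_days(2024, 6)
```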
Seasonal adjustment requires at least 30 days of history. Until that history accumulates, anomaly detection uses a simple rolling average without seasonal adjustments.
Configuring sensitivity
The default deviation threshold is 50%, meaning an anomaly alert fires when the current value is more than 50% above or below the rolling average. You can adjust this per resource:
- Lower threshold (e.g., 20%) — more sensitive, catches smaller deviations. Useful for critical data sources where even modest changes warrant investigation.
- Higher threshold (e.g., 100%) — less sensitive, only catches dramatic swings. Useful for data sources with naturally high variance.
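To make the sensitivity trade-off concrete, the same deviation can fire or stay silent depending on the configured threshold. The values below are made up for illustration.

```python
def deviation(current: float, average: float) -> float:
    """Fractional deviation of the current value from the rolling average."""
    return abs(current - average) / average

dev = deviation(13_000, 10_000)  # a 30% deviation

dev > 0.20  # True  -> an alert fires at the sensitive 20% threshold
dev > 0.50  # False -> no alert at the default 50% threshold
dev > 1.00  # False -> no alert at the relaxed 100% threshold
```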
To change the threshold, navigate to the resource’s detail page (connector or table) and find the Anomaly Detection section. Set the threshold as a percentage.
You can also disable anomaly detection for specific resources if they are inherently unpredictable and the alerts are not useful.
Anomaly alerts
When an anomaly is detected, an alert is created with:
- Severity: Warning (by default). You can change this in the anomaly detection settings.
- Rule name: “Volume anomaly: {resource name}”
- Details: The expected value (rolling average), the actual value, and the percentage deviation.
Anomaly alerts follow the same lifecycle as all other alerts. They appear in the monitoring dashboard, are sent to notification channels, and flow through escalation policies if configured.
Anomaly alerts auto-resolve when the next evaluation cycle shows the volume is back within the acceptable range.
Investigation
When you receive an anomaly alert, the alert detail page provides context to help you understand what happened:
Volume history chart
A time-series chart showing the resource’s volume over the past 30 days, with the rolling average overlaid. The anomalous data point is highlighted, making it easy to see whether this is a one-time spike or part of a trend.
Drill-down to tables
For connector-level anomalies, you can drill down to see which individual tables contributed to the volume change. The table breakdown shows each table’s expected vs. actual row count, sorted by the magnitude of deviation.
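The breakdown ordering can be sketched as a sort by relative deviation. Table names and numbers here are invented for the example.

```python
tables = [
    {"table": "orders",    "expected": 10_000, "actual": 9_800},
    {"table": "customers", "expected": 2_000,  "actual": 5_000},
    {"table": "events",    "expected": 50_000, "actual": 20_000},
]

# Rank tables by the magnitude of their relative deviation, largest first.
ranked = sorted(
    tables,
    key=lambda t: abs(t["actual"] - t["expected"]) / t["expected"],
    reverse=True,
)

[t["table"] for t in ranked]  # -> ['customers', 'events', 'orders']
```

Sorting by relative rather than absolute deviation keeps a small table with a 150% swing above a large table with a modest one.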
Correlated alerts
The investigation view shows other alerts that fired around the same time. If a volume anomaly coincides with a connector error or a pipeline failure, the correlation can point to the root cause.
Run history
A link to the connector’s run history or the table’s load history lets you compare the anomalous run with previous successful runs.
Evaluation schedule
Anomaly detection runs on a separate schedule from standard alert rules. The anomaly evaluation cycle runs once per hour by default. This is less frequent than the 60-second alert evaluation cycle because anomaly detection operates on aggregate daily volumes rather than real-time metrics.
The hourly cycle is not configurable in the current release.
Data requirements
Anomaly detection requires a minimum amount of historical data before it can calculate meaningful baselines:
| Requirement | Minimum |
|---|---|
| Days of history | 7 days (30 days for seasonal adjustments) |
| Successful runs | At least 5 data points in the rolling window |
Resources that do not meet these minimums are excluded from anomaly detection. They appear in the anomaly detection settings with a “Collecting baseline” status.
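The two minimums in the table combine into a simple eligibility check, sketched below. The helper name and inputs are assumptions for the example.

```python
from datetime import date, timedelta

def has_baseline(run_dates: list[date], today: date) -> bool:
    """True when a resource meets both minimums: at least 7 days of history
    and at least 5 data points inside the 30-day rolling window."""
    if not run_dates:
        return False
    in_window = [d for d in run_dates if (today - d).days <= 30]
    history_days = (today - min(run_dates)).days
    return history_days >= 7 and len(in_window) >= 5

today = date(2024, 6, 30)
week_of_runs = [date(2024, 6, 22) + timedelta(days=i) for i in range(6)]

has_baseline(week_of_runs, today)      # True: 8 days of history, 6 data points
has_baseline(week_of_runs[:3], today)  # False: only 3 data points in the window
```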
Next steps
- Review your anomaly alerts on the monitoring dashboard
- Adjust thresholds for high-variance resources that generate false positives
- Combine anomaly detection with alert rules for comprehensive monitoring coverage