Monitoring Dashboard
The monitoring dashboard is the central place to assess the health of your data platform. It combines pipeline status, connector health, infrastructure state, and active alerts into a single view so you can identify and respond to problems without switching between pages.
What the dashboard shows
The dashboard is organized into summary panels, each covering a different area of your stack:
- Active alerts — the count of unacknowledged alerts, broken down by severity (critical, warning, info). Clicking a severity badge takes you to the filtered alert list.
- Pipeline status — a summary of recent pipeline runs: succeeded, failed, running, and queued. Failed runs are highlighted and link to the run detail page.
- Connector health — the status of each connector in the current project. Connectors in an errored state are listed with the timestamp and reason for the most recent failure.
- Infrastructure state — the reconciliation status of managed resources. Resources with drift (where actual state differs from desired state) are flagged.
- Data volume trends — a time-series chart showing total rows ingested per day, overlaid with the rolling 30-day average used by anomaly detection.
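The rolling 30-day average shown on the volume chart is a simple windowed mean. A minimal sketch (illustrative only; the platform's actual anomaly-detection logic is not documented here):

```python
from collections import deque

def rolling_average(daily_rows, window=30):
    """Rolling mean of daily ingested row counts.

    Days before a full window is available use the average of the days
    seen so far, which is how a chart typically renders the first weeks.
    """
    buf = deque(maxlen=window)  # keeps only the most recent `window` days
    averages = []
    for count in daily_rows:
        buf.append(count)
        averages.append(sum(buf) / len(buf))
    return averages

# e.g. with a 2-day window: [100, 200, 300] -> [100.0, 150.0, 250.0]
```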
Key metrics
The top of the dashboard displays four headline numbers:
| Metric | What it measures |
|---|---|
| Active alerts | Alerts that have not been acknowledged or resolved |
| Failed runs (24h) | Pipeline and connector runs that failed in the last 24 hours |
| Data volume (24h) | Total rows ingested across all connectors in the last 24 hours |
| Drift detected | Infrastructure resources where actual state does not match desired state |
These metrics refresh automatically every 60 seconds, matching the alert evaluation frequency.
Time range selection
The dashboard defaults to showing data from the last 24 hours. You can select a different time range from the dropdown in the top-right corner:
- Last 1 hour
- Last 6 hours
- Last 24 hours (default)
- Last 7 days
- Last 30 days
- Custom range
Changing the time range updates all panels and charts on the dashboard. The selected range persists for the rest of your session; when you open the dashboard in a new session, it resets to the 24-hour default.
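Conceptually, each dropdown option maps to a (start, end) window ending at the current time, while a custom range supplies both endpoints explicitly. A sketch of that mapping (names are illustrative, not a platform API):

```python
from datetime import datetime, timedelta, timezone

# Preset options from the dropdown, as offsets back from "now".
PRESETS = {
    "1h": timedelta(hours=1),
    "6h": timedelta(hours=6),
    "24h": timedelta(hours=24),  # dashboard default
    "7d": timedelta(days=7),
    "30d": timedelta(days=30),
}

def resolve_range(selection, now=None, custom=None):
    """Map a dropdown selection to a (start, end) window.

    `custom` is a (start, end) tuple used only when selection == "custom".
    """
    now = now or datetime.now(timezone.utc)
    if selection == "custom":
        return custom
    return (now - PRESETS[selection], now)
```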
Filtering
You can filter the dashboard to focus on specific areas:
By resource type
Select one or more resource types to show only alerts and status information related to those resources:
- Snowpipe
- dbt job
- Connector run
- Warehouse
- Pipeline
By severity
Filter alerts by severity level: critical, warning, or info. This applies to both the alert count and the alert list panel.
By status
Filter alerts by their current status:
- Firing — the alert condition is currently met and the alert has not been resolved
- Acknowledged — someone has marked the alert as seen, pausing escalation
- Resolved — the alert condition is no longer met, or someone has manually resolved it
Filters can be combined. For example, you can view only critical Snowpipe alerts that are currently firing.
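Combined filters behave as a logical AND across the three dimensions. A minimal sketch with hypothetical field names, reproducing the example above (critical Snowpipe alerts that are currently firing):

```python
def filter_alerts(alerts, resource_types=None, severities=None, statuses=None):
    """Apply the three dashboard filters; None means 'no filter' for that
    dimension. Field names ("resource_type", ...) are illustrative."""
    def keep(a):
        return (
            (resource_types is None or a["resource_type"] in resource_types)
            and (severities is None or a["severity"] in severities)
            and (statuses is None or a["status"] in statuses)
        )
    return [a for a in alerts if keep(a)]

# Only critical Snowpipe alerts that are currently firing:
# filter_alerts(alerts, resource_types={"Snowpipe"},
#               severities={"critical"}, statuses={"firing"})
```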
Quick actions
The dashboard supports quick actions so you can respond to issues without navigating away:
Acknowledge an alert
Click the acknowledge button on any alert row to mark it as seen. This pauses the escalation policy for that alert. You can add an optional note explaining what you are doing about it.
Retry a failed run
For failed pipeline or connector runs, click the retry button to re-execute the run with the same configuration. The retry appears as a new run in the history and does not overwrite the failed run’s log.
Resolve an alert
Manually resolve an alert if you have addressed the underlying issue and do not want to wait for the next evaluation cycle to clear it automatically.
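The two alert actions amount to status transitions: acknowledging moves a firing alert to acknowledged and pauses escalation, while resolving closes it outright. A sketch of that state model (an assumption about the lifecycle, not the platform's implementation):

```python
def acknowledge(alert, note=None):
    """Mark a firing alert as seen, pausing its escalation policy.
    The optional note records what is being done about it."""
    if alert["status"] == "firing":
        alert["status"] = "acknowledged"
        alert["escalation_paused"] = True
        if note:
            alert["note"] = note
    return alert

def resolve(alert):
    """Manually resolve an alert without waiting for the next
    evaluation cycle to clear it."""
    alert["status"] = "resolved"
    return alert
```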
Dashboard and DuckDB
The monitoring dashboard is backed by Rime’s embedded DuckDB analytics engine. Metrics are collected from Snowflake, dbt, S3, and infrastructure systems by background pollers, then stored in DuckDB for fast aggregation and time-series queries. This means the dashboard loads quickly even when querying across weeks of historical data.
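The shape of the queries the dashboard runs, polled metrics written to a local table and then aggregated per day, can be sketched as follows. The stdlib's sqlite3 stands in for DuckDB here so the example runs anywhere; the table name and columns are hypothetical:

```python
import sqlite3  # stand-in for the embedded DuckDB store in this sketch

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE ingest_metrics (day TEXT, connector TEXT, rows_ingested INTEGER)"
)
# Background pollers would append rows like these from Snowflake, dbt, etc.
con.executemany(
    "INSERT INTO ingest_metrics VALUES (?, ?, ?)",
    [
        ("2024-01-01", "orders", 120),
        ("2024-01-01", "events", 300),
        ("2024-01-02", "orders", 90),
    ],
)
# The kind of time-series aggregation behind the volume panel:
daily = con.execute(
    "SELECT day, SUM(rows_ingested) FROM ingest_metrics "
    "GROUP BY day ORDER BY day"
).fetchall()
```

Because the aggregation runs against a local analytics store rather than the upstream systems, the dashboard stays fast even over weeks of history.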
Permissions
All users in a project can view the monitoring dashboard. Acknowledging and resolving alerts requires at least the Editor role. Retrying failed runs requires the same permissions as running the pipeline or connector that failed.
Next steps
- Set up alert rules to define what conditions trigger alerts
- Configure notification channels to receive alerts outside the dashboard
- Review anomaly detection for automatic volume monitoring