Observability
What to monitor, how to check health, and what the logs tell you.
Health check
The container runs a built-in health check every 10 seconds:
```shell
pgagroal-cli -c /etc/pgagroal/pgagroal.conf ping
```
Exit code 0 means the pooler daemon is running and accepting management commands. This is what Docker and Kubernetes use to determine container health.
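If you are wiring this check up yourself rather than relying on the image's built-in check, a Kubernetes liveness probe running the same command might look like the following sketch (probe timings and thresholds are illustrative, not prescriptive):

```yaml
livenessProbe:
  exec:
    command:
      - pgagroal-cli
      - -c
      - /etc/pgagroal/pgagroal.conf
      - ping
  periodSeconds: 10      # matches the built-in 10-second interval
  timeoutSeconds: 3
  failureThreshold: 3    # restart only after repeated failures
```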
What the health check does not tell you
The ping command checks the pgagroal process, not the PostgreSQL backend. A healthy pooler with an unreachable backend will still pass the health check. This is by design — the pooler should stay running so it can recover when the backend returns. To verify end-to-end connectivity, run an actual query through the pooler.
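One way to script that end-to-end check is sketched below. The host, port, user, and database are assumptions for illustration (pgagroal listens on 2345 by default); `PSQL` is overridable so the sketch can be exercised without a live pooler:

```shell
# Sketch: verify end-to-end connectivity by running a real query
# through the pooler, not just pinging the daemon.
# Connection parameters below are hypothetical; adjust for your setup.
PSQL="${PSQL:-psql}"

check_pool() {
  # Exit 0 only if a query actually round-trips through pgagroal
  # to the PostgreSQL backend.
  "$PSQL" "host=pgagroal port=2345 user=app dbname=appdb connect_timeout=3" \
    -tAc "SELECT 1" >/dev/null
}

if check_pool; then
  echo "end-to-end OK"
else
  echo "query through pooler failed" >&2
fi
```

Unlike `ping`, this fails when the backend is unreachable, so it is the right probe for readiness-style checks even though it is too heavy for a liveness check.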
Logging
The container logs to stdout (console mode). All output goes to the container's log stream, which Docker and Kubernetes collect automatically.
```shell
# View logs
docker logs pgagroal

# Follow logs
docker logs -f pgagroal

# Kubernetes
kubectl logs -f deployment/pgagroal -n pgagroal
```
Log levels
| Level | When to use |
|---|---|
| fatal | Only unrecoverable errors. Very quiet. |
| error | Errors that affect connections but the daemon continues. |
| warn | Recommended for production. Catches problems without noise. |
| info | Default. Includes startup messages and connection events. |
| debug1–debug5 | Increasing verbosity. Use temporarily for troubleshooting. |
| trace | Wire-level detail. Extremely verbose. Never in production. |
Set via PGAGROAL_LOG_LEVEL=warn. Changing the level requires a container restart; clients are briefly disconnected but reconnect through the pool.
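In a Compose file, for instance, the level can be set through the environment (the image name and service layout here are assumptions):

```yaml
services:
  pgagroal:
    image: pgagroal:latest   # assumed image name
    environment:
      PGAGROAL_LOG_LEVEL: warn
```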
What to monitor
Even without a metrics endpoint, you can assess pool health by watching a few key indicators:
| Signal | Healthy | Problem |
|---|---|---|
| Connection errors in app logs | None | "connection refused" or timeouts suggest pool exhaustion |
| Client connection latency | < 1 ms to acquire | Consistently > 5 ms means clients are waiting for a free connection |
| Backend connection count | Well below MAX_CONNECTIONS | At or near limit means pool is saturated |
| Container restarts | Zero | Frequent restarts indicate OOM or liveness probe failures |
| pgagroal-cli ping | Exit code 0 | Non-zero means daemon is down or unresponsive |
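The ping row in the table above can be turned into a small cron-able check. The CLI path and config location are the container defaults from earlier in this page; `PING_CMD` is overridable so the sketch can be exercised without a running pooler:

```shell
# Sketch: report pool liveness based on pgagroal-cli's exit code.
PING_CMD="${PING_CMD:-pgagroal-cli}"
CONF="${CONF:-/etc/pgagroal/pgagroal.conf}"

pool_alive() {
  # Exit 0 if the daemon answers management commands.
  "$PING_CMD" -c "$CONF" ping >/dev/null 2>&1
}

if pool_alive; then
  echo "pgagroal: alive"
else
  echo "pgagroal: DOWN (ping failed)" >&2
fi
```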
Prometheus metrics
pgagroal can expose a Prometheus-compatible metrics endpoint. In the Helm chart, enable it with metrics.enabled: true and metrics.port: 9187.
```yaml
# Prometheus scrape config
scrape_configs:
  - job_name: pgagroal
    static_configs:
      - targets: ["pgagroal.pgagroal.svc.cluster.local:9187"]
    scrape_interval: 15s
```
The metrics endpoint is disabled by default. Enable it only when you have a Prometheus instance ready to scrape it.
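Once the endpoint is scraped, a minimal alerting rule on Prometheus's standard `up` series catches a dead or unreachable pooler without depending on any pgagroal-specific metric names (label values are a sketch, matching the scrape config above):

```yaml
groups:
  - name: pgagroal
    rules:
      - alert: PgagroalScrapeDown
        expr: up{job="pgagroal"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "pgagroal metrics endpoint is unreachable"
```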
Healthy vs unhealthy
Healthy pool
- pgagroal-cli ping returns 0
- Connections acquire in < 1 ms
- Backend connection count is stable and below limit
- No authentication errors in logs
- Zero container restarts
Unhealthy pool
- Applications report "connection refused" or timeouts
- Backend connection count is at MAX_CONNECTIONS
- Logs show repeated connection errors or auth failures
- Container is restarting (check liveness probe logs)
- Queries succeed but with unusually high latency
See also: Troubleshooting for diagnosing unhealthy pools.