Troubleshooting

Problem, cause, fix. In that order.

Connection exhaustion

Symptom

Applications get "connection refused" or time out after 30 seconds waiting for a connection.

Cause

All backend connections are in use. The pool is full and the blocking_timeout (30s default) expired before a connection became available.

Fix

  • Check for long-running transactions holding connections open
  • Check for connection leaks in application code (connections not returned to the pool)
  • Increase MAX_CONNECTIONS only if the above are ruled out
  • Verify PostgreSQL's own max_connections is not the bottleneck

Increasing pool size without fixing the root cause just delays the problem. If the pool fills up consistently, the issue is almost always in application code.
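The first two checks can start with a look at pg_stat_activity on the backend. A sketch, assuming direct psql access to PostgreSQL (not through the pooler); the 5-minute threshold is an arbitrary example, and the environment variables are placeholders for your backend address:

```shell
# List transactions that have been open longer than 5 minutes.
# Sessions stuck in "idle in transaction" are the usual signature of a connection leak.
psql -h "$PG_BACKEND_HOST" -p "$PG_BACKEND_PORT" -U postgres -c "
SELECT pid, usename, state,
       now() - xact_start AS xact_age,
       left(query, 60)    AS last_query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start > interval '5 minutes'
ORDER BY xact_age DESC;"
```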

High latency despite pooler

Symptom

Queries through the pooler are slower than expected. Adding the pooler made things worse, not better.

Cause

Several possibilities, in order of likelihood:

  • The queries themselves are slow (a pooler does not fix slow queries)
  • All pool connections are busy and clients are waiting for one to free up
  • The network hop between application and pooler is adding latency
  • The pool size is too small for the concurrency level

Fix

  • Run the same query directly against PostgreSQL to isolate pooler overhead from query time
  • Check if connections are waiting (application-side connection acquire time)
  • Deploy the pooler as close to the application as possible (same node, same VPC)
  • Review MAX_CONNECTIONS relative to actual concurrent query load
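The first check can be as simple as timing the same trivial query over both paths. A rough sketch; the host names, port numbers, and credentials are placeholders to adjust:

```shell
# Direct to PostgreSQL (placeholder host/port)
time psql -h db.internal -p 5432 -U app -d appdb -c 'SELECT 1' >/dev/null

# Through the pooler (placeholder host/port)
time psql -h pooler.internal -p 6432 -U app -d appdb -c 'SELECT 1' >/dev/null
```

If both paths are slow, the problem is the query or the network, not the pooler. If only the pooled path is slow, look at pool saturation and acquire-time waits.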

Incorrect connection sizing

Symptom

PostgreSQL reports "too many connections" even though the pooler is running. Or: the pool has hundreds of idle connections that are never used.

Cause

MAX_CONNECTIONS in the pooler exceeds PostgreSQL's own max_connections. Or: multiple pooler replicas each open their full pool size, and the total exceeds the backend limit.

Fix

  • Calculate: replicas × MAX_CONNECTIONS ≤ PostgreSQL max_connections
  • Leave headroom for admin connections and monitoring tools
  • Start with 25–50 per replica and increase only when connections are consistently waiting

Example: 2 replicas × 50 = 100 backend connections. If PostgreSQL has max_connections = 100, there is no room for anything else. Set it to max_connections = 120 or reduce the pool size.
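The sizing rule can be encoded as a quick pre-deploy check. A sketch using the numbers from the example above; the replica count, pool size, and headroom values are assumptions to adjust for your deployment:

```shell
replicas=2
pool_size=50              # MAX_CONNECTIONS per pooler replica
pg_max_connections=120    # PostgreSQL's own max_connections
headroom=10               # reserved for admin sessions and monitoring

# Total backend connections all pooler replicas can open at once
total=$((replicas * pool_size))

if [ "$total" -le $((pg_max_connections - headroom)) ]; then
  echo "OK: $total pooled connections fit under $pg_max_connections with $headroom headroom"
else
  echo "OVERCOMMITTED: $total pooled connections vs max_connections=$pg_max_connections"
fi
```

With these numbers, 2 × 50 = 100 fits under 120 − 10 = 110, so the check passes.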

Misconfigured authentication

Symptom

Clients get "authentication failed" errors even though credentials work when connecting directly to PostgreSQL.

Cause

  • PG_USERNAME / PG_PASSWORD were set but do not match the PostgreSQL credentials
  • PostgreSQL requires scram-sha-256 but the pooler or client is attempting md5
  • PostgreSQL's pg_hba.conf rejects connections from the pooler's IP address

Fix

  • Use passthrough mode (no PG_USERNAME / PG_PASSWORD) unless you specifically need registered users
  • Verify PostgreSQL pg_hba.conf allows connections from the pooler container or pod IP range
  • Check that both client and server agree on the auth method
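For the pg_hba.conf check, the entry for the pooler looks something like the following. The CIDR is an example; substitute the actual network range your pooler containers or pods connect from:

```
# TYPE  DATABASE  USER  ADDRESS        METHOD
host    all       all   10.42.0.0/16   scram-sha-256
```

Reload PostgreSQL after editing (SELECT pg_reload_conf(); or a service reload) for the change to take effect.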

Container and network issues

Symptom

The container starts but clients cannot connect, or the container keeps restarting.

Common causes and fixes

  • Container starts, clients get "connection refused": port 6432 is not published. Check -p 6432:6432 or the Kubernetes Service.
  • Container restarts repeatedly: check the logs with docker logs pgagroal. Usually a config error or OOM; check resource limits.
  • Pooler starts but cannot reach PostgreSQL: verify PG_BACKEND_HOST is reachable from the container. Check DNS, network policies, security groups.
  • Connections work, then stop after a backend restart: pgagroal recovers automatically (typically within 60s). If it does not, check that the backend is fully ready before clients retry.
  • Health check fails but the container seems fine: the ping command needs access to the Unix socket in /tmp. If the filesystem is misconfigured (e.g., a missing emptyDir), the socket cannot be created.
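For the health-check case, a Kubernetes pod can provide a writable /tmp with an emptyDir volume. A minimal pod-spec fragment, assuming a container named pgagroal; names are illustrative:

```yaml
# Pod spec fragment: give pgagroal a writable /tmp for its Unix socket
volumes:
  - name: tmp
    emptyDir: {}
containers:
  - name: pgagroal
    volumeMounts:
      - name: tmp
        mountPath: /tmp
```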

Diagnostic checklist

When something is wrong, run through this in order:

  1. Is the container running? docker ps or kubectl get pods
  2. Is the daemon healthy? docker exec pgagroal pgagroal-cli -c /etc/pgagroal/pgagroal.conf ping
  3. What do the logs say? docker logs pgagroal
  4. Can the container reach PostgreSQL? docker exec pgagroal pg_isready -h $PG_BACKEND_HOST -p $PG_BACKEND_PORT
  5. Can a client connect through the pooler? psql -h localhost -p 6432 -U user -d db -c 'SELECT 1'

If steps 1–3 pass but step 4 fails, the problem is between the pooler and PostgreSQL (network, DNS, firewall). If step 4 passes but step 5 fails, the problem is between the client and the pooler (port not published, auth mismatch).

See also: Configuration for pool sizing and timeouts, or Observability for monitoring pool health.
