PgBouncer monitoring improvements in recent versions

Published in

okmeter.io blog

4 min readOct 15, 2018

As I wrote in my previous article “USE, RED and real world PgBouncer monitoring” there are some nice commands in PgBouncer’s admin interface that allow to collect stats how things going and spot problems, if you know where to look.

This post is about new stats added in these commands in new PgBouncer versions.

So as you know, SHOW STATS shows cumulative stats for each proxied DB:

Since PgBouncer version 1.8 there’s a couple of new columns in its output.

First one — total_xact_time — total number of microseconds spent by pgbouncer when connected to PostgreSQL in a transaction, either idle in transaction or executing queries.

This will allow us to chart db pool utilization in terms of time spent in transactions and compare it to the query time utilization:

We see two totally different situations — while database is used for serving queries only 5 to 25 % of the time, PgBouncer connections around 8:00 am spend up to 70% of time in transactions!

But this total_xact_time is useful in one more very important way.

There’s a known anti-pattern in Postgres usage, is when your application opens up a transaction, makes a query and then starts doing something else, for example some CPU-heavy calculation on that result or query to some other resource/service/database, while transaction keeps hangging. Later this app will probably return back to this transaction and might, for example, update something and commit it. The bad thing in that case is that there’s a corresponding Postgres backend process, that sits there doing nothing, while transaction is idling. And Postgres backends are somewhat expensive.

Your app should avoid such behavior.

This idle in transaction state might be monitored in the Postgres itself — there’s state column in pg_stat_activity system view. But pg_stat_activity provides us only with a snapshot of current states, that leads to possible false negative errors in reporting occurrences of such cases. Using PgBouncer's stats we can calculate a percentage of time that clients were performing some queries (total_query_time) from the total time spent by client while in transaction: total_xact_time. If we subtract that from 100% that will be idling percentage:

Moreover, there’s new two metrics in 1.8 version of PgBouncer that substitute original total_requests stat, that showed number of queries performed. With modern version of PgBouncer you’ll have total_query_count instead of total_requests, and additionally total_xact_count — that counts number of transactions.

So with that in hand, we can divide total_xact_time - total_query_time (total idling time) to the number of transactions —total_xact_count — this will give us how long on average each transaction is idling.

Furthermore with all that, we can characterize database workload in one more useful way: we can calculate average number of queries per transaction, by dividing the rate of queries by the rate of transactions. In okmeter monitoring you can do that as simple as this:

rate(
   metric(name="total_query_count", database="*")
) / rate(
   metric(name="total_xact_count)", database="*")
)

And here’s a corresponding chart:

We can see clearly when there were changes in workload profile.

Request Durations

As we saw in the previous article if you divide total_query_time by total_requests you’ll get average query duration. With newer PgBouncer versions these new stats you get —total_xact_time and total_wait_time can be charted in the same way — divided by the number of transactions and queries respectfully. This will produce a chart like this one:

This wait_time metric is way more handy in spotting pool saturation, than the one we discussed previous time, calculated from the number of waiting client in SHOW STATS output:

With all that and other detailed PgBouncer metrics and Postgres metrics you’ll be prepared to anything happening with your databases.

I hope you find this write up useful. I’ve tried to cover all the bases, if you feel that you have something to add — please, tell me, I’ll be glad to discuss.

We’re preparing next articles on Postgres and monitoring. So if you’re interested — follow our blog here, or at facebook or twitter to stay tuned!

Our monitoring service — okmeter.io will help you stay on-top of everything happening with you Postgresql, RDS and other infrastructure services.

PgBouncer monitoring improvements in recent versions

Request Durations

Written by pavel trukhanov