Idle postgres processes taking up lots of memory

mtalcott picture mtalcott · Dec 11, 2013 · Viewed 10.6k times · Source

I am trying to figure out why ~30 idle postgres processes take up so much process-specific memory after normal usage. I am using Postgres 9.3.1 and CentOS release 6.3 (Final). Using top, I can see that many of the postgres connections are using up to 300mb (average ~200mb) of non-shared (RES - SHR) memory:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3534 postgres  20   0 2330m 1.4g 1.1g S  0.0 20.4   1:06.99 postgres: deploy mtalcott 10.222.154.172(53495) idle
 9143 postgres  20   0 2221m 1.1g 983m S  0.0 16.9   0:14.75 postgres: deploy mtalcott 10.222.154.167(35811) idle
 6026 postgres  20   0 2341m 1.1g 864m S  0.0 16.4   0:46.56 postgres: deploy mtalcott 10.222.154.167(37110) idle
18538 postgres  20   0 2327m 1.1g 865m S  0.0 16.1   2:06.59 postgres: deploy mtalcott 10.222.154.172(47796) idle
 1575 postgres  20   0 2358m 1.1g 858m S  0.0 15.9   1:41.76 postgres: deploy mtalcott 10.222.154.172(52560) idle

There are about 29 total idle connections. These idle connections keep growing in memory until the machine starts using swap, then performance grinds to a halt. As expected, resetting the connection clears the process-specific memory. The same number of connections on the same machine only use 20% of memory (with 0 swap) when I periodically reconnect. What kind of information are these processes holding on to? I would expect long-running, idle postgres processes to have similar memory usage to brand new, idle ones.

Worth noting: I am heavily using schemas. On every request to my app, I am setting and resetting search_path.

Answer

mtalcott picture mtalcott · Dec 17, 2013

What kind of information are these processes holding on to? I would expect long-running, idle postgres processes to have similar memory usage to brand new, idle ones.

There are actually quite a few things that Postgres will cache in local memory once it has loaded them:

  • relcache (relation descriptors)
  • catcache (system catalog entries)
  • compiled trees for plpgsql functions

For most use cases, all of these add up to a negligible amount. The key here was heavy usage of schemas and the effect on the relcache. This database contains ~500 schemas, each with the same ~90 tables. To Postgres, even though the schemas are all the same, this works out to 45,000 tables (500*90).

Each request cached some of the tables' relation descriptors in memory (most often in a different schema than the request before it), gradually filling up the relcache. Unfortunately, Postgres does not offer a way to limit the size of these caches, as the overhead would probably be counterproductive for most use cases.

Possible solutions:

  • Reconnect after a certain number of requests
  • Add more memory
  • Connection pooling to put a ceiling on the number of postgres connections using pgpool-II or PgBouncer

Thanks to Tom Lane and Merlin Moncure for help with this over the Postgres mailing lists.