We're using Postgresql 9.1.4
as our db server. I've been trying to speed up my test suite so I've stared profiling the db a bit to see exactly what's going on. We are using database_cleaner to truncate tables at the end of tests. YES I know transactions are faster, I can't use them in certain circumstances so I'm not concerned with that.
What I AM concerned with, is why TRUNCATION takes so long (longer than using DELETE) and why it takes EVEN LONGER on my CI server.
Right now, locally (on a Macbook Air) a full test suite takes 28 minutes. Tailing the logs, each time we truncate tables... ie:
TRUNCATE TABLE table1, table2 -- ... etc
it takes over 1 second to perform the truncation. Tailing the logs on our CI server (Ubuntu 10.04 LTS), take takes a full 8 seconds to truncate the tables and a build takes 84 minutes.
When I switched over to the :deletion
strategy, my local build took 20 minutes and the CI server went down to 44 minutes. This is a significant difference and I'm really blown away as to why this might be. I've tuned the DB on the CI server, it has 16gb system ram, 4gb shared_buffers... and an SSD. All the good stuff. How is it possible:
a. that it's SO much slower than my Macbook Air with 2gb of ram
b. that TRUNCATION is so much slower than DELETE when the postgresql docs state explicitly that it should be much faster.
Any thoughts?
This has come up a few times recently, both on SO and on the PostgreSQL mailing lists.
The TL;DR for your last two points:
(a) The bigger shared_buffers may be why TRUNCATE is slower on the CI server. Different fsync configuration or the use of rotational media instead of SSDs could also be at fault.
(b) TRUNCATE
has a fixed cost, but not necessarily slower than DELETE
, plus it does more work. See the detailed explanation that follows.
UPDATE: A significant discussion on pgsql-performance arose from this post. See this thread.
UPDATE 2: Improvements have been added to 9.2beta3 that should help with this, see this post.
Detailed explanation of TRUNCATE
vs DELETE FROM
:
While not an expert on the topic, my understanding is that TRUNCATE
has a nearly fixed cost per table, while DELETE
is at least O(n) for n rows; worse if there are any foreign keys referencing the table being deleted.
I always assumed that the fixed cost of a TRUNCATE
was lower than the cost of a DELETE
on a near-empty table, but this isn't true at all.
TRUNCATE table;
does more than DELETE FROM table;
The state of the database after a TRUNCATE table
is much the same as if you'd instead run:
DELETE FROM table;
VACCUUM (FULL, ANALYZE) table;
(9.0+ only, see footnote)... though of course TRUNCATE
doesn't actually achieve its effects with a DELETE
and a VACUUM
.
The point is that DELETE
and TRUNCATE
do different things, so you're not just comparing two commands with identical outcomes.
A DELETE FROM table;
allows dead rows and bloat to remain, allows the indexes to carry dead entries, doesn't update the table statistics used by the query planner, etc.
A TRUNCATE
gives you a completely new table and indexes as if they were just CREATE
ed. It's like you deleted all the records, reindexed the table and did a VACUUM FULL
.
If you don't care if there's crud left in the table because you're about to go and fill it up again, you may be better off using DELETE FROM table;
.
Because you aren't running VACUUM
you will find that dead rows and index entries accumulate as bloat that must be scanned then ignored; this slows all your queries down. If your tests don't actually create and delete all that much data you may not notice or care, and you can always do a VACUUM
or two part-way through your test run if you do. Better, let aggressive autovacuum settings ensure that autovacuum does it for you in the background.
You can still TRUNCATE
all your tables after the whole test suite runs to make sure no effects build up across many runs. On 9.0 and newer, VACUUM (FULL, ANALYZE);
globally on the table is at least as good if not better, and it's a whole lot easier.
IIRC Pg has a few optimisations that mean it might notice when your transaction is the only one that can see the table and immediately mark the blocks as free anyway. In testing, when I've wanted to create bloat I've had to have more than one concurrent connection to do it. I wouldn't rely on this, though.
DELETE FROM table;
is very cheap for small tables with no f/k refs
To DELETE
all records from a table with no foreign key references to it, all Pg has to do a sequential table scan and set the xmax
of the tuples encountered. This is a very cheap operation - basically a linear read and a semi-linear write. AFAIK it doesn't have to touch the indexes; they continue to point to the dead tuples until they're cleaned up by a later VACUUM
that also marks blocks in the table containing only dead tuples as free.
DELETE
only gets expensive if there are lots of records, if there are lots of foreign key references that must be checked, or if you count the subsequent VACUUM (FULL, ANALYZE) table;
needed to match TRUNCATE
's effects within the cost of your DELETE
.
In my tests here, a DELETE FROM table;
was typically 4x faster than TRUNCATE
at 0.5ms vs 2ms. That's a test DB on an SSD, running with fsync=off
because I don't care if I lose all this data. Of course, DELETE FROM table;
isn't doing all the same work, and if I follow up with a VACUUM (FULL, ANALYZE) table;
it's a much more expensive 21ms, so the DELETE
is only a win if I don't actually need the table pristine.
TRUNCATE table;
does a lot more fixed-cost work and housekeeping than DELETE
By contrast, a TRUNCATE
has to do a lot of work. It must allocate new files for the table, its TOAST table if any, and every index the table has. Headers must be written into those files and the system catalogs may need updating too (not sure on that point, haven't checked). It then has to replace the old files with the new ones or remove the old ones, and has to ensure the file system has caught up with the changes with a synchronization operation - fsync() or similar - that usually flushes all buffers to the disk. I'm not sure whether the the sync is skipped if you're running with the (data-eating) option fsync=off
.
I learned recently that TRUNCATE
must also flush all PostgreSQL's buffers related to the old table. This can take a non-trivial amount of time with huge shared_buffers
. I suspect this is why it's slower on your CI server.
The balance
Anyway, you can see that a TRUNCATE
of a table that has an associated TOAST table (most do) and several indexes could take a few moments. Not long, but longer than a DELETE
from a near-empty table.
Consequently, you might be better off doing a DELETE FROM table;
.
--
Note: on DBs before 9.0, CLUSTER table_id_seq ON table; ANALYZE table;
or VACUUM FULL ANALYZE table; REINDEX table;
would be a closer equivalent to TRUNCATE
. The VACUUM FULL
impl changed to a much better one in 9.0.