Update-Heavy Tables (DB VACUUM)
Update-heavy tables are tables that include data that, based on a specific business need, is expected to be updated very frequently to remain relevant. Some good examples are clients’ bank account balance, the number of passengers on a train, available cinema tickets, the number of available taxi drivers circulating in a particular area, etc.
Update in PostgreSQL
To update a record PostgreSQL reviews the page with that record in search for free space of an appropriate size. If the space has been identified, the new record is entered onto the same page. If the space has yet to be found, the table is reviewed from top to bottom in search of a space of an appropriate size. If the space has been found, the new record will be placed there. If not, the data will be entered at the bottom of the table. If needed, a new data page is created. At the same time, for the previous version of the record, the system column xmax is set making the former record version invisible to newer transactions.
Vacuum in PostgreSQL
Under the MVCC (multi-version concurrency control) paradigm, PostgreSQL keeps previous versions of modified records. These versions still occupy space in that table although they aren’t visible for transactions. The purpose of VACUUM (and AUTOVACUUM) is to reinstate the space that is taken up by these outdated records so that it can be re-utilized. In short, the VACUUM process marks up the space that is taken up by the expired records as free and ready to be re-utilized. After the vacuum has done its work, the space doesn’t become free, it’s only available for reuse within a vacuumed table or index. Thus, it may seem that the size of the table or index has been reduced. In fact, after vacuuming the size is the same as before the vacuum.
HOT Updates
HOT (Heap Only Tuple) – the feature that overcomes some of the inefficiencies of how PostgreSQL handles UPDATEs PostgreSQL implements multiversioning by keeping the old version of the table row in the table – an UPDATE adds a new row version (“tuple”) of the row and marks the old version as invalid. In many respects, an UPDATE in PostgreSQL is not that different from a DELETE followed by an INSERT.
A Heap Only Tuple is a tuple that is not referenced from outside the table block. Instead, a “forwarding address” (its line pointer number) is stored in the old row version (pic). That only works if the new and the old version of the row are in the same block. The external address of the row (the original line pointer) remains unchanged. To access the heap-only tuple, PostgreSQL has to follow the “HOT chain” within the block. The advantage is that PostgreSQL doesn’t have to modify indexes. Since the external address of the tuple stays the same, the original index entry can still be used. Index scans follow the HOT chain to find the appropriate tuple. Dead tuples can be removed without the need for VACUUM. If there are several HOT updates on a single row, the HOT chain grows longer. Now any backend that processes a block and detects a HOT chain with dead tuples (even a SELECT!) will try to lock and reorganize the block, removing intermediate tuples. This is possible because there are no outside references to these tuples. This greatly reduces the need for VACUUM for UPDATE-heavy workloads.
Requirements for HOT updates are:
- there must be enough space in the block containing the updated row, and
- there is no index defined on any column whose value it modified.
Fillfactor
The fillfactor is a solution for the first HOT request. The fillfactor for a table is a percentage between 10 and 100. 100 (complete packing) is the default. When a smaller fillfactor is specified, INSERT operations pack table pages only to the indicated percentage; the remaining space on each page is reserved for updating rows on that page. This gives UPDATE a chance to place the updated copy of a row on the same page as the original, which is more efficient than placing it on a different page. For a table whose entries are never updated, complete packing is the best choice, but in heavily updated tables smaller fillfactor is more appropriate.
Issue and Solution
We had a table of 14k+ records, where 1000 rows were updated every minute. It was a very important table since there were a lot of FKs around that table.


Our solution was to update 750 different records every minute. Important records more frequently.
Set autovacuum to run on 700 updates – after every update move, the update of indexes to separate transactions since it was not so important to be updated every minute.
Fillfactor – wasn't implemented since we already made great progress and data was already inserted. For the fillfactor to be applied, we need a full vacuum (block DB access for 45–240 minutes) or to migrate data to another table with already proper fillfactor and then insert data and change FKs from other tables.

Conclusion
Configuring PostgreSQL's autovacuum to run more frequently on update-heavy tables significantly improved performance by optimizing query execution. While adjusting the fillfactor could further enhance performance, we chose not to implement it to avoid the downtime associated with a full vacuum.
References:
- Data Egret. (2024, December 1). Vacuuming update-heavy tables.
- Data Egret. (2024, December 1). Accelerating VACUUM in Postgres.
- Cybertec. (2024, December 1). HOT updates in PostgreSQL for better performance.
- Cybertec. (2024, December 1). What is fillfactor and how does it affect PostgreSQL performance?