Gp relsizes stats#1
Open
Vlasdislav wants to merge 5813 commits into
Open
Conversation
GetSnapshotData() set TransactionXmin = MyProc->xmin, but when SnapshotResetXmin() advanced MyProc->xmin, it did not advance TransactionXmin correspondingly. That meant that TransactionXmin could be older than MyProc->xmin, and XIDs between than TransactionXmin and the real MyProc->xmin could be vacuumed away. One known consequence is in pg_subtrans lookups: we might try to look up the status of an XID that was already truncated away. Back-patch to all supported versions. Reviewed-by: Andres Freund Discussion: https://www.postgresql.org/message-id/d27a046d-a1e4-47d1-a95c-fbabe41debb4@iki.fi
The pgoutput module caches publication names in a list and frees it upon
invalidation. However, the code forgot to free the actual publication
names within the list elements, as publication names are pstrdup()'d in
GetPublication(). This would cause memory to leak in
CacheMemoryContext, bloating it over time as this context is not
cleaned.
This is a problem for WAL senders running for a long time, as an
accumulation of invalidation requests would bloat its cache memory
usage. A second case, where this leak is easier to see, involves a
backend calling SQL functions like pg_logical_slot_{get,peek}_changes()
which create a new decoding context with each execution. More
publications create more bloat.
To address this, this commit adds a new memory context within the
logical decoding context and resets it each time the publication names
cache is invalidated, based on a suggestion from Amit Kapila. This
ensures that the lifespan of the publication names aligns with that of
the logical decoding context.
Contrary to the HEAD-only commit f0c569d71515 that has changed
PGOutputData to track this new child memory context, the context is
tracked with a static variable whose state is reset with a MemoryContext
reset callback attached to PGOutputData->context, so as ABI
compatibility is preserved in stable branches. This approach is based
on an suggestion from Amit Kapila.
Analyzed-by: Michael Paquier, Jeff Davis
Author: Masahiko Sawada
Reviewed-by: Amit Kapila, Michael Paquier, Euler Taveira, Hou Zhijie
Discussion: https://postgr.es/m/Z0khf9EVMVLOc_YY@paquier.xyz
Backpatch-through: 13
This fixes "unresolved external symbol" errors with extensions that use functions from libcommon. This was reported with pgvector. Reported-by: Andrew Kane Author: Vladlen Popolitov Backpatch-through: 16, where Meson was introduced Discussion: https://www.postgresql.org/message-id/CAOdR5yF0krWrxycA04rgUKCgKugRvGWzzGLAhDZ9bzNv8g0Lag@mail.gmail.com
This fixes "unresolved external symbol" errors with extensions that use functions from libpgport that need special CFLAGS to compile. Currently, that includes the CRC-32 functions. Commit 2571c1d5cc did this for libcommon, but I missed that libpqport has the same issue. Reported-by: Tom Lane Backpatch-through: 16, where Meson was introduced Discussion: https://www.postgresql.org/message-id/CAOdR5yF0krWrxycA04rgUKCgKugRvGWzzGLAhDZ9bzNv8g0Lag@mail.gmail.com
Commit aac2c9b4fde889d13f859c233c2523345e72d32b mandated such locking and attempted to fulfill that mandate, but it missed REASSIGN OWNED. Hence, it remained possible to lose VACUUM's inplace update of datfrozenxid if a REASSIGN OWNED processed that database at the same time. This didn't affect the other inplace-updated catalog, pg_class. For pg_class, REASSIGN OWNED calls ATExecChangeOwner() instead of the generic AlterObjectOwner_internal(), and ATExecChangeOwner() fulfills the locking mandate. Like in GRANT, implement this by following the locking protocol for any catalog subject to the generic AlterObjectOwner_internal(). It would suffice to do this for IsInplaceUpdateOid() catalogs only. Back-patch to v13 (all supported versions). Kirill Reshke. Reported by Alexander Kukushkin. Discussion: https://postgr.es/m/CAFh8B=mpKjAy4Cuun-HP-f_vRzh2HSvYFG3rhVfYbfEBUhBAGg@mail.gmail.com
Cause parallel workers to not check datallowconn, rolcanlogin, and ACL_CONNECT privileges. The leader already checked these things (except for rolcanlogin which might have been checked for a different role). Re-checking can accomplish little except to induce unexpected failures in applications that might not even be aware that their query has been parallelized. We already had the principle that parallel workers rely on their leader to pass a valid set of authorization information, so this change just extends that a bit further. Also, modify the ReservedConnections, datconnlimit and rolconnlimit logic so that these limits are only enforced against regular backends, and only regular backends are counted while checking if the limits were already reached. Previously, background processes that had an assigned database or role were subject to these limits (with rather random exclusions for autovac workers and walsenders), and the set of existing processes that counted against each limit was quite haphazard as well. The point of these limits, AFAICS, is to ensure the availability of PGPROC slots for regular backends. Since all other types of processes have their own separate pools of PGPROC slots, it makes no sense either to enforce these limits against them or to count them while enforcing the limit. While edge-case failures of these sorts have been possible for a long time, the problem got a good deal worse with commit 5a2fed911 (CVE-2024-10978), which caused parallel workers to make some of these checks using the leader's current role where before we had used its AuthenticatedUserId, thus allowing parallel queries to fail after SET ROLE. The previous behavior was fairly accidental and I have no desire to return to it. This patch includes reverting 73c9f91a1, which was an emergency hack to suppress these same checks in some cases. It wasn't complete, as shown by a recent bug report from Laurenz Albe. We can also revert fd4d93d26 and 492217301, which hacked around the same problems in one regression test. In passing, remove the special case for autovac workers in CheckMyDatabase; it seems cleaner to have AutoVacWorkerMain pass the INIT_PG_OVERRIDE_ALLOW_CONNS flag, now that that does what's needed. Like 5a2fed911, back-patch to supported branches (which sadly no longer includes v12). Discussion: https://postgr.es/m/1808397.1735156190@sss.pgh.pa.us
Before 728bd99, that has improved the support for 2PC files during recovery, the initial logic scanning files in pg_twophase was done so as files in the future of the transaction ID horizon were checked first, followed by a check if a transaction ID is aborted or committed which could involve a pg_xact lookup. After this commit, these checks have been done in reverse order. Files detected as in the future do not have a state that can be checked in pg_xact, hence this caused recovery to fail abruptly should an orphaned 2PC file in the future of the transaction ID horizon exist in pg_twophase at the beginning of recovery. A test is added to check for this scenario, using an empty 2PC with a transaction ID large enough to be in the future when running the test. This test is added in 16 and older versions for now. 17 and newer versions are impacted by a second bug caused by the addition of the epoch in the 2PC file names. An equivalent test will be added in these branches in a follow-up commit, once the second set of issues reported are fixed. Author: Vitaly Davydov, Michael Paquier Discussion: https://postgr.es/m/11e597-676ab680-8d-374f23c0@145466129 Backpatch-through: 13
pgoutput caches the attribute map of a relation, that is free()'d only
when validating a RelationSyncEntry. However, this code path is not
taken when calling any of the SQL functions able to do some logical
decoding, like pg_logical_slot_{get,peek}_changes(), leaking some memory
into CacheMemoryContext on repeated calls.
To address this, a relation's attribute map is allocated in
PGOutputData's cachectx, free()'d at the end of the execution of these
SQL functions when logical decoding ends. This is available down to 15.
v13 and v14 have a similar leak, which will be dealt with later.
Reported-by: Masahiko Sawada
Author: Vignesh C
Reviewed-by: Hou Zhijie
Discussion: https://postgr.es/m/CAD21AoDkAhQVSukOfH3_reuF-j4EU0-HxMqU3dU+bSTxsqT14Q@mail.gmail.com
Discussion: https://postgr.es/m/CALDaNm1hewNAsZ_e6FF52a=9drmkRJxtEPrzCB6-9mkJyeBBqA@mail.gmail.com
Backpatch-through: 15
Backpatch-through: 13
When looking up statistical data about an expression, we do not need to concern ourselves with the outer joins that could null the Vars/PHVs contained in the expression. Accounting for nullingrels in the expression could cause estimate_num_groups to count the same Var multiple times if it's marked with different nullingrels. This is incorrect, and could lead to "ERROR: corrupt MVNDistinct entry" when searching for multivariate n-distinct. Furthermore, the nullingrels could prevent us from matching an expression to expressional index columns or to the expressions in extended statistics, leading to inaccurate estimates. To fix, strip out all the nullingrels from the expression before we look up statistical data about it. There is one ensuing plan change in the regression tests, but it looks reasonable and does not compromise its original purpose. This patch could result in plan changes, but it fixes an actual bug, so back-patch to v16 where the outer-join-aware-Var infrastructure was introduced. Author: Richard Guo Discussion: https://postgr.es/m/CAMbWs4-2Z4k+nFTiZe0Qbu5n8juUWenDAtMzi98bAZQtwHx0-w@mail.gmail.com
Slightly faulty logic in the original jsonb code (commit d9134d0) results in an empty top level array sorting less than a json null. We can't change the sort order now since it would affect btree indexes over jsonb, so document the anomaly. Backpatch to all live branches (13 .. 17) In master, also add a code comment noting the anomaly. Reported-by: Yan Chengpen Reviewed-by: Jian He Discussion: https://postgr.es/m/OSBPR01MB45199DD8DA2D1CECD50518188E272@OSBPR01MB4519.jpnprd01.prod.outlook.com
It's possible that external code is calling smgrtruncate(). Any external callers might like to consider the recent changes to RelationTruncate(), but commit 38c579b0 should not have changed the function prototype in the back-branches, per ABI stability policy. Restore smgrtruncate()'s traditional argument list in the back-branches, but make it a wrapper for a new function smgrtruncate2(). The three callers in core can use smgrtruncate2() directly. In master (18-to-be), smgrtruncate2() is effectively renamed to smgrtruncate(), so this wart is cleaned up. Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CA%2BhUKG%2BThae6x6%2BjmQiuALQBT2Ae1ChjMh1%3DkMvJ8y_SBJZrvA%40mail.gmail.com
Commit 66aaabe7 (branches 13 - 17 only) was not acceptable to the Oracle Developer Studio compiler on build farm animal wrasse. It accidentally used a C++ style return statement to wrap a void function. None of the usual compilers complained, but it is right, that is not allowed in C. Fix. Reported-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/Z33vgfVgvOnbFLN9%40paquier.xyz
Change our ftruncate() macro to use the 64-bit variant of chsize(), and add a new macro to redirect lseek() to _lseeki64(). Back-patch to all supported releases, in preparation for a bug fix. Tested-by: Davinder Singh <davinder.singh@enterprisedb.com> Discussion: https://postgr.es/m/CAKZiRmyM4YnokK6Oenw5JKwAQ3rhP0YTz2T-tiw5dAQjGRXE3Q%40mail.gmail.com
walmethods.c used off_t to navigate around a pg_wal.tar file that could exceed 2GB, which doesn't work on Windows and would fail with misleading errors. Use pgoff_t instead. Back-patch to all supported branches. Author: Davinder Singh <davinder.singh@enterprisedb.com> Reported-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Discussion: https://postgr.es/m/CAKZiRmyM4YnokK6Oenw5JKwAQ3rhP0YTz2T-tiw5dAQjGRXE3Q%40mail.gmail.com
This error message stated the privileges required to add a member to a group even if the user was trying to drop a member: postgres=> alter group a drop user b; ERROR: permission denied to alter role DETAIL: Only roles with the ADMIN option on role "a" may add members. Since the required privileges for both operations are the same, we can fix this by modifying the message to mention both adding and dropping members: postgres=> alter group a drop user b; ERROR: permission denied to alter role DETAIL: Only roles with the ADMIN option on role "a" may add or drop members. Author: ChangAo Chen Reviewed-by: Tom Lane Discussion: https://postgr.es/m/tencent_FAA0D00E3514AAF0BBB6322542A6094FEF05%40qq.com Backpatch-through: 16
The ldapscheme option was missed when inspecing the HbaLine for assembling rows for the pg_hba_file_rules function. Backpatch to all supported versions. Author: Laurenz Albe <laurenz.albe@cybertec.at> Reported-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Bug: 18769 Discussion: https://postgr.es/m/18769-dd8610cbc0405172@postgresql.org Backpatch-through: v13
PLy_spi_execute_plan (PLyPlan.execute) and PLy_cursor_plan (plpy.cursor) use PLy_output_convert to convert Python values into Datums that can be passed to the query-to-execute. But they failed to pay much attention to its warning that it can leave "cruft generated along the way" behind. Repeated use of these methods can result in a substantial memory leak for the duration of the calling plpython function. To fix, make a temporary memory context to invoke PLy_output_convert in. This also lets us get rid of the rather fragile code that was here for retail pfree's of the converted Datums. Indeed, we don't need the PLyPlanObject.values field anymore at all, though I left it in place in the back branches in the name of ABI stability. Mat Arye and Tom Lane, per report from Mat Arye. Back-patch to all supported branches. Discussion: https://postgr.es/m/CADsUR0DvVgnZYWwnmKRK65MZg7YLUSTDLV61qdnrwtrAJgU6xw@mail.gmail.com
When deparsing an XMLTABLE() expression, XML namespace names were not quoted. However, since they are parsed as ColLabel tokens, some names require double quotes to ensure that they are properly interpreted. Fix by using quote_identifier() in the deparsing code. Back-patch to all supported versions. Dean Rasheed, reviewed by Tom Lane. Discussion: https://postgr.es/m/CAEZATCXTpAS%3DncfLNTZ7YS6O5puHeLg_SUYAit%2Bcs7wsrd9Msg%40mail.gmail.com
Commit 27a1f8d108 missed updating the max HBA option count to account for the new option added. Fix by bumping the counter and adjust the relevant comment to match. Backpatch down to all supported branches like the erroneous commit. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/286764.1736697356@sss.pgh.pa.us Backpatch-through: v13
This function expects an "int64" as result and stores the number of pages to add to the index scan bitmap as an "int", multiplying its final result by 10. For a relation large enough, this can theoretically overflow if counting more than (INT32_MAX / 10) pages, knowing that the number of pages is upper-bounded by MaxBlockNumber. To avoid the overflow, this commit redefines "totalpages", used to calculate the result, to be an "int64" rather than an "int". Reported-by: Evgeniy Gorbanyov Author: James Hunter Discussion: https://www.postgresql.org/message-id/07704817-6fa0-460c-b1cf-cd18f7647041@basealt.ru Backpatch-through: 13
If a new catalog tuple is inserted that belongs to a catcache list entry, and cache invalidation happens while the list entry is being built, the list entry might miss the newly inserted tuple. To fix, change the way we detect concurrent invalidations while a catcache entry is being built. Keep a stack of entries that are being built, and apply cache invalidation to those entries in addition to the real catcache entries. This is similar to the in-progress list in relcache.c. Back-patch to all supported versions. Reviewed-by: Noah Misch Discussion: https://www.postgresql.org/message-id/2234dc98-06fe-42ed-b5db-ac17384dc880@iki.fi
The ecpg command includes code to warn about unsupported COPY FROM STDIN statements in input files. However, since commit 3d009e4, this functionality has been broken due to a bug introduced in that commit, causing ecpg to fail to detect the statement. This commit resolves the issue, restoring ecpg's ability to detect COPY FROM STDIN and issue a warning as intended. Back-patch to all supported versions. Author: Ryo Kanbayashi Reviewed-by: Hayato Kuroda, Tom Lane Discussion: https://postgr.es/m/CANOn0Ez_t5uDCUEV8c1YORMisJiU5wu681eEVZzgKwOeiKhkqQ@mail.gmail.com
In the name of ABI stability (that is, to avoid a library major version bump for libpq), libpq still exports a version of pqsignal() that we no longer want to use ourselves. However, since that has the same link name as the function exported by src/port/pqsignal.c, there is a link ordering dependency determining which version will actually get used by code that uses libpq as well as libpgport.a. It now emerges that the wrong version has been used by pgbench and psql since commit 06843df4a rearranged their link commands. This can result in odd failures in pgbench with the -T switch, since its SIGALRM handler will now not be marked SA_RESTART. psql may have some edge-case problems in \watch, too. Since we don't want to depend on link ordering effects anymore, let's fix this in the same spirit as b6c7cfac8: use macros to change the actual link names of the competing functions. We cannot change legacy-pqsignal.c's exported name of course, so the victim has to be src/port/pqsignal.c. In master, rename its exported name to be pqsignal_fe in frontend or pqsignal_be in backend. (We could perhaps have gotten away with using the same symbol in both cases, but since the FE and BE versions now work a little differently, it seems advisable to use different names.) In back branches, rename to pqsignal_fe in frontend but keep it as pqsignal in backend. The frontend change could affect third-party code that is calling pqsignal from libpgport.a or libpgport_shlib.a, but only if the code is compiled against port.h from a different minor release than libpgport. Since we don't support using libpgport as a shared library, it seems unlikely that there will be such a problem. I left the backend symbol unchanged to avoid an ABI break for extensions. This means that the link ordering hazard still exists for any extension that links against libpq. However, none of our own extensions use both pqsignal() and libpq, and we're not making things any worse for third-party extensions that do. Report from Andy Fan, diagnosis by Fujii Masao, patch by me. Back-patch to all supported branches, as 06843df4a was. Discussion: https://postgr.es/m/87msfz5qv2.fsf@163.com
These facilities were originally in the recovery TAP test 039_end_of_wal.pl. A follow-up bug fix with a TAP test doing similar WAL manipulations requires them, and all these had better not be duplicated due to their complexity. The routine names are tweaked to use "wal" more consistently, similarly to the existing "advance_wal". In v14 and v13, the new routines are moved to PostgresNode.pm. 039_end_of_wal.pl is updated to use the refactored routines, without changing its coverage. Reviewed-by: Alexander Kukushkin Discussion: https://postgr.es/m/CAFh8B=mozC+e1wGJq0H=0O65goZju+6ab5AU7DEWCSUA2OtwDg@mail.gmail.com Backpatch-through: 13
We should run the expression subtrees of PartitionedRelPruneInfo structs through fix_scan_expr. Failure to do so means that AlternativeSubPlans within those expressions won't be cleaned up properly, resulting in "unrecognized node type" errors since v14. It seems fairly likely that at least some of the other steps done by fix_scan_expr are important here as well, resulting in as-yet- undetected bugs. Therefore, I've chosen to back-patch this to all supported branches including v13, even though the known symptom doesn't manifest in v13. Per bug #18778 from Alexander Lakhin. Discussion: https://postgr.es/m/18778-24cd399df6c806af@postgresql.org
This commit reverts 8f67f994e8ea (down to v13) and c3de0f9eed38 (down to v17), as these are proving to not be completely correct regarding two aspects: - In v17 and newer branches, c3de0f9eed38's check for epoch handling is incorrect, and does not correctly handle frozen epochs. A logic closer to widen_snapshot_xid() should be used. The 2PC code should try to integrate deeper with FullTransactionIds, 5a1dfde8334b being not enough. - In v13 and newer branches, 8f67f994e8ea is a workaround for the real issue, which is that we should not attempt CLOG lookups without reaching consistency. This exists since 728bd99, and this is reachable with ProcessTwoPhaseBuffer() called by restoreTwoPhaseData() at the beginning of recovery. Per discussion with Noah Misch. Discussion: https://postgr.es/m/20250116010051.f3.nmisch@google.com Backpatch-through: 13
XLogPageRead() checks immediately for an invalid WAL record header on a standby, to be able to handle the case of continuation records that need to be read across two different sources. As written, the check was too generic, applying to any target LSN. Based on an analysis by Kyotaro Horiguchi, what really matters is to make sure that the page header is checked when attempting to read a LSN at the boundary of a segment, to handle the case of a continuation record that spawns across multiple pages when dealing with multiple segments, as WAL receivers are spawned they request WAL from the beginning of a segment. This fix has been proposed by Kyotaro Horiguchi. This could cause standbys to loop infinitely when dealing with a continuation record during a timeline jump, in the case where the contents of the record in the follow-up page are invalid. Some regression tests are added to check such scenarios, able to reproduce the original problem. In the test, the contents of a continuation record are overwritten with junk zeros on its follow-up page, and replayed on standbys. This is inspired by 039_end_of_wal.pl, and is enough to show how standbys should react on promotion by not being stuck. Without the fix, the test would fail with a timeout. The test to reproduce the problem has been written by Alexander Kukushkin. The original check has been introduced in 0668719, for a similar problem. Author: Kyotaro Horiguchi, Alexander Kukushkin Reviewed-by: Michael Paquier Discussion: https://postgr.es/m/CAFh8B=mozC+e1wGJq0H=0O65goZju+6ab5AU7DEWCSUA2OtwDg@mail.gmail.com Backpatch-through: 13
If a WaitEventSetWait() caller asks for multiple events, an already set latch would previously prevent other events from being reported at the same time. Now, we'll also poll the kernel for other events that would fit in the caller's output buffer with a zero wait time. This policy change doesn't affect callers that ask for only one event. The main caller affected is the postmaster. If its latch is set extremely frequently by backends launching workers and workers exiting, we don't want it to handle only those jobs and ignore incoming client connections. Back-patch to 16 where the postmaster began using the API. The fast-return policy changed here is older than that, but doesn't cause any known problems in earlier releases. Reported-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/Z1n5UpAiGDmFcMmd%40nathan
The freshly-released 2025a version of tzdata has a refined estimate for the longitude of Manila, changing their value for LMT in pre-standardized-timezone days. This changes the output of one of our test cases. Since we need to be able to run with system tzdata files that may or may not contain this update, we'd better stop making that specific test. I switched it to use Asia/Singapore, which has a roughly similar UTC offset. That LMT value hasn't changed in tzdb since 2003, so we can hope that it's well established. I also noticed that this set of make_timestamptz tests only exercises zones east of Greenwich, which seems rather sad, and was not the original intent AFAICS. (We've already changed these tests once to stabilize their results across tzdata updates, cf 66b737c; it looks like I failed to consider the UTC-offset-sign aspect then.) To improve that, add a test with Pacific/Honolulu. That LMT offset is also quite old in tzdb, so we'll cross our fingers that it doesn't get improved. Reported-by: Christoph Berg <cb@df7cb.de> Discussion: https://postgr.es/m/Z46inkznCxesvDEb@msg.df7cb.de Backpatch-through: 13
The prototype in src/include/utils/numeric.h declared the function as returning 'const bool', which conflicts with the definition in numeric.c that returns plain 'bool'. gcc tolerates this (const on a non-pointer return is silently meaningless), but clang flags it as 'conflicting types' and the build fails on macOS. Introduced by 'apache#392 Export numeric interface to public'. Match the .c definition: plain 'bool'.
PG 16 upstream commit b55f62a ('Unify DLSUFFIX on Darwin') changed DLSUFFIX on macOS from .so to .dylib so the suffix would match both linkable shared libraries and dlopen'd modules. Cloudberry, however, has many places that still hard-code '$libdir/foo.so' — the catalog SQL bootstrap scripts, cdb_init.d, and a number of expected/*.out files. When PG sees an explicit '.so' suffix in a library reference it does NOT re-append DLSUFFIX, so a .so / .dylib divergence breaks the catalog bootstrap (FATAL: could not access file 'foo'). Restore the pre-PG16 behaviour of DLSUFFIX=.so on darwin (via src/template/darwin) so all those hard-coded references resolve. Two follow-on adjustments are needed: * configure: the Python-shared-library probe builds a candidate path as '$python_libdir/lib$ldlibrary$DLSUFFIX'. macOS Python ships its shared lib as .dylib regardless of what DLSUFFIX is set to for modules; without a .dylib fallback the probe fails with 'could not find shared library for Python'. Try .dylib alongside DLSUFFIX on darwin. * src/test/regress/GNUmakefile: a handful of install/uninstall lines hard-coded '.so' for the test-helper modules (regress, test_hook, query_info_hook_test). Replace with $(DLSUFFIX) so they keep working regardless of the value.
After this commit, plain 'make' (without any CUSTOM_COPT= on the
command line) builds cleanly on macOS with Apple clang. Previously
users had to remember a long override.
The warning categories that needed demoting (darwin only):
-Wuninitialized — clang flags spots upstream gcc accepts.
-Wgnu-variable-sized-type-not-at-end
— clang-only; fires on PG catalog headers
like pg_task.h with inline struct-plus-
trailing-text declarations.
-Wunused-function — clang flags static functions never
referenced in the TU (a few exist in
currently-disabled code paths, e.g.
ic_udpifc.c).
-Wdeprecated-non-prototype
— clang-only; flags K&R-style 'foo()'
forward declarations. Where the
mismatch is a real bug we fix it
inline; this demotion covers the rest.
The block is gated to PORTNAME=darwin so the Linux gcc build path
is unchanged: -Werror remains in effect for all categories there.
The forward declaration at line 838 was
static void initSndBufferPool();
which under C's old K&R rules means 'function with unspecified
arguments' — but the actual definition takes a SendBufferPool *:
static void
initSndBufferPool(SendBufferPool *p)
and the single caller at line 3677 passes &snd_buffer_pool. gcc
silently accepts the mismatch; Apple clang correctly rejects it
under -Wdeprecated-non-prototype. C2x will reject it on every
compiler. Match the forward declaration to the definition.
An duct tape for this was already added as fc8aab8, through redo path was not patched there. Copy same guard into redoDistributedCommitRecord function boby.
Previously, MERGE ... WHEN MATCHED THEN UPDATE SET <dist_key> = ... was rejected with "cannot update column in merge with distributed column". This commit adds support by extending the SplitMerge node to handle distribution key updates via DELETE + INSERT routing. Key changes: Planner (preptlist.c, createplan.c, setrefs.c, cdbpath.c): - Detect distribution key modification in MERGE UPDATE actions and set merge_need_split_update flag - Add all target table columns to subplan targetlist so SplitMerge can project complete rows for INSERT - Expand UPDATE action targetlists to include all columns (not just SET columns) using expand_insert_targetlist - Build SplitMerge targetlist in N+M+1 format: N target table columns + M subplan columns + 1 DMLAction column - Always use root table action lists (not per-partition adjusted lists) to ensure hashAttnos match root attribute numbers - Add set_splitmerge_tlist_references for proper OUTER_VAR conversion Executor (nodeSplitMerge.c, nodeModifyTable.c): - SplitMerge splits MATCHED UPDATE into DELETE + INSERT tuple pair, each routed to the correct segment via hash computation - NOT MATCHED rows get PASSTHROUGH action for normal ExecMerge processing - ModifyTable handles DMLAction-tagged tuples from SplitMerge output - Support lazy partition routing for CMD_MERGE DML_INSERT Refactoring (nodeSplitMerge.c): - Extract computeTargetSegment(), SwitchResultRelForPartition(), BuildRootUpdateTupleDesc() helper functions - Define SPLITMERGE_ACTION_PASSTHROUGH constant - Remove dead code and consolidate duplicated logic
Uncomment the two test blocks that verify view behavior when pg_depend entries are corrupted (DELETE FROM pg_depend + ALTER TABLE + ROLLBACK). These were previously disabled because DELETE FROM pg_depend only affects the coordinator in a distributed environment. The fix uses allow_segment_DML GUC combined with a helper function marked EXECUTE ON ALL SEGMENTS to delete from segment catalogs as well.
ORCA now correctly supports runtime filter pushdown, so uncomment the previously disabled test block that verifies pushdown behavior with optimizer on.
Restore register_dirty_segment() in mdcreate() and register_dirty_segment_ao() in ao_insert_replay() that were removed during PG16 merge, ensuring newly created or written segments are properly fsync'd at the next checkpoint. Fix two bugs that cause checkpointer PANIC on standby when processing AO fsync requests for truncated/dropped files: 1. aosyncfiletag(): return -1 instead of elog(ERROR) when file cannot be opened, matching mdsyncfiletag() behavior. 2. ao_truncate_replay(): send SYNC_FORGET_REQUEST after truncating an AO segment file to cancel previously registered fsync requests. Add register_forget_request_ao() as the AO counterpart to register_forget_request(), using SYNC_HANDLER_AO.
After a gang loss (e.g. QE terminated by pg_terminate_backend), the QD assigns a new gp_session_id via GpDropTempTables(), but the QD's pg_stat_activity entry (MyBEEntry->st_session_id) was never updated because pgstat_report_sessionid() was commented out and the function was removed during the PG16 merge. This caused pg_stat_activity.sess_id on QD to become stale, diverging from the actual gp_session_id. Any query that joins QD's pg_stat_activity with segment pg_stat_activity on sess_id would fail to match, making gp_sync_lc_gucs test flaky. Restore pgstat_report_sessionid() following the existing pgstat_report_resgroup() pattern, and uncomment the call in GpDropTempTables(). Update the gp_sync_lc_gucs expected output since the second pg_terminate_backend now correctly finds and terminates QEs.
…rror down The vacuum_progress tests simulate mirror failure during vacuum to test worker process change behavior. Previously, after stopping the mirror and resuming the walsender, the tests relied on passive FTS detection which could time out in CI. Add wait_for_mirror_down() helper that actively triggers FTS probe scans until mirror is confirmed down, ensuring the FTS version change propagates to QD and gang gets properly reset. Also update vacuum_progress_column expected output for post-cleanup phase progress values, which now reflect actual work done by the new vacuum worker (heap_blks_vacuumed, index_vacuum_count) after the worker change is reliably triggered.
0e88a45 to
438746c
Compare
Two related changes that together make PAX aux toasting match the
rest of the Cloudberry / Postgres tree:
1. src/backend/catalog/toasting.c
Drop the Cloudberry-only branch that routed TOAST for
pg_ext_aux parents back into pg_ext_aux:
else if (IsExtAuxNamespace(rel->rd_rel->relnamespace))
namespaceid = PG_EXTAUX_NAMESPACE;
PAX aux tables (pg_pax_blocks_<oid>) now get a normal TOAST
companion in pg_toast, the same as every other heap.
2. contrib/pax_storage/.../pax_aux_table.cc
Aux inherits the parent's persistence for PERMANENT / UNLOGGED
verbatim, but TEMP is clamped down to PERMANENT. Background:
the aux always lives in pg_ext_aux (not in pg_temp_<N>), so a
TEMP-persistence row in pg_ext_aux ends up mis-classified by
RELATION_IS_OTHER_TEMP — relcache.c sets rd_islocaltemp=false
because pg_ext_aux is not a temp namespace, and then
reindex_index() (or any catalog walk that touches the aux)
bails with "cannot reindex temporary tables of other
sessions". Clamping TEMP→PERMANENT avoids the mis-trigger;
the trade-off is that the aux of a TEMP PAX table outlives
the session, which is acceptable given the long-standing
FIXME in the same file ("temporary table in aux namespace is
not supported yet").
Resulting layout:
PERMANENT pax_tab aux in pg_ext_aux (p)
toast in pg_toast (p)
idx in pg_toast (p)
UNLOGGED u_pax aux in pg_ext_aux (u)
toast in pg_toast (u)
idx in pg_toast (u)
TEMP pax_tmp parent in pg_temp_<N> (t)
aux in pg_ext_aux (p) <-- clamped
toast in pg_toast (p)
idx in pg_toast (p)
Verified:
- CREATE TABLE / UNLOGGED / TEMP USING pax all produce the
layout above.
- INSERT round-trip works for all three persistence modes.
Commit e2d4ef8 (the fix for CVE-2017-7484) added security checks to the selectivity estimation functions to prevent them from running user-supplied operators on data obtained from pg_statistic if the user lacks privileges to select from the underlying table. In cases involving inheritance/partitioning, those checks were originally performed against the child RTE (which for plain inheritance might actually refer to the parent table). Commit 553d2ec then extended that to also check the parent RTE, allowing access if the user had permissions on either the parent or the child. It turns out, however, that doing any checks using the child RTE is incorrect, since securityQuals is set to NULL when creating an RTE for an inheritance child (whether it refers to the parent table or the child table), and therefore such checks do not correctly account for any RLS policies or security barrier views. Therefore, do the security checks using only the parent RTE. This is consistent with how RLS policies are applied, and the executor's ACL checks, both of which use only the parent table's permissions/policies. Similar checks are performed in the extended stats code, so update that in the same way, centralizing all the checks in a new function. In addition, note that these checks by themselves are insufficient to ensure that the user has access to the table's data because, in a query that goes via a view, they only check that the view owner has permissions on the underlying table, not that the current user has permissions on the view itself. In the selectivity estimation functions, there is no easy way to navigate from underlying tables to views, so add permissions checks for all views mentioned in the query to the planner startup code. If the user lacks permissions on a view, a permissions error will now be reported at planner-startup, and the selectivity estimation functions will not be run. Checking view permissions at planner-startup in this way is a little ugly, since the same checks will be repeated at executor-startup. Longer-term, it might be better to move all the permissions checks from the executor to the planner so that permissions errors can be reported sooner, instead of creating a plan that won't ever be run. However, such a change seems too far-reaching to be back-patched. Back-patch to all supported versions. In v13, there is the added complication that UPDATEs and DELETEs on inherited target tables are planned using inheritance_planner(), which plans each inheritance child table separately, so that the selectivity estimation functions do not know that they are dealing with a child table accessed via its parent. Handle that by checking access permissions on the top parent table at planner-startup, in the same way as we do for views. Any securityQuals on the top parent table are moved down to the child tables by inheritance_planner(), so they continue to be checked by the selectivity estimation functions. Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Noah Misch <noah@leadboat.com> Backpatch-through: 13 Security: CVE-2025-8713
Maliciously-crafted object names could achieve SQL injection during restore. CVE-2012-0868 fixed this class of problem at the time, but later work reintroduced three cases. Commit bc8cd50 (back-patched to v11+ in 2023-05 releases) introduced the pg_dump case. Commit 6cbdbd9 (v12+) introduced the two pg_dumpall cases. Move sanitize_line(), unchanged, to dumputils.c so pg_dumpall has access to it in all supported versions. Back-patch to v13 (all supported versions). Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Backpatch-through: 13 Security: CVE-2025-8715
A malicious server could inject psql meta-commands into plain-text dump output (i.e., scripts created with pg_dump --format=plain, pg_dumpall, or pg_restore --file) that are run at restore time on the machine running psql. To fix, introduce a new "restricted" mode in psql that blocks all meta-commands (except for \unrestrict to exit the mode), and teach pg_dump, pg_dumpall, and pg_restore to use this mode in plain-text dumps. While at it, encourage users to only restore dumps generated from trusted servers or to inspect it beforehand, since restoring causes the destination to execute arbitrary code of the source superusers' choice. However, the client running the dump and restore needn't trust the source or destination superusers. Reported-by: Martin Rakhmanov Reported-by: Matthieu Denais <litezeraw@gmail.com> Reported-by: RyotaK <ryotak.mail@gmail.com> Suggested-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Noah Misch <noah@leadboat.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Security: CVE-2025-8714 Backpatch-through: 13
…ced extensions section
Feat: Add worker_sighup for SIGHUP to work correctly Co-authored-by: Vladislav Shchetinin <vshchetininv@qavm-62293b9d.qemu>
438746c to
070b61e
Compare
Add a gpcontrib debug extension that rejects full scans on partitioned ables when partition pruning is ineffective. The extension is built only with enable_debug_extensions=yes and supports Planner and ORCA plans, including Append, MergeAppend, PartitionSelector, and DynamicScan cases.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #ISSUE_Number
What does this PR do?
Type of Change
Breaking Changes
Test Plan
make installcheckmake -C src/test installcheck-cbdb-parallelImpact
Performance:
User-facing changes:
Dependencies:
Checklist
Additional Context
CI Skip Instructions