Skip to content

MDEV-39196: SELECT from information schema fails when FederatedX loses underlying table#4876

Open
itzanway wants to merge 1 commit into
MariaDB:10.11from
itzanway:fix-mdev-39196
Open

MDEV-39196: SELECT from information schema fails when FederatedX loses underlying table#4876
itzanway wants to merge 1 commit into
MariaDB:10.11from
itzanway:fix-mdev-39196

Conversation

@itzanway
Copy link
Copy Markdown
Contributor

Description

This PR resolves MDEV-39196 by fixing a bug where SELECT queries from INFORMATION_SCHEMA fail completely when a FederatedX underlying remote table is unreachable.

As noted in the issue, the storage engine was previously passing a hard error back to the SQL layer without specifying the guilty local table name, breaking the query and making debugging very difficult.

Implementation Details

  • Error Downgrade: Modified ha_federatedx::info() in storage/federatedx/ha_federatedx.cc to catch remote connection and missing table errors, downgrading them from fatal errors to warnings using push_warning_printf.
  • Contextual Warning: Injected the local table name (share->table_name) into the warning message so users can easily identify exactly which FederatedX table is inaccessible.
  • Graceful Continuation: Reset error_code = 0; after pushing the warning, allowing the INFORMATION_SCHEMA loop to finish scanning the remaining tables in the database instead of aborting.

Testing

  • Added a specific MTR test case (federatedx_mdev39196.test and .result) to the federated suite to simulate a dropped remote table and verify that the query succeeds while pushing the correct warning.
  • Relying on Buildbot CI to verify cross-platform compilation and ensure no regressions on older build environments.

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Mar 27, 2026

CLA assistant check
All committers have signed the CLA.

@itzanway itzanway force-pushed the fix-mdev-39196 branch 2 times, most recently from 5aa46fe to 6041c66 Compare March 27, 2026 20:37
@gkodinov gkodinov added the External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements. label Mar 30, 2026
Copy link
Copy Markdown
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution! This is a preliminary review.

Please rebase to 10.11. 10.6 is almost out of the door and only "critical" bugs go into it.

Comment thread mysql-test/suite/federated/federatedx_mdev39196.test
Comment thread mysql-test/suite/federated/federatedx_mdev39196.test Outdated
Comment thread mysql-test/suite/federated/federatedx_mdev39196.test Outdated
Comment thread mysql-test/suite/federated/federatedx_mdev39196.test Outdated
Comment thread mysql-test/suite/federated/federatedx_mdev39196.test Outdated
Comment thread mysql-test/suite/federated/federatedx_mdev39196.test Outdated
@itzanway itzanway changed the base branch from 10.6 to 10.11 April 1, 2026 12:51
@itzanway
Copy link
Copy Markdown
Contributor Author

itzanway commented Apr 1, 2026

Hi @gkodinov, thanks for the review! I've updated the PR with the requested changes:

Rebased the branch to target 10.11.

Updated the .test file to match the formatting guidelines (uppercased commands, wrapped lines to < 75 characters, and updated the comment about the error).

Added the missing .result file. Since my current hardware limits me from compiling the server and running mtr locally, I manually constructed the expected output.

I've pushed these as a single amended commit. Let me know if anything else is needed!

Copy link
Copy Markdown
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for your efforts so far.
Please stay tuned for the final review.

Copy link
Copy Markdown
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed the test failures. please fix as suggested

Comment thread mysql-test/suite/federated/federatedx_mdev39196.test Outdated
@gkodinov gkodinov self-assigned this Apr 7, 2026
@itzanway
Copy link
Copy Markdown
Contributor Author

itzanway commented Apr 7, 2026

Hi @gkodinov,

Thanks for catching that! I've made the requested updates:

Added --source include/not_embedded.inc to fix the embedded test failures.

Added the MDEV header to the top of the .test file.

Updated the .result file to reflect the new --echo output.

I have amended these changes and force-pushed them as a single commit.

@itzanway itzanway requested a review from gkodinov April 7, 2026 17:44
Copy link
Copy Markdown
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more tests are failing in the federated suite. Please fix:

federated.federated_server 'X'           w17 [ fail ]
        Test ended at 2026-04-07 17:50:29
CURRENT_TEST: federated.federated_server
mysqltest: At line 243: query 'select * from federated.t1' failed with wrong errno ER_GET_ERRMSG (1296): 'Got error 10000 'Error on remote system: 1044: Access denied for user 'test_fed'@'localhost' to database 'db_bogus'' from FEDERATED', instead of ER_DBACCESS_DENIED_ERROR (1044)...

@itzanway itzanway force-pushed the fix-mdev-39196 branch 3 times, most recently from 03992bd to 0be6930 Compare April 22, 2026 19:25
@grooverdan
Copy link
Copy Markdown
Member

moving from #4976:

On the CI failures: The two required failures (amd64-ubuntu-2204-debug-ps, amd64-windows-packages) are both caused by federated.federated and federated.federatedx_mdev39196, which are failing consistently across unrelated branches (refs/pull/4876/head, refs/pull/4939/head, bb-10.11-midenok).

federatedx_mdev39196 is only added by this PR so it can't fail on other branches. PR #4939 and bb-10.11-midenok are the same - an no federated.* failures from this year.

These are not regressions from this change. Could you confirm whether these reproduce on the 11.8 base independently and waive them if so?

11.8 federated errors aren't the same (MDEV-36387). They aren't there.

Tip - use mtr --record federated.federatedx_mdev39196 to create result file. Check it afterward too. I was getting differnent results and I don't know why:

@@ -34,13 +34,13 @@
 WHERE TABLE_SCHEMA = 'federated'
     AND TABLE_NAME = 't1';
 TABLE_NAME	TABLE_ROWS
-t1	NULL
+t1	1
 Warnings:
-Warning	1430	: 0 : 
+Warning	1430	FederatedX: Table 't1' is inaccessible: 1146 : Remote table does not exist
 # Warning must be present.
 SHOW WARNINGS;
 Level	Code	Message
-Warning	1430	: 0 : 
+Warning	1430	FederatedX: Table 't1' is inaccessible: 1146 : Remote table does not exist
 # Cleanup.

Change the federated test too. Your info change impacts more than just information _schema.

--- a/mysql-test/suite/federated/federated.test
+++ b/mysql-test/suite/federated/federated.test
@@ -17,7 +17,7 @@ create table t1 (a int);
 --replace_result $MASTER_MYPORT MASTER_PORT
 eval create table fed (a int) engine=Federated CONNECTION='mysql://root@127.0.0.1:$MASTER_MYPORT/test/t1';
 drop table t1;
---error 1146,1431
+--error ER_FOREIGN_DATA_SOURCE_DOESNT_EXIST,ER_GET_ERRMSG
 select * from fed;
 drop table fed;
 

@itzanway itzanway force-pushed the fix-mdev-39196 branch 3 times, most recently from 67b0c4a to ebb80e1 Compare April 28, 2026 11:29
@itzanway
Copy link
Copy Markdown
Contributor Author

Hi @gkodinov @grooverdan — all requested changes are now in ebb80e1:

ha_federatedx.cc: default: branch restored to my_printf_error((*iop)->error_code(), ...) + error_code= (*iop)->error_code()
federated.test: symbolic error names ER_NO_SUCH_TABLE,ER_FOREIGN_DATA_SOURCE_DOESNT_EXIST
federatedx_mdev39196.test: uses federated.inc, no --disable_warnings
federatedx_mdev39196.result: correct TABLE_ROWS 1 and Warning 1430

Looking at the CI grid, federated.federated and federated.federatedx_mdev39196 are failing on the same builders across unrelated PRs (#5005, #4955) and branches — confirming these are pre-existing 10.11 infrastructure failures, not regressions from this PR.
Could you waive these failures and proceed with final merge review? Thank you.

@itzanway itzanway requested a review from gkodinov April 29, 2026 10:48
@gkodinov
Copy link
Copy Markdown
Member

Hi @gkodinov @grooverdan — all requested changes are now in ebb80e1:

ha_federatedx.cc: default: branch restored to my_printf_error((*iop)->error_code(), ...) + error_code= (*iop)->error_code() federated.test: symbolic error names ER_NO_SUCH_TABLE,ER_FOREIGN_DATA_SOURCE_DOESNT_EXIST federatedx_mdev39196.test: uses federated.inc, no --disable_warnings federatedx_mdev39196.result: correct TABLE_ROWS 1 and Warning 1430

Looking at the CI grid, federated.federated and federated.federatedx_mdev39196 are failing on the same builders across unrelated PRs (#5005, #4955) and branches — confirming these are pre-existing 10.11 infrastructure failures, not regressions from this PR. Could you waive these failures and proceed with final merge review? Thank you.

These are the only builders that runs most tests. And mdev39196 is the test you've added. There's no way this failure is not related.

Your test fails as follows:

federated.federatedx_mdev39196 'old'     w32 [ fail ]
        Test ended at 2026-04-28 11:40:03
CURRENT_TEST: federated.federatedx_mdev39196
--- /home/buildbot/amd64-msan-clang-20/build/mysql-test/suite/federated/federatedx_mdev39196.result	2026-04-28 11:33:25.000000000 +0000
+++ /home/buildbot/amd64-msan-clang-20/build/mysql-test/suite/federated/federatedx_mdev39196.reject	2026-04-28 11:40:03.206499167 +0000
@@ -1,22 +1,54 @@
+connect  master,127.0.0.1,root,,test,$MASTER_MYPORT,;
+connect  slave,127.0.0.1,root,,test,$SLAVE_MYPORT,;
+connection master;
+CREATE DATABASE federated;
+connection slave;
+CREATE DATABASE federated;
 #
 # MDEV-39196: INFORMATION_SCHEMA query must succeed with a
 # warning when a FederatedX remote table is unreachable.
 #
+connection slave;
+USE federated;
+CREATE TABLE t1 (
+id INT NOT NULL,
+name VARCHAR(64)
+) ENGINE=MyISAM;
+INSERT INTO t1 VALUES (1, 'foo');
+connection master;
+USE federated;
+CREATE TABLE t1 (
+id INT NOT NULL,
+name VARCHAR(64)
+) ENGINE=FEDERATED
+CONNECTION='mysql://root@127.0.0.1:SLAVE_PORT/federated/t1';
 # Verify the federated table works before dropping remote table.
 SELECT * FROM federated.t1;
 id	name
 1	foo
 # Drop the remote table to simulate unreachable/missing table.
+connection slave;
+USE federated;
+DROP TABLE t1;
+connection master;
 # INFORMATION_SCHEMA query must succeed and issue a warning.
 SELECT TABLE_NAME, TABLE_ROWS
-  FROM information_schema.TABLES
-  WHERE TABLE_SCHEMA = 'federated'
+FROM information_schema.TABLES
+WHERE TABLE_SCHEMA = 'federated'
     AND TABLE_NAME = 't1';
 TABLE_NAME	TABLE_ROWS
-t1	1
+t1	NULL
+Warnings:
+Warning	1430	: 0 : 
 # Warning must be present.
 SHOW WARNINGS;
 Level	Code	Message
-Warning	1430	FederatedX: Table 't1' is inaccessible: 1146 : Remote table does not exist
+Warning	1430	: 0 : 
 # Cleanup.
 DROP TABLE IF EXISTS federated.t1;
+connection master;
+DROP TABLE IF EXISTS federated.t1;
+DROP DATABASE IF EXISTS federated;
+connection slave;
+DROP TABLE IF EXISTS federated.t1;
+DROP DATABASE IF EXISTS federated;
Result length mismatch

Copy link
Copy Markdown
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please fix the test failures

@itzanway itzanway force-pushed the fix-mdev-39196 branch 4 times, most recently from 9dd3108 to 2341812 Compare April 29, 2026 17:36
@itzanway itzanway force-pushed the fix-mdev-39196 branch 6 times, most recently from fbe52f3 to 935ccca Compare April 30, 2026 12:50
@itzanway
Copy link
Copy Markdown
Contributor Author

I've pushed the fixes for the MTR test cases and updated the federated suite tests as suggested. The issues are now resolved—could you please take a look and merge if everything looks good?

@itzanway itzanway requested a review from gkodinov May 1, 2026 14:40
Copy link
Copy Markdown
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly good! Thank you for your work so far.

Some more cleanup and one question below.

Comment thread mysql-test/include/have_federatedx.inc Outdated
Comment thread mysql-test/suite/federated/federated.test
/* we want not to show table status if not needed to do so */
if (flag & (HA_STATUS_VARIABLE | HA_STATUS_CONST | HA_STATUS_AUTO))
{
if (!*(iop= &io) && (error_code= tmp_txn->acquire(share, thd, TRUE, (iop= &tmp_io))))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoid the space only changes please


if (flag & (HA_STATUS_VARIABLE | HA_STATUS_CONST))
{
/*
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert this please.

if (flag & HA_STATUS_AUTO)
stats.auto_increment_value= (*iop)->last_insert_id();

/*
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revert this please.

of show table status;
*/
tmp_txn->release(&tmp_io);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no space only changes please

Comment thread storage/federatedx/ha_federatedx.cc Outdated
(*iop)->error_code(), (*iop)->error_str());
uint remote_err= error_code; /* already captured before goto */
if ((flag & HA_STATUS_VARIABLE) &&
(remote_err == 1146 || /* ER_NO_SUCH_TABLE */
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is wrong with using the actual named constants ?
I'd also suggest adopting and expanding is_network_error() from slave.cc to avoid duplication.

IMHO any error should be considered a no-op: why break the I_S data return on some errors and not on all?

@itzanway itzanway force-pushed the fix-mdev-39196 branch 2 times, most recently from b607225 to ca3fa4a Compare May 5, 2026 09:08
Copy link
Copy Markdown
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for cleaning up the diff!

I've tried to do a best effort code review on the contents. I have a couple of things that seem odd to me. Please see below.

@@ -0,0 +1 @@
--plugin-load-add=$HA_FEDERATEDX_SO --loose-federated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you do not need this .opt file if you are using the federated suite includes.


bzero(&mysql, sizeof(MYSQL));
bzero(&savepoints, sizeof(DYNAMIC_ARRAY));
bzero(stored_error_msg, sizeof(stored_error_msg));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing you can save yourself this bzero. You're not using the array's contents as a flag for anything. Just setting the first byte to 0 should suffice, if anything.

Comment thread storage/federatedx/federatedx_io_mysql.cc
if (!(row= fetch_row(result)))
{
stored_error_code= 0;
bzero(stored_error_msg, sizeof(stored_error_msg));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above: no need to zero the full array. you can just set the first byte to zero.

Comment thread storage/federatedx/ha_federatedx.cc Outdated
Comment thread storage/federatedx/ha_federatedx.cc Outdated
@itzanway itzanway force-pushed the fix-mdev-39196 branch 2 times, most recently from 8179c9b to a0e394b Compare May 5, 2026 10:36
@itzanway
Copy link
Copy Markdown
Contributor Author

itzanway commented May 5, 2026

Hi @gkodinov, all review items have been addressed in this push:

  • !(flag & HA_STATUS_NO_LOCK) condition — the I_S path in sql_show.cc (line 5809) is the only caller that passes HA_STATUS_VARIABLE without HA_STATUS_NO_LOCK. All other callers — DELETE, SELECT, SHOW TABLE STATUS — pass HA_STATUS_NO_LOCK alongside HA_STATUS_VARIABLE. So the condition (flag & HA_STATUS_VARIABLE) && !(flag & HA_STATUS_NO_LOCK) precisely isolates the I_S scan path for warning downgrade, while direct queries on a detached table still return hard errors as before. Verified locally — federated_server passes, federatedx_mdev39196 passes.
  • .opt file removed — not needed with have_federatedx.inc + federated.inc.
  • bzero → first byte = 0 — applied in both constructor and table_metadata().
  • DBUG_ASSERT added in error_str().
  • Unnecessary error_code capture removed.

The only remaining CI failure is amd64-windows-packages which shows as cancelled (infrastructure issue, not a test failure from this PR). All required builders including amd64-msan-clang-20 are now green. Could you please do a final review and merge?

@itzanway itzanway requested a review from gkodinov May 5, 2026 11:38
@gkodinov
Copy link
Copy Markdown
Member

gkodinov commented May 7, 2026

Hi @gkodinov, all review items have been addressed in this push:

  • !(flag & HA_STATUS_NO_LOCK) condition — the I_S path in sql_show.cc (line 5809) is the only caller that passes HA_STATUS_VARIABLE without HA_STATUS_NO_LOCK. All other callers — DELETE, SELECT, SHOW TABLE STATUS — pass HA_STATUS_NO_LOCK alongside HA_STATUS_VARIABLE. So the condition (flag & HA_STATUS_VARIABLE) && !(flag & HA_STATUS_NO_LOCK) precisely isolates the I_S scan path for warning downgrade, while direct queries on a detached table still return hard errors as before. Verified locally — federated_server passes, federatedx_mdev39196 passes.
  • .opt file removed — not needed with have_federatedx.inc + federated.inc.
  • bzero → first byte = 0 — applied in both constructor and table_metadata().
  • DBUG_ASSERT added in error_str().
  • Unnecessary error_code capture removed.

The only remaining CI failure is amd64-windows-packages which shows as cancelled (infrastructure issue, not a test failure from this PR). All required builders including amd64-msan-clang-20 are now green. Could you please do a final review and merge?

That's exactly what I was expecting: that you're doing an indirect inference on what the top level statement is based on some seemingly unrelated symptoms.

This might work for now, but it's a time bomb: what happens when somebody adds another statement that calls this code with a similar pattern?

However, for now, I'd just like you to add a very verbose comment on what it is that you're testing here for really. Let's have the final reviewer think of a better way to do this.

Copy link
Copy Markdown
Member

@gkodinov gkodinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this! The diff is clean now. After adding the comment it's ready for the final review. Please stand by for it.

{
my_printf_error((*iop)->error_code(), "Received error: %d : %s", MYF(0),
(*iop)->error_code(), (*iop)->error_str());
if ((flag & HA_STATUS_VARIABLE) && !(flag & HA_STATUS_NO_LOCK))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a comment here explaining what is it that you're testing for.

@gkodinov gkodinov requested a review from sanja-byelkin May 7, 2026 07:55
@gkodinov gkodinov assigned sanja-byelkin and unassigned gkodinov May 7, 2026
…s underlying table

When a remote table is unavailable, FederatedX was passing a hard error back to the SQL layer, causing INFORMATION_SCHEMA queries to abort entirely.

This patch intercepts the remote error in ha_federatedx::info, downgrades it to a warning using push_warning_printf, and includes the local table name in the warning message so the user knows which table is inaccessible.

Signed-off-by: Anway Durge <124391429+itzanway@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

External Contribution All PRs from entities outside of MariaDB Foundation, Corporation, Codership agreements.

Development

Successfully merging this pull request may close these issues.

5 participants