Fixes #29257: prevent endless pending-migration loop on latin1 databases#29258
Fixes #29257: prevent endless pending-migration loop on latin1 databases#29258govardhankarne1 wants to merge 1 commit into
Conversation
…latin1 databases On a MySQL database whose default charset is latin1, the migration tracking tables SERVER_CHANGE_LOG / SERVER_MIGRATION_SQL_LOGS are created without an explicit charset and inherit latin1. A 1.13.0 migration statement carries a non-latin1 character in its leading comment; Flyway keeps the comment in the statement text, latin1 cannot store it, and the INSERT that records the statement into SERVER_MIGRATION_SQL_LOGS fails. That failure was swallowed silently, so the statement was never recorded and the server reported pending migrations on every boot - an endless loop that migrate --force did not fix. - MigrationWorkflow: normalize the two tracking tables to utf8mb4 on MySQL before running migrations (no-op when already utf8mb4; no-op on PostgreSQL). - flyway v000__create_server_change_log.sql: pin DEFAULT CHARSET=utf8mb4 so fresh installs create the tables correctly. - MigrationProcessImpl: log a warning instead of silently swallowing a failed statement-recording insert, so the cause is diagnosable. - Tests for the charset decision and for the non-silent recording failure. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
❌ PR checklist incompleteThis PR cannot be merged until the following are addressed on its linked issue:
The fields live on the linked issue in the Shipping project (open the issue → right sidebar → Projects). After you set them, re-run this check (or push a commit) — issue/project changes do not re-trigger it automatically. Maintainers can bypass this check by adding the |
|
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help! |
| private void ensureMigrationTrackingTablesUseUtf8mb4() { | ||
| if (connectionType == ConnectionType.MYSQL) { | ||
| try (Handle handle = jdbi.open()) { | ||
| for (final String tableName : MIGRATION_TRACKING_TABLES) { | ||
| convertTableToUtf8mb4IfNeeded(handle, tableName); | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| private void convertTableToUtf8mb4IfNeeded(final Handle handle, final String tableName) { | ||
| final String currentCharset = getTableCharset(handle, tableName); | ||
| if (needsUtf8mb4Conversion(currentCharset)) { | ||
| LOG.info( | ||
| "Converting migration table {} from charset '{}' to utf8mb4", tableName, currentCharset); |
There was a problem hiding this comment.
⚠️ Edge Case: Charset conversion failure aborts all migrations at startup
ensureMigrationTrackingTablesUseUtf8mb4() is the first statement in runMigrationWorkflows(), and neither it nor convertTableToUtf8mb4IfNeeded() wraps the ALTER TABLE ... CONVERT TO CHARACTER SET utf8mb4 in a try/catch. If the ALTER throws (the most realistic case is SERVER_MIGRATION_SQL_LOGS, whose PRIMARY KEY checksum VARCHAR(256) becomes 256*4=1024 bytes after conversion to utf8mb4 — this exceeds the 767-byte index limit on tables stored with the legacy COMPACT/REDUNDANT row format that pre-8.0 latin1 installs may still carry), the exception propagates and aborts the entire migration workflow before any migration runs. That is a regression versus the current behavior, where a latin1 install at least boots (looping on pending migrations). Since the explicit goal of this method is to self-heal existing installs, an unexpected conversion failure should be logged and the workflow should continue rather than hard-failing startup.
Catch and log conversion failures so a self-heal attempt cannot block server startup.:
private void convertTableToUtf8mb4IfNeeded(final Handle handle, final String tableName) {
final String currentCharset = getTableCharset(handle, tableName);
if (needsUtf8mb4Conversion(currentCharset)) {
LOG.info(
"Converting migration table {} from charset '{}' to utf8mb4", tableName, currentCharset);
try {
handle.execute(String.format("ALTER TABLE %s CONVERT TO CHARACTER SET utf8mb4", tableName));
} catch (RuntimeException conversionFailure) {
LOG.warn(
"Failed to convert migration table {} to utf8mb4; migrations may still fail to record",
tableName, conversionFailure);
}
}
}
- Apply fix
Check the box to apply the fix or reply for a change | Was this helpful? React with 👍 / 👎
| migrationDAO.upsertServerMigrationSQL(version, sql, hash(sql)); | ||
| } catch (Exception logException) { | ||
| // If logging fails (table doesn't exist yet), continue - the SQL was executed | ||
| // successfully | ||
| // SERVER_MIGRATION_SQL_LOGS may legitimately not exist yet on the very first | ||
| // migration. In any other case a failure here (e.g. a table charset that cannot | ||
| // store the statement text) leaves the migration unrecorded, so the server keeps | ||
| // reporting it as pending on every boot. Surface it instead of failing silently. | ||
| LOG.warn( | ||
| "Failed to record migration statement for version {} in SERVER_MIGRATION_SQL_LOGS" | ||
| + " (checksum {}); it may be re-detected as pending", | ||
| version, | ||
| hash(sql), | ||
| logException); |
There was a problem hiding this comment.
💡 Quality: hash(sql) recomputed in the recording-failure log path
In the catch block, hash(sql) is computed a second time (the first computation was passed to upsertServerMigrationSQL) just to include the checksum in the WARN message. hash() is a digest computation; recomputing it on the error path is wasteful and risks drift if the hashing logic ever changes. Compute the hash once before the try and reuse the local variable in both the upsert call and the log statement.
Fix:
String checksum = hash(sql);
try {
migrationDAO.upsertServerMigrationSQL(version, sql, checksum);
} catch (Exception logException) {
LOG.warn(
"Failed to record migration statement for version {} in SERVER_MIGRATION_SQL_LOGS"
+ " (checksum {}); it may be re-detected as pending",
version,
checksum,
logException);
}
- Apply fix
Check the box to apply the fix or reply for a change | Was this helpful? React with 👍 / 👎
Code Review
|
| Compact |
|
Was this helpful? React with 👍 / 👎 | Gitar
| private void ensureMigrationTrackingTablesUseUtf8mb4() { | ||
| if (connectionType == ConnectionType.MYSQL) { | ||
| try (Handle handle = jdbi.open()) { | ||
| for (final String tableName : MIGRATION_TRACKING_TABLES) { | ||
| convertTableToUtf8mb4IfNeeded(handle, tableName); | ||
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Unhandled exception aborts all migrations on charset-check failure.
ensureMigrationTrackingTablesUseUtf8mb4 has no try-catch, so any unexpected failure — e.g., a transient DB connectivity hiccup, a MySQL user that lacks SELECT on information_schema, or an ALTER TABLE the DB refuses — propagates directly out of runMigrationWorkflows and stops every migration from running. Given that this check is a self-healing guard rail, logging the error and continuing (or wrapping it so only a human-readable message surfaces) might be more resilient. If keeping it as a hard failure, a descriptive exception message would help operators diagnose the abort without reading through stack traces.
| checksum VARCHAR(256) PRIMARY KEY, | ||
| executedAt TIMESTAMP DEFAULT CURRENT_TIMESTAMP | ||
| ); No newline at end of file | ||
| ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci; No newline at end of file |
There was a problem hiding this comment.
The file is missing a trailing newline on the last line. This is inconsistent with standard POSIX text-file convention and causes noisy git diffs. Adding a newline after the final
; would align with the rest of the repo's SQL files.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Describe your changes:
Fixes #29257
On a MySQL database whose default charset is
latin1, the migration tracking tablesSERVER_CHANGE_LOG/SERVER_MIGRATION_SQL_LOGSare created (flyway v000) without an explicit charset and inheritlatin1. A 1.13.0 migration statement carries a non-latin1 character (U+2194) in its leading comment; Flyway keeps the comment in the statement text,latin1can't store it, and the INSERT that records the statement intoSERVER_MIGRATION_SQL_LOGSfails. That failure was swallowed silently, so the statement was never recorded and the server reported pending migrations on every boot — an endless loop thatmigrate --forcedid not fix.Changes:
MigrationWorkflow: before running migrations on MySQL, normalize the two tracking tables toutf8mb4(no-op if already utf8mb4; no-op on PostgreSQL). Self-heals existing latin1 installs.v000__create_server_change_log.sql: pinDEFAULT CHARSET=utf8mb4so fresh installs create the tables correctly.MigrationProcessImpl: log a warning instead of silently swallowing a failed statement-recording insert, so this is diagnosable instead of invisible.Testing:
MigrationTrackingTableCharsetTest: charset decision (latin1/utf8mb3 -> convert; utf8mb4/unknown -> skip).MigrationProcessImplTest.warnsButDoesNotAbortWhenRecordingMigrationStatementFails: a recording failure is logged at WARN and does not abort the migration.Related to #28255 (same MySQL-charset-on-upgrade theme; this PR covers the migration-tracking tables and the silent recording failure rather than
openmetadata_db/FK charset).Type of change:
Checklist:
Fixes <issue-number>: <short explanation>Greptile Summary
This PR fixes an endless pending-migration loop on MySQL databases whose default charset is
latin1. The tracking tables (SERVER_CHANGE_LOG/SERVER_MIGRATION_SQL_LOGS) previously inheritedlatin1, which cannot store non-latin1 characters found in certain migration statement comments; the resulting failed INSERT was swallowed silently, leaving migrations perpetually re-detected as pending.v000__create_server_change_log.sql: pins both tracking tables toENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci, consistent with every other MySQL DDL in the repo, fixing the issue for fresh installs.MigrationWorkflow: addsensureMigrationTrackingTablesUseUtf8mb4()called at the top ofrunMigrationWorkflows, which detects and converts existing latin1/utf8mb3 tracking tables to utf8mb4 usinginformation_schema(no-op on PostgreSQL and already-utf8mb4 tables).MigrationProcessImpl: replaces the silent catch block onupsertServerMigrationSQLfailures with aLOG.warnthat includes version, checksum, and the exception, making previously invisible recording failures diagnosable in logs.Confidence Score: 4/5
The change is well-scoped and directly addresses the described bug; the self-healing ALTER TABLE is safe to run on existing installs, the DDL change is consistent with the rest of the codebase, and both new code paths have unit test coverage.
The charset-detection and conversion path in MigrationWorkflow has no exception handling: any unexpected failure — a transient DB issue, a permission gap, or an ALTER TABLE refusal — propagates out of runMigrationWorkflows and stops all migrations from executing. This could make an upgrade worse on some installs than the original bug it fixes. Everything else in the PR is clean.
openmetadata-service/src/main/java/org/openmetadata/service/migration/api/MigrationWorkflow.java — specifically the ensureMigrationTrackingTablesUseUtf8mb4 method and its lack of exception handling.
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%% flowchart TD A[runMigrationWorkflows] --> B{connectionType == MYSQL?} B -- No --> D[Skip charset check] B -- Yes --> C[Open JDBI Handle] C --> E[For each tracking table] E --> F[getTableCharset via information_schema] F --> G{charset == null?} G -- Yes: table doesn't exist --> H[Skip — no-op] G -- No --> I{startsWith 'utf8mb4'?} I -- Yes: already correct --> J[Skip — no-op] I -- No: latin1 or utf8mb3 --> K[ALTER TABLE CONVERT TO CHARACTER SET utf8mb4] K --> L{Exception?} L -- Yes --> M[Propagates — aborts all migrations] L -- No --> N[Charset fixed] D --> O[Run migrations] H --> O J --> O N --> O O --> P[performSqlExecutionAndUpdate] P --> Q[handle.execute SQL] Q --> R[upsertServerMigrationSQL] R --> S{Exception?} S -- Yes --> T[LOG.warn + continue] S -- No --> U[QueryStatus SUCCESS] T --> U%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%% flowchart TD A[runMigrationWorkflows] --> B{connectionType == MYSQL?} B -- No --> D[Skip charset check] B -- Yes --> C[Open JDBI Handle] C --> E[For each tracking table] E --> F[getTableCharset via information_schema] F --> G{charset == null?} G -- Yes: table doesn't exist --> H[Skip — no-op] G -- No --> I{startsWith 'utf8mb4'?} I -- Yes: already correct --> J[Skip — no-op] I -- No: latin1 or utf8mb3 --> K[ALTER TABLE CONVERT TO CHARACTER SET utf8mb4] K --> L{Exception?} L -- Yes --> M[Propagates — aborts all migrations] L -- No --> N[Charset fixed] D --> O[Run migrations] H --> O J --> O N --> O O --> P[performSqlExecutionAndUpdate] P --> Q[handle.execute SQL] Q --> R[upsertServerMigrationSQL] R --> S{Exception?} S -- Yes --> T[LOG.warn + continue] S -- No --> U[QueryStatus SUCCESS] T --> UReviews (1): Last reviewed commit: "Fixes #29257: prevent endless pending-mi..." | Re-trigger Greptile