
Issue 5755 change advisory lock to try advisory lock #5809

Open
onno-vos-dev wants to merge 1 commit into prisma:main from onno-vos-dev:issue-5755-change-advisory-lock-to-try-advisory-lock
@onno-vos-dev onno-vos-dev commented May 6, 2026


Full disclosure: this code is heavily AI-assisted, but given my detailed description it would have ended up in a similar state either way.


I noticed that migrations run with Prisma can sometimes deadlock when more than one instance tries to run them at the same time. The current code tries to mitigate this with an advisory lock, as can be seen here:

crosstarget_utils::time::timeout(
    ADVISORY_LOCK_TIMEOUT,
    connection.raw_cmd("SELECT pg_advisory_lock(72707369)"),
)

Unfortunately, the ADVISORY_LOCK_TIMEOUT

const ADVISORY_LOCK_TIMEOUT: time::Duration = time::Duration::from_secs(10);

still leaves room for a race condition: the second instance of a service blocks waiting on the advisory lock for up to 10 seconds, which can lead to a deadlock.

As per the PostgreSQL documentation (https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADVISORY-LOCKS), there are two relevant ways to obtain an advisory lock: pg_advisory_lock, which is currently in use, and pg_try_advisory_lock, which (IMHO) is the better choice because it returns immediately with a boolean false (f) if it fails to obtain the lock. That essentially mitigates the risk of race conditions and deadlocks altogether.

The proposal is to change from pg_advisory_lock to pg_try_advisory_lock and to replace the timeout-error handling with a simple boolean short-circuit when the lock cannot be obtained, as observed here:

connector.acquire_lock().await?;
connector
    .migration_persistence()
    .initialize(namespaces, input.filters.into())
    .await?;

and to log and return at that point that another instance must already be running the migration.

pg_advisory_lock ( key bigint ) → void
pg_advisory_lock ( key1 integer, key2 integer ) → void
Obtains an exclusive session-level advisory lock, waiting if necessary.
...
pg_try_advisory_lock ( key bigint ) → boolean
pg_try_advisory_lock ( key1 integer, key2 integer ) → boolean
Obtains an exclusive session-level advisory lock if available. This will either obtain the lock immediately and return true, or return false without waiting if the lock cannot be acquired immediately.
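
The short-circuit proposed above can be sketched roughly as follows. This is a minimal illustration, not the prisma-engines code: `AdvisoryLocks`, `LockError`, `MIGRATION_LOCK_KEY` and `acquire_migration_lock` are hypothetical names, and an in-memory set stands in for the Postgres lock table so the control flow can be run without a database.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the advisory-lock table of a Postgres server.
/// Real code would send `SELECT pg_try_advisory_lock(<key>)` over a connection.
/// Simplification: every caller is treated as a different session, so the
/// re-entrant behaviour of real session-level locks is not modelled.
#[derive(Default)]
pub struct AdvisoryLocks {
    held: HashSet<i64>,
}

impl AdvisoryLocks {
    /// Mimics `pg_try_advisory_lock`: returns true if the key was free and is
    /// now held, or false right away if it is already held. It never waits.
    pub fn try_advisory_lock(&mut self, key: i64) -> bool {
        self.held.insert(key)
    }

    /// Mimics `pg_advisory_unlock`: returns true if the key was held.
    pub fn advisory_unlock(&mut self, key: i64) -> bool {
        self.held.remove(&key)
    }
}

/// Illustrative error type; the real connector has its own error type.
#[derive(Debug, PartialEq)]
pub enum LockError {
    /// Another instance already holds the migration lock.
    Contended { key: i64 },
}

/// Lock key quoted in the PR description.
pub const MIGRATION_LOCK_KEY: i64 = 72707369;

/// Boolean short-circuit: a `false` from the try-lock becomes an immediate
/// error instead of a wait that may later time out.
pub fn acquire_migration_lock(locks: &mut AdvisoryLocks) -> Result<(), LockError> {
    if locks.try_advisory_lock(MIGRATION_LOCK_KEY) {
        Ok(())
    } else {
        Err(LockError::Contended { key: MIGRATION_LOCK_KEY })
    }
}
```

Calling `acquire_migration_lock` twice on the same `AdvisoryLocks` value returns `Ok(())` the first time and `Err(LockError::Contended { .. })` the second time, without any waiting in between.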

One can reproduce the failure on a local Postgres quite easily by following the steps below:

-- Shell 1:
CREATE TABLE derp (fname text);

-- Shell 1:
SELECT pg_advisory_lock(72707369);

-- Shell 2:
SELECT pg_advisory_lock(72707369); -- This will hang

-- Shell 1:
CREATE INDEX CONCURRENTLY idx_fname_derp ON derp (fname);

This produces the following logs:

2026-02-05 10:42:23.375 UTC [18224] ERROR:  deadlock detected
2026-02-05 10:42:23.375 UTC [18224] DETAIL:  Process 18224 waits for ShareLock on virtual transaction 6/1259; blocked by process 18225.
	Process 18225 waits for ExclusiveLock on advisory lock [13757,0,72707369,1]; blocked by process 18224.
	Process 18224: CREATE INDEX CONCURRENTLY idx_fname_derp ON derp (fname);
	Process 18225: SELECT pg_advisory_lock(72707369);
2026-02-05 10:42:23.375 UTC [18224] HINT:  See server log for query details.
2026-02-05 10:42:23.375 UTC [18224] STATEMENT:  CREATE INDEX CONCURRENTLY idx_fname_derp ON derp (fname);
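
The DETAIL lines above describe a cycle of two waiters: process 18224 waits on 18225, and 18225 waits on 18224. As an illustration only (this is not how PostgreSQL's deadlock detector is implemented), the same situation can be modelled as a wait-for graph in which each process waits on at most one other process. `has_deadlock` is a hypothetical helper.

```rust
use std::collections::{HashMap, HashSet};

/// Returns true if following "waits for" edges from `start` leads back to
/// `start`. `waits_for` maps a process id to the process id blocking it.
pub fn has_deadlock(waits_for: &HashMap<u32, u32>, start: u32) -> bool {
    let mut seen = HashSet::new();
    let mut current = start;
    while let Some(&blocker) = waits_for.get(&current) {
        if blocker == start {
            // We walked around a cycle that contains `start`.
            return true;
        }
        if !seen.insert(blocker) {
            // We are looping through a cycle that does not contain `start`.
            return false;
        }
        current = blocker;
    }
    // The chain ended at a process that is not waiting on anything.
    false
}
```

With both edges from the log (18224 waits on 18225, 18225 waits on 18224) the helper reports a cycle. If the second session used a non-blocking try-lock, it would never wait on the advisory lock, so the 18225 to 18224 edge would not exist and this particular cycle could not form.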

CLAassistant commented May 6, 2026


CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

coderabbitai Bot commented May 6, 2026


Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Converted the PostgreSQL migration locking mechanism from a blocking advisory lock with timeout to a non-blocking pg_try_advisory_lock approach. When the lock cannot be acquired, an error is returned immediately instead of waiting. Supporting imports were adjusted, the WASM module import was cleaned up, and tests were refactored to handle the new error result patterns without unwrap calls.

Changes

Advisory Lock Non-Blocking Refactor

| Layer / File(s) | Summary |
| --- | --- |
| Core Lock Implementation: schema-engine/connectors/sql-schema-connector/src/flavour/postgres.rs | Updated imports by removing time-related code and adding url::Url and DatabaseSchemaInconsistent. Replaced blocking pg_advisory_lock with timeout with non-blocking pg_try_advisory_lock; the acquire_lock function now interprets the boolean result and returns an error immediately when the lock cannot be acquired instead of timing out. |
| WASM Module Import Cleanup: schema-engine/connectors/sql-schema-connector/src/flavour/postgres/connector/wasm/mod.rs | Removed the now-unused ADVISORY_LOCK_TIMEOUT import following the removal of timeout-based locking. |
| Test Updates: schema-engine/sql-migration-tests/tests/migrations/advisory_locking.rs | Added Queryable import; refactored three concurrent test branches to use `.map( |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: replacing pg_advisory_lock with pg_try_advisory_lock to address deadlock issues. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description clearly relates to the changeset, explaining the motivation (deadlock prevention), the technical problem (pg_advisory_lock blocking), the proposed solution (pg_try_advisory_lock), and the implementation approach. |


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@schema-engine/connectors/sql-schema-connector/src/flavour/postgres.rs`:
- Around line 383-386: The current parsing of pg_try_advisory_lock result (the
let acquired = ... expression) uses unwrap_or(false), which treats parsing
failures as "lock not acquired"; change this to explicitly distinguish parsing
errors from a contended lock by: attempt to parse the boolean from
result.first().and_then(|row| row.at(0).and_then(|v| v.as_bool())) and if None
or a parsing error occurs, return/propagate a distinct error (e.g., a
parse/driver error) instead of treating it as false; only treat a successfully
parsed false as "lock contended". Ensure the code paths and error messages that
use the acquired variable (the parsing expression and the subsequent contention
check around pg_try_advisory_lock) are updated to handle and surface the parse
error separately.
- Around line 378-395: Extract the hardcoded advisory lock id (72707369) into a
named const (e.g., ADVISORY_LOCK_ID) and use that const when constructing the
SQL passed to connection.query_raw and when composing the
ConnectorError::from_msg string (and update the nearby comment to reference the
const as well); ensure the SQL string is formatted to include the const value
and the error message is built with the same const so the id is not duplicated
in literals anywhere in this block (look for connection.query_raw(...) and
ConnectorError::from_msg(...) to apply the change).

In `@schema-engine/sql-migration-tests/tests/migrations/advisory_locking.rs`:
- Around line 100-138: The test postgres_advisory_lock_contention_fails_fast
currently may leak the advisory lock if unwrap_err() or the assertion panics;
wrap the dedicated connection/lock_holder in a small RAII guard (e.g.,
LockHolderGuard) that holds the Quaint connection and issues "SELECT
pg_advisory_unlock(72707369)" in its Drop implementation so the unlock always
runs, replace the explicit tok(lock_holder.raw_cmd(...)) call with constructing
the guard, and relax the assertion by checking for the stable parts of the error
(e.g., "Another instance" plus the production constant if available such as an
ADVISORY_LOCK message constant in postgres.rs, otherwise keep a minimal
substring) instead of coupling to the literal "pg_try_advisory_lock".
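
Two of the points above (separating a parse failure from a contended lock, and releasing the lock from a guard's `Drop`) can be sketched together. This is a minimal illustration under stated assumptions, not the code in the PR: `TryLockOutcome`, `interpret_try_lock` and `RecordingConnection` are hypothetical, and the connection only records the SQL it is given instead of talking to Postgres. `LockHolderGuard` is the name suggested in the comment.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Outcome of a successfully parsed `pg_try_advisory_lock` result.
#[derive(Debug, PartialEq)]
pub enum TryLockOutcome {
    Acquired,
    Contended,
}

/// `value` is `None` when the row or column was missing or was not a boolean.
/// That case becomes an error instead of being folded into "not acquired",
/// which is what `unwrap_or(false)` would do.
pub fn interpret_try_lock(value: Option<bool>) -> Result<TryLockOutcome, String> {
    match value {
        Some(true) => Ok(TryLockOutcome::Acquired),
        Some(false) => Ok(TryLockOutcome::Contended),
        None => Err("unexpected result shape from pg_try_advisory_lock".to_owned()),
    }
}

/// Hypothetical stand-in for a database connection: it only records SQL.
#[derive(Default)]
pub struct RecordingConnection {
    pub executed: RefCell<Vec<String>>,
}

impl RecordingConnection {
    pub fn raw_cmd(&self, sql: &str) {
        self.executed.borrow_mut().push(sql.to_owned());
    }
}

/// Takes the lock on creation and releases it when dropped, which also
/// happens while unwinding from a failed assertion.
pub struct LockHolderGuard {
    conn: Rc<RecordingConnection>,
    key: i64,
}

impl LockHolderGuard {
    pub fn acquire(conn: Rc<RecordingConnection>, key: i64) -> Self {
        conn.raw_cmd(&format!("SELECT pg_advisory_lock({})", key));
        LockHolderGuard { conn, key }
    }
}

impl Drop for LockHolderGuard {
    fn drop(&mut self) {
        self.conn
            .raw_cmd(&format!("SELECT pg_advisory_unlock({})", self.key));
    }
}
```

In the actual test the unlock is an async call, so a real guard would have to drive it to completion inside `Drop`, for instance through the test's existing `tok` helper. That part is an assumption and is not shown here.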

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ce5a656e-4499-4de0-bab8-f45c81e38fe3

📥 Commits

Reviewing files that changed from the base of the PR and between 3c6e192 and 3315714.

📒 Files selected for processing (3)
  • schema-engine/connectors/sql-schema-connector/src/flavour/postgres.rs
  • schema-engine/connectors/sql-schema-connector/src/flavour/postgres/connector/wasm/mod.rs
  • schema-engine/sql-migration-tests/tests/migrations/advisory_locking.rs

Comment thread schema-engine/connectors/sql-schema-connector/src/flavour/postgres.rs Outdated

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@schema-engine/connectors/sql-schema-connector/src/flavour/postgres.rs`:
- Around line 391-397: The error message in the ConnectorError creation within
the postgres.rs file hardcodes the lock ID value `72707369` instead of using the
`MIGRATE_ADVISORY_LOCK_ID` constant that is already used elsewhere in the same
file. Replace the hardcoded `72707369` string in the error message with a
reference to the `MIGRATE_ADVISORY_LOCK_ID` constant to maintain consistency
across the codebase.
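
A small sketch of what the comment asks for: building both the SQL and the error text from one constant so the literal appears only once. The constant name and value come from the comment. The helper functions `try_lock_sql` and `contention_message`, and the exact message wording, are hypothetical and not taken from the PR.

```rust
/// Lock id and constant name as quoted in the review comment.
pub const MIGRATE_ADVISORY_LOCK_ID: i64 = 72707369;

/// Builds the try-lock statement from the constant.
pub fn try_lock_sql() -> String {
    format!("SELECT pg_try_advisory_lock({})", MIGRATE_ADVISORY_LOCK_ID)
}

/// Builds the contention message from the same constant.
/// The wording is illustrative only.
pub fn contention_message() -> String {
    format!(
        "Another instance appears to hold advisory lock {}; not running migrations.",
        MIGRATE_ADVISORY_LOCK_ID
    )
}
```

Changing the lock id then means changing one line, and the SQL and the message cannot drift apart.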

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: f63404af-35f2-40e2-b702-abb9b5908b39

📥 Commits

Reviewing files that changed from the base of the PR and between 3f00f18 and d3ccc56.

📒 Files selected for processing (1)
  • schema-engine/connectors/sql-schema-connector/src/flavour/postgres.rs

@onno-vos-dev onno-vos-dev force-pushed the issue-5755-change-advisory-lock-to-try-advisory-lock branch from d3ccc56 to 77cc896 Compare June 16, 2026 14:31
… concurrency issues when multiple nodes compete for the same advisory lock
@onno-vos-dev onno-vos-dev force-pushed the issue-5755-change-advisory-lock-to-try-advisory-lock branch from 77cc896 to 04768ef Compare June 16, 2026 14:33

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@schema-engine/sql-migration-tests/tests/migrations/advisory_locking.rs`:
- Around line 128-131: Add crosstarget-utils as a workspace dependency to
schema-engine/sql-migration-tests/Cargo.toml to enable timeout functionality.
Then wrap the await on the
me.apply_migrations(&migrations_directory).send().await call with a short async
timeout using the repository's established timeout pattern from
crosstarget-utils. This ensures that if the code regresses to blocking behavior
on the advisory lock, the test will fail quickly at the regression point rather
than hanging until the suite-level timeout.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 620861b6-5745-4509-92f1-c3fdcf7f9158

📥 Commits

Reviewing files that changed from the base of the PR and between d3ccc56 and 77cc896.

📒 Files selected for processing (3)
  • schema-engine/connectors/sql-schema-connector/src/flavour/postgres.rs
  • schema-engine/connectors/sql-schema-connector/src/flavour/postgres/connector/wasm/mod.rs
  • schema-engine/sql-migration-tests/tests/migrations/advisory_locking.rs

Comment on lines +128 to +131
let err = tok(async { me.apply_migrations(&migrations_directory).send().await }).unwrap_err();

// Release the lock so the test database teardown is not blocked.
tok(lock_holder.raw_cmd("SELECT pg_advisory_unlock(72707369)")).unwrap();


🧹 Nitpick | 🔵 Trivial

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Read-only check for existing async timeout patterns and Tokio time feature availability.
# Expectation: Either existing Rust tests already use tokio::time::timeout/time::timeout,
# or Cargo.toml enables Tokio's `time`/`full` feature for the relevant test workspace.

rg -n -C3 'tokio::time::timeout|time::timeout' --type=rust
rg -n -C3 'tokio.*features|features\s*=\s*\[[^]]*(time|full)' Cargo.toml

Repository: prisma/prisma-engines

Length of output: 2922


🏁 Script executed:

# Check the specific Cargo.toml for the test workspace
find . -path "*/schema-engine/sql-migration-tests/Cargo.toml" -type f

Repository: prisma/prisma-engines

Length of output: 112


🏁 Script executed:

# Look at the test file context around lines 128-131
sed -n '120,140p' schema-engine/sql-migration-tests/tests/migrations/advisory_locking.rs

Repository: prisma/prisma-engines

Length of output: 835


🏁 Script executed:

# Check if crosstarget_utils is a dependency of the test workspace
rg -n 'crosstarget' schema-engine/sql-migration-tests/Cargo.toml

Repository: prisma/prisma-engines

Length of output: 47


🏁 Script executed:

# Check the tokio dependency in the test workspace's Cargo.toml
cat schema-engine/sql-migration-tests/Cargo.toml | grep -A 5 '^\[dependencies\]' | head -50

Repository: prisma/prisma-engines

Length of output: 234


🏁 Script executed:

# Also check for any dev-dependencies that might include timeout utilities
rg -n 'tokio|timeout' schema-engine/sql-migration-tests/Cargo.toml -A 2

Repository: prisma/prisma-engines

Length of output: 155


🏁 Script executed:

# Check the workspace-level tokio definition in the root Cargo.toml
rg -n 'tokio.*features' Cargo.toml | head -5

Repository: prisma/prisma-engines

Length of output: 116


🏁 Script executed:

# Get more context around the tokio dependency in the workspace
rg -n 'tokio\s*=' Cargo.toml -A 3 | head -20

Repository: prisma/prisma-engines

Length of output: 270


🏁 Script executed:

# Check what workspace members exist and if crosstarget_utils is available
fd -t f 'Cargo.toml' | grep -E '(workspace|crosstarget)' | head -10

Repository: prisma/prisma-engines

Length of output: 99


🏁 Script executed:

# Check if crosstarget_utils is available as a workspace dependency
rg -n 'crosstarget' Cargo.toml | head -20

Repository: prisma/prisma-engines

Length of output: 121


🏁 Script executed:

# Look for how other tests handle timeouts or async waits
rg -n 'tokio::time|timeout|Duration' schema-engine/sql-migration-tests/tests/ -t rust | head -20

Repository: prisma/prisma-engines

Length of output: 3082


🏁 Script executed:

# Check the crosstarget_utils timeout implementation
cat libs/crosstarget-utils/src/native/time.rs

Repository: prisma/prisma-engines

Length of output: 858


🏁 Script executed:

# Check if crosstarget_utils is already a dependency of sql-schema-connector or related packages
rg -n 'crosstarget' schema-engine/ -t toml | grep -v test

Repository: prisma/prisma-engines

Length of output: 381


🏁 Script executed:

# Check how the test file imports and uses utilities
head -50 schema-engine/sql-migration-tests/tests/migrations/advisory_locking.rs

Repository: prisma/prisma-engines

Length of output: 1698


Use crosstarget_utils timeout for fail-fast regression detection.

Line 128 holds the conflicting advisory lock and then awaits apply_migrations with no deadline; if this regresses back to a blocking lock, the test hangs until the suite-level timeout instead of failing at the regression point. Wrap only this await in a short async timeout.

To implement this, add crosstarget-utils as a dependency to the test workspace and wrap the await using the repository's established timeout pattern:

🧪 Proposed test hardening
-    let err = tok(async { me.apply_migrations(&migrations_directory).send().await }).unwrap_err();
+    let err = tok(async {
+        crosstarget_utils::time::timeout(
+            std::time::Duration::from_secs(5),
+            me.apply_migrations(&migrations_directory).send(),
+        )
+        .await
+        .expect("apply_migrations did not fail fast while the advisory lock was held")
+    })
+    .unwrap_err();

Also add to schema-engine/sql-migration-tests/Cargo.toml:

crosstarget-utils.workspace = true
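
The fail-fast idea behind this suggestion can also be shown without any project dependencies. The sketch below swaps in a different technique from the proposed `crosstarget_utils::time::timeout`: it runs a blocking closure on a helper thread and waits on a standard-library channel with a deadline. `run_with_deadline` is a hypothetical helper, not something from the repository.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `op` on a helper thread and waits at most `limit` for its result.
/// Returns `None` if the deadline passes first or if `op` panics.
/// The helper thread is not cancelled; it is detached and keeps running.
pub fn run_with_deadline<T, F>(limit: Duration, op: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails only if the receiver was dropped after a timeout,
        // in which case nobody is waiting for the value any more.
        let _ = tx.send(op());
    });
    rx.recv_timeout(limit).ok()
}
```

A test built this way would treat `None` as "the call blocked instead of failing fast" and fail right there, rather than hanging until a suite-level timeout.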
