Skip to content

feat(ai): score fast apply code quality#3118

Open
RitwijParmar wants to merge 3 commits into
onlook-dev:mainfrom
RitwijParmar:codex/onlook-apply-quality-assessment
Open

feat(ai): score fast apply code quality#3118
RitwijParmar wants to merge 3 commits into
onlook-dev:mainfrom
RitwijParmar:codex/onlook-apply-quality-assessment

Conversation

@RitwijParmar

@RitwijParmar RitwijParmar commented Jun 10, 2026

Copy link
Copy Markdown

Summary

Closes #3114. Adds an opt-in quality assessment layer for fast apply code changes. The scorer can run before the provider call to estimate risk from the update snippet, then run again on the merged output returned by Morph or Relace.

It checks for:

  • placeholder code left in generated output
  • dropped exported or declared symbols
  • rough syntax delimiter balance
  • instruction coverage
  • high edit density
  • risky browser/runtime patterns

The API exposes assessCodeChange, shouldBlockApply, and applyCodeChangeWithQuality, so the UI can show score/confidence/risk before applying changes without forcing one hard product policy.

Notes

I also typed the Relace response shape and changed existing empty-value fallbacks from || to ?? in the touched apply client.

Testing

  • bun test packages/ai/test/apply-quality.test.ts
  • bunx eslint packages/ai/src/apply/client.ts packages/ai/src/apply/index.ts packages/ai/src/apply/quality.ts packages/ai/test/apply-quality.test.ts
  • bunx tsc --noEmit --pretty false --moduleResolution bundler --module ESNext --target ES2022 --strict --skipLibCheck packages/ai/src/apply/client.ts packages/ai/src/apply/quality.ts packages/ai/test/apply-quality.test.ts

I also tried bun --filter @onlook/ai typecheck, but it fails on existing unrelated repo issues in web-client path aliases and older tests, not in this change.

Summary by CodeRabbit

  • New Features

    • Added a quality-aware apply flow that pre-assesses edits, can block unsafe changes with a clear block reason, and returns a post-apply assessment when edits are applied.
  • Improvements

    • Safer provider response handling and more resilient retry/error behavior with improved fallback and logging.
  • Tests

    • Added tests covering assessment signals, gating behavior, placeholders, missing exports, configurable thresholds, and related edge cases.

Signed-off-by: Ritwij Aryan Parmar <ritwij.aryan.parmar@gmail.com>
@vercel

vercel Bot commented Jun 10, 2026

Copy link
Copy Markdown

@RitwijParmar is attempting to deploy a commit to the Onlook Team on Vercel.

A member of the Team first needs to authorize it.

@vercel vercel Bot temporarily deployed to Preview – docs-onlook June 10, 2026 17:38 Inactive
@vercel

vercel Bot commented Jun 10, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs-onlook Skipped Skipped Jun 11, 2026 7:55pm

Request Review

@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fccee65c-292b-433e-b144-99a80e1fc552

📥 Commits

Reviewing files that changed from the base of the PR and between a567cec and f4b3fd5.

📒 Files selected for processing (2)
  • packages/ai/src/apply/quality.ts
  • packages/ai/test/apply-quality.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/ai/test/apply-quality.test.ts
  • packages/ai/src/apply/quality.ts

📝 Walkthrough

Walkthrough

Adds a new quality assessment module (assessCodeChange, shouldBlockApply), integrates preflight/result assessments into applyCodeChangeWithQuality with optional gating, tweaks provider response handling and apply error retry behavior, re-exports quality APIs, and adds tests covering assessment signals and gating rules.

Changes

Code Quality Assessment Integration

Layer / File(s) Summary
Quality assessment core
packages/ai/src/apply/quality.ts
Defines ApplyCodeChangeRisk, public assessment/option types, placeholder/risky-pattern lists and stop-words; implements assessCodeChange() (edit density, instruction coverage, placeholder/risky detection, missing exported symbol detection, syntax-balance scoring), shouldBlockApply() gating, and helper utilities.
Quality-integrated apply client and provider fixes
packages/ai/src/apply/client.ts, packages/ai/src/apply/index.ts
Adds ApplyCodeChangeQualityOptions and ApplyCodeChangeWithQualityResult exports and applyCodeChangeWithQuality() orchestration (preflight assessment, optional blocking via shouldBlockApply(), call to applyCodeChange, result assessment). Adjusts provider handling: Morph response uses ?? null, Relace response is typed and returns mergedCode ?? null, applyCodeChange records lastError and continues provider attempts before final throw/return; barrel export re-exports ./quality.
Assessment and gating tests
packages/ai/test/apply-quality.test.ts
Adds fixture and tests covering low-risk focused edits, placeholder detection blocking, missing exported symbol detection, ignoring local-symbol drops, placeholder-only-in-input handling, and relaxed gating options for exploratory edits.

Sequence Diagram(s)

sequenceDiagram
  participant caller
  participant applyCodeChangeWithQuality
  participant assessCodeChange
  participant shouldBlockApply
  participant applyCodeChange
  participant resultAssessment
  caller->>applyCodeChangeWithQuality: request apply with quality
  applyCodeChangeWithQuality->>assessCodeChange: preflight assessment
  assessCodeChange-->>applyCodeChangeWithQuality: preflight result
  applyCodeChangeWithQuality->>shouldBlockApply: check gate conditions
  alt blocked by gate
    applyCodeChangeWithQuality-->>caller: return blocked result
  else proceed
    applyCodeChangeWithQuality->>applyCodeChange: apply code change
    applyCodeChange-->>applyCodeChangeWithQuality: applied code
    applyCodeChangeWithQuality->>resultAssessment: assess applied output
    resultAssessment-->>applyCodeChangeWithQuality: result assessment
    applyCodeChangeWithQuality-->>caller: return success with assessments
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

A rabbit hops through heuristic trails,
Counts tokens, scans for risky tales,
Scores the diff and checks the gate,
Pauses code before its fate—
A thoughtful hop keeps changes hale! 🐇✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely summarizes the main change: adding code quality scoring for fast apply code changes.
Description check ✅ Passed The description covers the feature, includes the related issue (#3114), specifies the type of change (new feature), details testing steps, and provides notes on additional improvements.
Linked Issues check ✅ Passed The implementation meets issue #3114 objectives: provides confidence scoring, scope assessment via edit-density, flags problematic code patterns (placeholders, dropped symbols), and gates application without enforcing hard policy.
Out of Scope Changes check ✅ Passed All changes directly support the quality assessment feature: new quality module, client integration, quality-aware apply function, comprehensive tests, and typing improvements align with PR objectives.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint install failed. For unrecoverable errors, disable the tool in CodeRabbit configuration.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/ai/src/apply/client.ts (1)

143-146: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fallback provider is never attempted after first failure.

At Line 145, throwing inside the loop exits immediately, so the second provider in providerAttempts is never tried.

Suggested fix
+    let lastError: unknown = null;
     // Run provider attempts in order of preference
     for (const { provider, applyFn } of providerAttempts) {
         try {
             const result =
                 provider === FastApplyProvider.MORPH
                     ? await (applyFn as typeof applyCodeChangeWithMorph)(
                           originalCode,
                           updateSnippet,
                           instruction,
                       )
                     : await applyFn(originalCode, updateSnippet, instruction, metadata);
             if (result) return result;
         } catch (error) {
             console.warn(`Code application failed with provider ${provider}:`, error);
-            throw error;
+            lastError = error;
+            continue;
         }
     }

+    if (lastError) throw lastError;
     return null;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/ai/src/apply/client.ts` around lines 143 - 146, The catch inside the
providerAttempts loop currently rethrows the error (catch { console.warn(...);
throw error; }) which stops the loop and prevents trying fallback providers;
change the catch to log the error and continue to the next provider (do not
rethrow inside the loop), and after the loop finishes without a successful
result, throw a new aggregated error (or the last error) to surface failure;
update the catch block that references provider and providerAttempts so it only
logs the failure for that provider and allows the loop to try the next one.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/ai/src/apply/quality.ts`:
- Around line 101-103: The current logic computes placeholders and risky
patterns from codeToAssess only (using findPlaceholders and findRiskyPatterns)
and then treats them as generated concerns; instead compute placeholders and
riskyPatterns for both codeToAssess and originalCode, then derive the newly
introduced items by set-difference (newPlaceholders = placeholders(codeToAssess)
- placeholders(originalCode), likewise for riskyPatterns) and use those
newPlaceholders/newRiskyPatterns where the code treats generated concerns (the
sections around scoreSyntaxBalance and the handling at lines ~163-166). For
syntax balance (scoreSyntaxBalance) consider using the delta or only penalize if
the balance worsened relative to originalCode. Ensure you update any variable
names used later (e.g., placeholders, riskyPatterns) to reference the "new" sets
so only newly introduced issues are counted.
- Around line 267-273: The current extractDeclaredSymbols function is too broad
(it picks up local vars and text in comments/strings) and causes false positives
for missingOriginalSymbols; restrict extraction to top-level exported
declarations or switch to an AST-based approach: update extractDeclaredSymbols
to only capture exported/top-level identifiers (e.g., look only for "export"
declarations or parse the file with a JS/TS parser like `@babel/parser` /
TypeScript compiler API to collect top-level exported function/class/const/type
names) and then use that narrower set where missingOriginalSymbols or the
blocking logic (the code that defaults to block on missingOriginalSymbols) is
consulted so local/inline declarations and commented text are ignored. Ensure
references to extractDeclaredSymbols and the blocking check that uses
missingOriginalSymbols are updated accordingly.

---

Outside diff comments:
In `@packages/ai/src/apply/client.ts`:
- Around line 143-146: The catch inside the providerAttempts loop currently
rethrows the error (catch { console.warn(...); throw error; }) which stops the
loop and prevents trying fallback providers; change the catch to log the error
and continue to the next provider (do not rethrow inside the loop), and after
the loop finishes without a successful result, throw a new aggregated error (or
the last error) to surface failure; update the catch block that references
provider and providerAttempts so it only logs the failure for that provider and
allows the loop to try the next one.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1421f9f3-0e0d-4bda-b434-b6ee3537d1a5

📥 Commits

Reviewing files that changed from the base of the PR and between 936b015 and 6409d48.

📒 Files selected for processing (4)
  • packages/ai/src/apply/client.ts
  • packages/ai/src/apply/index.ts
  • packages/ai/src/apply/quality.ts
  • packages/ai/test/apply-quality.test.ts

Comment thread packages/ai/src/apply/quality.ts Outdated
Comment thread packages/ai/src/apply/quality.ts
@vercel vercel Bot temporarily deployed to Preview – docs-onlook June 11, 2026 19:31 Inactive

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/ai/src/apply/quality.ts`:
- Around line 279-280: The export-symbol extraction treats "export default"
declarations as stable named symbols causing false missing-symbol reports;
update the symbol extraction logic that uses the regexes
(/^export\s+(?:default\s+)?(?:async\s+)?(?:function|class|interface|type|enum|const|let|var)\s+([A-Za-z_$][\w$]*)/gm
and the export list regex) to special-case default exports: do not treat "export
default <identifier|expression>" as a named exported symbol for missing-symbol
checks, instead detect "export default" forms and either skip generating a
missing-symbol block for them or map them to a canonical default marker (e.g.,
"default") so renames of internal identifiers (e.g., local function/class names)
do not trigger dropped-symbol warnings; adjust the extraction and downstream
comparison code paths that reference the regex match group (the captured
identifier) to handle this special-case.
- Around line 214-216: The current check uses assessment.blockingConcerns.length
to decide blocking, which ignores gate.blockOnMissingSymbols and therefore
blocks when the only concern is missing-symbols; update the condition so you
first filter assessment.blockingConcerns to remove missing-symbols concerns when
gate.blockOnMissingSymbols is false (e.g. const relevant =
assessment.blockingConcerns.filter(c => !(c.type === 'missing_symbols' &&
!gate.blockOnMissingSymbols'))) and then use relevant.length > 0 in the if that
references gate.blockHighRisk and assessment.blockingConcerns (look for the if
using gate.blockHighRisk and assessment.blockingConcerns in quality.ts).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8d34a9af-3e21-46bd-bc03-12caca4ebc78

📥 Commits

Reviewing files that changed from the base of the PR and between 6409d48 and a567cec.

📒 Files selected for processing (3)
  • packages/ai/src/apply/client.ts
  • packages/ai/src/apply/quality.ts
  • packages/ai/test/apply-quality.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • packages/ai/test/apply-quality.test.ts
  • packages/ai/src/apply/client.ts

Comment thread packages/ai/src/apply/quality.ts Outdated
Comment thread packages/ai/src/apply/quality.ts Outdated
@vercel vercel Bot temporarily deployed to Preview – docs-onlook June 11, 2026 19:55 Inactive
@RitwijParmar

Copy link
Copy Markdown
Author

Checked this again.

The CodeRabbit threads are already addressed in a567cec and f4b3fd.

Focused Bun test passes with 8 tests.

The remaining red status is Vercel auth only.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[feat] AI-generated code quality score — show confidence and diff clarity before applying changes

1 participant