[UR][L0] Diagnose errors in host memory registration API #22370

Open
againull wants to merge 1 commit into intel:sycl from againull:ur_host_mem_reg_diag

@againull

  • The Level Zero driver does not reliably diagnose alignment errors for the external system memory mapping path: an unaligned pointer is reported as ZE_RESULT_ERROR_OUT_OF_HOST_MEMORY and an unaligned size is silently accepted. Validate host-page alignment of the pointer and size directly in the Level Zero adapter.
  • Validate null pointer and zero size, returning UR_RESULT_ERROR_INVALID_VALUE.
  • Reject a registration whose range [pHostMem, pHostMem + size) would overflow the host address space, returning UR_RESULT_ERROR_INVALID_VALUE.
  • Replace the bare assert that the mapped device VA equals the host pointer with a real runtime check: on mismatch, undo the mapping and return an error instead of silently handing back an unusable registration.
  • Handle UR_DEVICE_INFO_USM_HOST_ALLOC_REGISTER_SUPPORT_EXP in the L0 device info query, returning whether the external system memory mapping extension is supported. The enum existed but was previously unhandled.

Assisted-by: Claude
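
For illustration, the argument checks described above could look roughly like the following. This is a minimal sketch, not the adapter's code: `Result`, `validateHostRange`, and the `pageSize` parameter are hypothetical names, and returning the invalid-value result for misaligned input is an assumption (the description names the error code only for the null pointer, zero size, and overflow cases).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for ur_result_t; only two outcomes are modeled.
enum class Result { Success, InvalidValue };

// Hypothetical helper: validates a host range before it is registered.
// pageSize is assumed to be a power of two supplied by the caller; a real
// adapter would query the host page size instead.
Result validateHostRange(const void *pHostMem, std::size_t size,
                         std::size_t pageSize) {
  // Null pointer and zero size are rejected up front.
  if (pHostMem == nullptr || size == 0)
    return Result::InvalidValue;

  const auto addr = reinterpret_cast<std::uintptr_t>(pHostMem);
  const std::uintptr_t mask = pageSize - 1;

  // Both the pointer and the size must be host-page aligned.
  // Assumption: misalignment maps to the same invalid-value result.
  if ((addr & mask) != 0 || (size & mask) != 0)
    return Result::InvalidValue;

  // Reject ranges whose end would not fit in the address type. This
  // conservative form also rejects a range ending exactly at the top of
  // the address space.
  if (size > UINTPTR_MAX - addr)
    return Result::InvalidValue;

  return Result::Success;
}
```

Writing the overflow test as `size > UINTPTR_MAX - addr` avoids computing `addr + size`, which could itself wrap around.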

@againull requested a review from a team as a code owner on June 19, 2026 07:22

@pbalcer left a comment

lgtm, one nit.

Comment on lines +938 to 948

// This extension maps the existing host memory in place, so the mapped
// device virtual address must match the host pointer that was registered.
// If the driver ever returns a different address we cannot honor the
// contract, so undo the mapping and report a failure rather than silently
// handing back an unusable registration.
if (mappedMem != pHostMem) {
  ZE_CALL_NOCHECK(zeMemFree, (hContext->getZeHandle(), mappedMem));
  return UR_RESULT_ERROR_UNSUPPORTED_FEATURE;
}

I think this is a bit overzealous and too defensive. It would be a severe driver bug for this to happen. It's not an error the application can reasonably handle. I think the assert is the correct thing to do here.
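
To make the trade-off concrete, here is a rough side-by-side sketch of the two options under discussion. It is not code from the PR: `CheckResult`, `checkMappedAddressRuntime`, and `checkMappedAddressAssert` are hypothetical names, and the cleanup on mismatch (the `zeMemFree` call in the hunk above) is left to the caller.

```cpp
#include <cassert>

// Hypothetical stand-in for the relevant ur_result_t values.
enum class CheckResult { Success, UnsupportedFeature };

// Runtime-check variant, as in the hunk above: a mismatch becomes an error
// code in every build type, and the caller is expected to undo the mapping.
CheckResult checkMappedAddressRuntime(const void *mappedMem,
                                      const void *pHostMem) {
  return mappedMem == pHostMem ? CheckResult::Success
                               : CheckResult::UnsupportedFeature;
}

// Assert variant, as the review suggests: a mismatch is treated as a driver
// bug. The check aborts in builds with assertions enabled and is compiled
// out when NDEBUG is defined.
CheckResult checkMappedAddressAssert(const void *mappedMem,
                                     const void *pHostMem) {
  assert(mappedMem == pHostMem && "driver must map host memory in place");
  (void)mappedMem; // avoid unused-parameter warnings when NDEBUG is defined
  (void)pHostMem;
  return CheckResult::Success;
}
```

The practical difference is that the assert variant adds no error path for callers to handle, while the runtime variant keeps the check active in release builds.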
