Skip to content

Perf: buffer to avoid expensive fancy indexing for dense data#239

Open
selmanozleyen wants to merge 8 commits into
scverse:mainfrom
selmanozleyen:perf/buffer-to-avoid-expensive-fancy-indexing
Open

Perf: buffer to avoid expensive fancy indexing for dense data#239
selmanozleyen wants to merge 8 commits into
scverse:mainfrom
selmanozleyen:perf/buffer-to-avoid-expensive-fancy-indexing

Conversation

@selmanozleyen

@selmanozleyen selmanozleyen commented Jun 16, 2026

Copy link
Copy Markdown
Member

For the case of dense data (like in genomic data) and when the feature size is big. I noticed that when bs=1, cs=1, and preload_nchunks=120, lot's of time is being spent on in_memory_data[split] because creates a copy of the row instead of just selecting that row as a view. We can have inplace indexing if we have a buffer using np.take(out=buffer). Would also solve #235

I will give real life data once I run this branch

@selmanozleyen selmanozleyen added the run-gpu-ci Signal that gpu ci should be run label Jun 16, 2026
Comment thread src/annbatch/loader.py
use_pinned=self._preload_to_gpu,
)
in_memory_data = self._dense_split_buffer[:needed_len]
self._np_module.take(

@ilan-gold ilan-gold Jun 16, 2026

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I am not 100% sure this a safe operation on the GPU because AFAIK, operations happen asynchronously. Thus you may hit this line while your model is fitting on a batch derived from in_memory_data but you are then overriding in_memory_data. #105 It may make sense to have a pool

@selmanozleyen selmanozleyen Jun 16, 2026

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, you are right. But how does a pool solve this? How can we know if the model is done with that data? Isn't copying here our only option? in_memory_data[slice(start, end)].copy()

@ilan-gold ilan-gold Jun 16, 2026

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I guess that would be the only way. Althought that is a good point, the normal indexing on the GPU may copy without .copy(). I am not sure. I hadn't considered that - it might be worth checking.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pool was just spitballing.

@codecov

codecov Bot commented Jun 16, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 93.55%. Comparing base (eefa63c) to head (93a628b).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #239      +/-   ##
==========================================
+ Coverage   93.48%   93.55%   +0.06%     
==========================================
  Files          15       15              
  Lines        1397     1412      +15     
==========================================
+ Hits         1306     1321      +15     
  Misses         91       91              
Files with missing lines Coverage Δ
src/annbatch/loader.py 91.11% <100.00%> (+0.31%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

run-gpu-ci Signal that gpu ci should be run

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants