perf: don't apply fancy indexing if split can be a slice. #235
selmanozleyen wants to merge 4 commits into
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #235 +/- ##
==========================================
- Coverage 93.48% 91.92% -1.57%
==========================================
Files 15 15
Lines 1397 1399 +2
==========================================
- Hits 1306 1286 -20
- Misses 91 113 +22
Just to be clear, `sel = slice(sel[0], sel[-1] + 1)` then becomes a length-1 slice? And this is faster than fancy indexing? It feels almost like a bug (not on our side).
Yes. The setup is: preload_nchunks=x, batch_size=1, chunk_size=1.
I checked: in NumPy, fancy indexing creates a copy the size of the index, whereas a slice only returns a view. Since each element is 40 MB, even allocating one row can be costly. In short, a slice gives a view and a list gives a copy. It makes sense that NumPy has no special case here, because you would want consistent behaviour.
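To illustrate the view-versus-copy point, here is a minimal sketch using plain NumPy. The array `data` and its tiny shape are made up for illustration and stand in for the real 40 MB rows.

```python
import numpy as np

# Small stand-in for the in-memory data (illustrative shape only).
data = np.arange(12, dtype=np.float64).reshape(4, 3)

as_slice = data[1:2]  # basic indexing with a slice: returns a view
as_fancy = data[[1]]  # fancy indexing with a list: allocates a copy

# Both selections hold the same row, but only the slice shares memory.
print(np.array_equal(as_slice, as_fancy))
print(np.shares_memory(data, as_slice))
print(np.shares_memory(data, as_fancy))
```

The copy made by `data[[1]]` is what becomes expensive when a single row is large.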
In my use case, when `batch_size=1`, `splits` becomes something like `[[0], [1], [2], ...]`. I think each split then becomes a fancy-indexing operation, which can be costly. In my case every row is 40 MB. I want to keep some rows preloaded, but it turns out each `in_memory_data[split]` is costly even though it fetches only one row.
This handles a somewhat more general case: I added a `split[-1] - split[0] == len(split) - 1` check so it returns early. If you think we should avoid this check, we can instead just add a special case for `batch_size=1`.
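The check described above could look roughly like the following sketch. The helper name `as_slice_if_contiguous` is hypothetical and not taken from the PR, and the sketch assumes each split is sorted in strictly increasing order; for unsorted or repeated indices the length test alone can give a false positive.

```python
import numpy as np


def as_slice_if_contiguous(split):
    """Hypothetical helper: return a slice when `split` is a contiguous run.

    Assumes `split` is strictly increasing. Otherwise return it unchanged
    so the caller falls back to fancy indexing.
    """
    if len(split) > 0 and split[-1] - split[0] == len(split) - 1:
        return slice(int(split[0]), int(split[-1]) + 1)
    return split


# Stand-in for the preloaded data (illustrative shape only).
in_memory_data = np.arange(20).reshape(5, 4)

contiguous = as_slice_if_contiguous([2])      # the batch_size=1 case
scattered = as_slice_if_contiguous([0, 3])    # stays a list

row_view = in_memory_data[contiguous]  # slice: no copy
rows_copy = in_memory_data[scattered]  # fancy indexing: copy
```

With `batch_size=1` every split has length 1, so the condition always holds and each lookup becomes a length-1 slice.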