Removing obsolete dataset statuses #1809
Open
ilongin wants to merge 3 commits into
Deploying datachain with

| | |
|---|---|
| Latest commit: | 6674909 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://4afb5c20.datachain-2g6.pages.dev |
| Branch Preview URL: | https://ilongin-1801-remove-obsolete.datachain-2g6.pages.dev |
Codecov Report: ✅ All modified and coverable lines are covered by tests.
shcheklein
approved these changes
Jun 16, 2026
Contributor
Pull request overview
This PR removes obsolete dataset status constants and updates dataset-version cleanup logic/docs to no longer reference the removed statuses.
Changes:
- Removed `PENDING` and `STALE` from `DatasetStatus`.
- Updated “final status” detection to no longer treat `STALE` as final.
- Updated GC/cleanup documentation and query filters to drop `STALE` from the “versions to clean” selection.
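To make the summary above concrete, here is a minimal sketch of what the status set and the final-status check could look like after this change. The numeric values are taken from the diff context quoted in this review; `FINAL_STATUSES` and `is_final_status` are hypothetical names used only for illustration, and the assumption that values 2 and 6 are simply left unused (rather than renumbered) is not confirmed by the PR.

```python
class DatasetStatus:
    # Values mirror those visible in the diff context. PENDING (2) and
    # STALE (6) are assumed to be dropped without renumbering the rest.
    CREATED = 1
    FAILED = 3
    COMPLETE = 4
    REMOVING = 7


# Statuses treated as final once STALE is no longer in the list.
FINAL_STATUSES = frozenset(
    {DatasetStatus.FAILED, DatasetStatus.COMPLETE, DatasetStatus.REMOVING}
)


def is_final_status(status: int) -> bool:
    """Hypothetical helper: True if the status is one of the final ones."""
    return status in FINAL_STATUSES
```

Under this sketch, a row that still carries the legacy value 6 would no longer count as final, which is the situation the suppressed comment further down appears to be concerned with.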
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `src/datachain/dataset.py` | Removes obsolete `DatasetStatus` values and updates final-status detection accordingly. |
| `src/datachain/data_storage/metastore.py` | Updates dataset-version GC docstring and the SQLAlchemy predicate for selecting versions to clean. |
| `src/datachain/catalog/catalog.py` | Updates cleanup API docstring to reflect the new status set. |
Comments suppressed due to low confidence (1)
src/datachain/data_storage/metastore.py:350
- If legacy PENDING/STALE statuses are still supported for GC (to avoid leaking old versions), the docstring should reflect that these statuses may be returned as eligible for cleanup; otherwise readers will assume only CREATED/FAILED/REMOVING are considered.
- Status CREATED, FAILED where either:
- the associated job has finished, or
- there is no associated job (job_id is NULL) and the version is
older than STALE_CREATED_THRESHOLD_HOURS
- Status REMOVING: marked for deletion
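The eligibility rules in the docstring above can be sketched in plain Python as follows. This is an illustration of the described logic, not the metastore's actual query: `eligible_for_cleanup`, its parameters, and the 24-hour threshold value are all assumptions made for the example (only the name `STALE_CREATED_THRESHOLD_HOURS` appears in the source).

```python
from datetime import datetime, timedelta, timezone

# Status values as shown in the diff context of this PR.
CREATED, FAILED, REMOVING = 1, 3, 7

# Assumed value for illustration; the real threshold is not shown here.
STALE_CREATED_THRESHOLD_HOURS = 24


def eligible_for_cleanup(status, job_id, job_finished, created_at, now=None):
    """Hypothetical predicate mirroring the docstring's rules."""
    now = now or datetime.now(timezone.utc)
    if status == REMOVING:
        # Marked for deletion: always eligible.
        return True
    if status in (CREATED, FAILED):
        if job_id is None:
            # No associated job: eligible only once older than the threshold.
            age = now - created_at
            return age > timedelta(hours=STALE_CREATED_THRESHOLD_HOURS)
        # Associated job exists: eligible only if it has finished.
        return job_finished
    # Any other status (including legacy values) is not selected.
    return False
```

Note that in this sketch a version with a legacy status value would fall through to the final `return False`, so it would never be picked up for cleanup.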
Comment on lines 261 to 265

    class DatasetStatus:
        CREATED = 1
        PENDING = 2
        FAILED = 3
        COMPLETE = 4
        STALE = 6
        REMOVING = 7
Comment on lines 369 to 373

    return self.status in [
        DatasetStatus.FAILED,
        DatasetStatus.COMPLETE,
        DatasetStatus.STALE,
        DatasetStatus.REMOVING,
    ]
Comment on lines 1836 to 1840

    dv.c.status.in_(
        [
            DatasetStatus.CREATED,
            DatasetStatus.FAILED,
            DatasetStatus.STALE,
            DatasetStatus.REMOVING,
Comment on lines 1153 to 1155

    Removes dataset versions that:
    - Have status CREATED, FAILED, STALE, or REMOVING
    - Have status CREATED, FAILED, or REMOVING
    - Belong to completed/failed/canceled jobs (not running)
No description provided.