Studio: OS level sandbox for python/bash tool execution (macOS + Linux)#83
Studio: OS level sandbox for python/bash tool execution (macOS + Linux)#83danielhanchen wants to merge 7 commits into
Conversation
for more information, see https://pre-commit.ci
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request implements an OS-level sandbox for tool execution on macOS and Linux using Seatbelt and bubblewrap, integrating it into the Python and Bash execution paths. It also adds RTL language support to chat composers via the dir="auto" attribute, cleans up legacy CI environment variables, and introduces comprehensive tests for sandbox enforcement and multilingual input. Review feedback identified a potential AttributeError in sys.modules iteration and suggested a more robust approach for managing bound destinations in the Linux sandbox configuration.
| for name, mod in list(sys.modules.items()): | ||
| if not (name.startswith("__editable___") and name.endswith("_finder")): | ||
| continue |
There was a problem hiding this comment.
The sys.modules dictionary can contain None values (representing negative caching). Accessing attributes on a NoneType object will raise an AttributeError. It is safer to check if mod is not None before proceeding.
| for name, mod in list(sys.modules.items()): | |
| if not (name.startswith("__editable___") and name.endswith("_finder")): | |
| continue | |
| for name, mod in list(sys.modules.items()): | |
| if mod is None or not (name.startswith("__editable___") and name.endswith("_finder")): | |
| continue |
| bound_dests = [ | ||
| args[i + 2] | ||
| for i, arg in enumerate(args) | ||
| if arg in bind_flags and i + 2 < len(args) | ||
| ] |
There was a problem hiding this comment.
This logic for extracting bound destinations assumes that every flag in bind_flags is followed by exactly two arguments (source and destination). While this is true for bwrap's bind flags, it makes the code fragile if the structure of args changes or if other flags are added. Consider a more robust way to track bound destinations as they are added to the args list.
| ok = False | ||
| label = "no sandbox primitive for this platform" | ||
|
|
||
| _sandbox_available_cache = ok |
| ) | ||
| try: | ||
| page.wait_for_load_state("networkidle", timeout = 30_000) | ||
| except Exception: |
| step("wait for composer to mount") | ||
| try: | ||
| page.wait_for_load_state("networkidle", timeout = 30_000) | ||
| except Exception: |
| ) | ||
| try: | ||
| shoot(f"02-composer-wait-attempt-{_mount_attempt + 1}-fail") | ||
| except Exception: |
| composer.element_handle(), | ||
| timeout = 5_000, | ||
| ) | ||
| except Exception: |
| if os.path.islink(prefix): | ||
| seen_links.add(prefix) | ||
| out.append(prefix) | ||
| except OSError: |
| """ | ||
| secret = f"SECRET-{uuid.uuid4().hex}" | ||
| path = os.path.expanduser(f"~/.studio_sandbox_test_{uuid.uuid4().hex}.txt") | ||
| Path(path).write_text(secret) |
Staging mirror of unslothai#5468
Original PR: unslothai#5468
Author: NilayYadav
This is a staging copy for review and editing. Once finalized, changes will be pushed back to the original PR.
Original description
Summary
python_exec/bash_exectool subprocesses in an OS level sandbox: Seatbelt (sandbox-exec) on macOS, bubblewrap (bwrap) on LinuxWhy
The python and bash tools previously ran as plain subprocesses, a tool call could read anywhere the Studio user could read, write outside the workdir, and reach the network. Seatbelt ships with macOS and bubblewrap is a one-package install on every major Linux distro.
Testing
pytest -q studio/backend/tests/test_sandbox.py~/Desktopdenied,urllib.requestraises, write inside workdir succeedsapt install bubblewrap) — python tool:ls /shows only the bound dirs (bin sbin usr lib lib64 etc proc dev tmp home), DNS failsThis PR tracks the moving review branch (pr-5468-head). Iteration fix commits land here directly. Review-added tests are in a separate PR.
Changed files:
.github/workflows/consolidated-tests-ci.yml.github/workflows/lint-ci.yml.github/workflows/mlx-ci.yml.github/workflows/notebooks-ci.yml.github/workflows/release-desktop.yml.github/workflows/security-audit.yml.github/workflows/stale.yml.github/workflows/studio-api-smoke.yml.github/workflows/studio-backend-ci.yml.github/workflows/studio-frontend-ci.yml.github/workflows/studio-inference-smoke.yml.github/workflows/studio-mac-api-smoke.yml.github/workflows/studio-mac-inference-smoke.yml.github/workflows/studio-mac-ui-smoke.yml.github/workflows/studio-mac-update-smoke.yml.github/workflows/studio-tauri-smoke.yml.github/workflows/studio-ui-smoke.yml.github/workflows/studio-update-smoke.yml.github/workflows/studio-windows-api-smoke.yml.github/workflows/studio-windows-inference-smoke.yml.github/workflows/studio-windows-ui-smoke.yml.github/workflows/studio-windows-update-smoke.yml.github/workflows/version-compat-ci.yml.github/workflows/wheel-smoke.ymlstudio/backend/core/inference/sandbox.pystudio/backend/core/inference/tools.pystudio/backend/run.pystudio/backend/startup_banner.pystudio/backend/tests/test_sandbox.py