8 changes: 3 additions & 5 deletions examples/strands_sglang/README.md
@@ -13,7 +13,7 @@ This example connects `slime` with [`strands-sglang`](https://github.com/horizon
`strands-sglang` bridges the gap by extending `strands` with SGLang's native `/generate` endpoint:

- Captures exact token IDs during generation (no retokenization drift)
- - Automatically tracks `loss_mask` via `token_manager`
+ - Automatically tracks `loss_mask` via the `Rollout` tracker (`model.rollout`)
- Provides `ToolLimiter` for clean trajectory truncation
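
To make the first two bullets concrete, here is a minimal, self-contained sketch of how a per-token `loss_mask` can stay aligned with the exact token IDs of a multi-turn trajectory. `Segment` and `flatten_trajectory` are hypothetical names used only for illustration; they are not part of `strands-sglang`, and the token IDs are made up.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One contiguous run of tokens in a trajectory (hypothetical helper)."""

    token_ids: list[int]
    from_model: bool  # True if the policy model generated these tokens


def flatten_trajectory(segments: list[Segment]) -> tuple[list[int], list[int]]:
    """Concatenate segments into token IDs and an aligned 0/1 loss mask."""
    token_ids: list[int] = []
    loss_mask: list[int] = []
    for seg in segments:
        token_ids.extend(seg.token_ids)
        # Model-generated tokens get 1, everything else (prompt, tool output) gets 0.
        loss_mask.extend([1 if seg.from_model else 0] * len(seg.token_ids))
    return token_ids, loss_mask


segments = [
    Segment([101, 102, 103], from_model=False),  # prompt (system + user)
    Segment([7, 8], from_model=True),            # model turn that requests a tool
    Segment([55, 56, 57], from_model=False),     # tool result fed back as context
    Segment([9], from_model=True),               # final model answer
]
token_ids, loss_mask = flatten_trajectory(segments)

# Drop the prompt part of the mask so it covers only the response tokens.
prompt_len = len(segments[0].token_ids)
response_mask = loss_mask[prompt_len:]
```

In this toy trajectory the tool-result tokens remain in the context but are masked out, so only the tokens the model actually produced would contribute to the training loss.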

## Install Dependencies
@@ -22,11 +22,9 @@ This example connects `slime` with [`strands-sglang`](https://github.com/horizon
2. Go to slime folder: `cd /root/slime`
3. Install slime: `pip install -e . --no-deps`
4. Go to the example folder: `cd /root/slime/examples/strands_sglang`
- 5. Install other dependencies: `pip install -r requirements.txt`
+ 5. Install `strands-sglang`: `pip install strands-sglang==0.4.2`

- > NOTE: `strands-sglang` is under rapid development, so we recommend using the GitHub repo version: `strands-sglang @ git+https://github.com/horizon-rl/strands-sglang.git`
-
- > NOTE: We use camel-ai's subprocess code interpreter for python code execution, which is NOT a good practice; it's just for convenience of this example.
+ > NOTE: The `execute_python_code` tool runs code via `subprocess_interpreter.py`, a self-contained interpreter vendored from camel-ai so this example does not depend on the full `camel-ai` package. It runs model-generated code in a local subprocess with **no isolation**, which is NOT a good practice; it is here only for the convenience of this example. Use a sandboxed interpreter (Docker, e2b, microsandbox, ...) for anything beyond local experimentation.
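
As a rough illustration of what "a local subprocess with no isolation" means, the sketch below runs a code string in a child Python process using only the standard library. It is not the implementation of the vendored `subprocess_interpreter.py`; the function name `run_python_unsandboxed` and the timeout default are assumptions made for this example.

```python
import subprocess
import sys


def run_python_unsandboxed(code: str, timeout_s: float = 10.0) -> str:
    """Run `code` in a child Python process and return stdout plus stderr.

    The child runs as the same user with the same filesystem and network
    access as the parent process, so nothing here is sandboxed.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return f"TimeoutError: execution exceeded {timeout_s} seconds"
    return proc.stdout + proc.stderr


result = run_python_unsandboxed("print(1 + 1)")
error_result = run_python_unsandboxed("raise ValueError('boom')")
```

The timeout only bounds wall-clock time. It does nothing to stop the generated code from reading files or opening network connections, which is why the note recommends a real sandbox outside local experiments.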

## Prepare Model

Empty file.
19 changes: 9 additions & 10 deletions examples/strands_sglang/generate_with_strands.py
@@ -1,7 +1,5 @@
- # Updated with strands-sglang 0.3.2
import logging

- from camel.interpreters import SubprocessInterpreter
from strands import Agent, tool
from strands_sglang import SGLangModel, ToolLimiter, get_client_from_slime_args
from strands_sglang.tool_parsers import HermesToolParser
@@ -10,6 +8,8 @@
from slime.rollout.sglang_rollout import GenerateState
from slime.utils.types import Sample

+ from .subprocess_interpreter import SubprocessInterpreter
+
logger = logging.getLogger(__name__)

SYSTEM_PROMPT = """
@@ -22,7 +22,6 @@
""".strip()

MAX_TOOL_ITERS = 5
- MAX_TOOL_CALLS = None # No limit


@tool
@@ -51,7 +50,7 @@ async def generate(args, sample: Sample, sampling_params) -> Sample:
sampling_params=sampling_params,
)

- tool_limiter = ToolLimiter(max_tool_iters=MAX_TOOL_ITERS, max_tool_calls=MAX_TOOL_CALLS)
+ tool_limiter = ToolLimiter(max_tool_iters=MAX_TOOL_ITERS)
agent = Agent(
model=model,
tools=[execute_python_code],
@@ -71,12 +70,12 @@ async def generate(args, sample: Sample, sampling_params) -> Sample:
sample.status = Sample.Status.TRUNCATED
logger.warning(f"TRUNCATED: {type(e).__name__}: {e}")

- # Extract token trajectory from token_manager
- tm = model.token_manager
- prompt_len = len(tm.segments[0]) # system + user are first segment
- sample.tokens = tm.token_ids
- sample.loss_mask = tm.loss_mask[prompt_len:]
- sample.rollout_log_probs = tm.logprobs[prompt_len:]
+ # Extract token trajectory from the rollout tracker
+ rollout = model.rollout
+ prompt_len = rollout.initial_prompt_length # system + user are the first segment
+ sample.tokens = rollout.token_ids
+ sample.loss_mask = rollout.loss_mask[prompt_len:]
+ sample.rollout_log_probs = rollout.logprobs[prompt_len:]
sample.response_length = len(sample.tokens) - prompt_len
sample.response = model.tokenizer.decode(sample.tokens[prompt_len:], skip_special_tokens=False)
# Tool iteration and tool call count are different because multiple parallel tool calls count as 1 iteration
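
The distinction in the comment above can be checked with a small standalone count. The trace below is invented for illustration and does not use the `ToolLimiter` API: each inner list stands for the tool calls requested in one assistant turn, which is what the comment treats as a single iteration.

```python
# Hypothetical trace: one inner list per assistant turn (one "iteration"),
# holding the names of the tool calls requested in that turn.
tool_turns = [
    ["execute_python_code"],
    ["execute_python_code", "execute_python_code", "execute_python_code"],
    ["execute_python_code"],
]

# Parallel calls issued in the same turn count as a single iteration.
tool_iterations = len(tool_turns)

# Every individual call is counted, regardless of which turn it belongs to.
tool_calls = sum(len(turn) for turn in tool_turns)
```

Here the trace has 3 iterations but 5 calls, so a cap expressed in iterations (such as `MAX_TOOL_ITERS = 5`) would not be reached even though five tools were executed.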
2 changes: 0 additions & 2 deletions examples/strands_sglang/requirements.txt

This file was deleted.