Strategies
Deterministic, step-by-step constraints that gate the model’s next token. Strategies are compiled once (using the active tokenizer) and then consulted each decoding step to compute an allowed/disallowed token set. The engine applies a large negative mask to disallowed logits, making only valid continuations feasible.
At runtime, every strategy implements:
- start(tokenizer) -> state
- allowed_tokens(state, tokenizer) -> set[int]
- disallowed_tokens(state, tokenizer) -> set[int]
- step(state, token_id, tokenizer) -> None
- is_complete(state) -> bool
The Self Prompt helper uses these signals to create an AdjustedLogits action per step.
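The five hooks can be pictured as a small protocol driven by a decoding loop. The following is a minimal sketch, not the SDK's implementation: the `Strategy` protocol, the toy `FixedSequence` strategy, and the `drive` loop are hypothetical names introduced only for illustration, and the loop picks the smallest allowed id instead of sampling from masked logits.

```python
from __future__ import annotations

from typing import Any, Protocol


class Strategy(Protocol):
    """Hypothetical typing of the five runtime hooks (not the SDK's class)."""

    def start(self, tokenizer: Any) -> Any: ...
    def allowed_tokens(self, state: Any, tokenizer: Any) -> set[int]: ...
    def disallowed_tokens(self, state: Any, tokenizer: Any) -> set[int]: ...
    def step(self, state: Any, token_id: int, tokenizer: Any) -> None: ...
    def is_complete(self, state: Any) -> bool: ...


class FixedSequence:
    """Toy strategy that forces one exact sequence of token ids."""

    def __init__(self, token_ids: list[int]) -> None:
        self.token_ids = token_ids

    def start(self, tokenizer: Any) -> dict:
        return {"pos": 0}

    def allowed_tokens(self, state: dict, tokenizer: Any) -> set[int]:
        if self.is_complete(state):
            return set()
        return {self.token_ids[state["pos"]]}

    def disallowed_tokens(self, state: dict, tokenizer: Any) -> set[int]:
        return set()

    def step(self, state: dict, token_id: int, tokenizer: Any) -> None:
        state["pos"] += 1

    def is_complete(self, state: dict) -> bool:
        return state["pos"] >= len(self.token_ids)


def drive(strategy: Strategy, tokenizer: Any = None) -> list[int]:
    """Consult the strategy once per step until it reports completion."""
    state = strategy.start(tokenizer)
    emitted: list[int] = []
    while not strategy.is_complete(state):
        # Stand-in for sampling: take the smallest id in the allowed set.
        token_id = min(strategy.allowed_tokens(state, tokenizer))
        strategy.step(state, token_id, tokenizer)
        emitted.append(token_id)
    return emitted
```

In this sketch, `drive(FixedSequence([7, 3, 9]))` walks the strategy through three steps and stops as soon as `is_complete` returns true.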
Strategy Types
ChoicesStrat
Force the generation to choose exactly one string from a finite set. Multi-token choices are supported via a trie.
- Builds a token trie of all choices (tokenizer.encode for each string)
- allowed_tokens returns the union of outgoing edges from the active frontier
- When the frontier collapses to terminal nodes, the choice is complete
- Example: ["hello", "hello world"] → first token must be “hello”; next may be end or “ world”
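The trie behaviour in the example above can be sketched in a few lines. This is an illustrative simplification under stated assumptions: `TOY_ENCODINGS` is a made-up stand-in for `tokenizer.encode`, the token ids are arbitrary, and a single trie node plays the role of the frontier.

```python
# Hypothetical encodings; a real tokenizer would produce these ids.
TOY_ENCODINGS = {"hello": [1], "hello world": [1, 2]}


def build_trie(choices, encode):
    """Insert the token ids of every choice into a nested-dict trie."""
    root = {"children": {}, "terminal": False}
    for choice in choices:
        node = root
        for token_id in encode(choice):
            node = node["children"].setdefault(
                token_id, {"children": {}, "terminal": False}
            )
        node["terminal"] = True  # a full choice ends at this node
    return root


def allowed_tokens(node):
    """Outgoing edges of the current node are the only valid next tokens."""
    return set(node["children"])


root = build_trie(["hello", "hello world"], TOY_ENCODINGS.__getitem__)
after_hello = root["children"][1]
```

Here the root allows only the id for "hello". The node reached after it is terminal (the choice "hello" may end) and still has one outgoing edge, for " world".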
from quote_mod_sdk.strategies.strategy_constructor import ChoicesStrat
ChoicesStrat(["yes", "no", "unsure"])  # single-token or multi-token strings supported
UntilStrat
Constrain until a terminator appears.
- end_type=UntilEndType.TAG: completes when the full end string appears in the decoded stream
- end_type=UntilEndType.ANYCHAR: completes when any character from end is observed
- start (optional): forces a prefix sequence before free generation (useful to anchor tags)
- EOS is disallowed by default while running this strategy
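The two completion rules can be sketched in plain Python. This is a hedged illustration, not the SDK's code: `EndType` is a hypothetical stand-in for `UntilEndType`, and `until_complete` assumes it receives the text decoded so far, after any forced start prefix.

```python
from enum import Enum


class EndType(Enum):  # hypothetical stand-in for UntilEndType
    TAG = "tag"
    ANYCHAR = "anychar"


def until_complete(decoded: str, end_type: EndType, end: str) -> bool:
    """Return True once the terminator condition holds for the decoded text."""
    if end_type is EndType.TAG:
        # TAG: the whole end string must appear.
        return end in decoded
    # ANYCHAR: a single character from end is enough.
    return any(ch in end for ch in decoded)
```

A partially emitted closing tag such as `</ans` does not complete TAG mode in this sketch, whereas a single `.` completes ANYCHAR mode when `end` is `".,!?\n"`.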
from quote_mod_sdk.strategies.strategy_constructor import UntilStrat
from quote_mod_sdk.strategies.primitives import UntilEndType
# XML-like segment: emit content inside tags, then complete when closing tag appears
UntilStrat("<answer>", UntilEndType.TAG, "</answer>")
# Free-run until punctuation
UntilStrat("", UntilEndType.ANYCHAR, ".,!?\n")
CharsStrat
Constrain to a character class with length control.
- mode: CharsMode.ALPHA | ALPHANUMERIC | NUMERIC | STRING
- min: minimum number of characters before completion is allowed
- stop: either a maximum count (int) or a specific stop token (str) to stop at
- Uses decoded token text to filter allowed token IDs per step
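Filtering by decoded text can be illustrated with a toy vocabulary. The sketch below is an assumption-laden example for the NUMERIC case with an integer stop: `TOY_VOCAB` and `numeric_allowed` are hypothetical, and it assumes the limit is counted in characters, so a multi-character token is only allowed if it still fits.

```python
# Hypothetical id -> decoded text table; a real tokenizer supplies this.
TOY_VOCAB = {0: "12", 1: "7", 2: "ab", 3: "a1", 4: ".", 5: "345"}


def numeric_allowed(vocab, emitted_chars, max_chars):
    """Token ids whose decoded text is all ASCII digits and fits the budget."""
    remaining = max_chars - emitted_chars
    return {
        token_id
        for token_id, text in vocab.items()
        if text.isascii() and text.isdigit() and len(text) <= remaining
    }
```

With a budget of four characters, all three digit tokens are allowed at the start. Once three characters have been emitted, only the one-character token "7" remains valid.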
from quote_mod_sdk.strategies.strategy_constructor import CharsStrat
from quote_mod_sdk.strategies.primitives import CharsMode
# Exactly 4 digits
CharsStrat(CharsMode.NUMERIC, stop=4, min=4)
# At least 2 alphanumerics, stop when we see a dot
CharsStrat(CharsMode.ALPHANUMERIC, stop=".", min=2)
ListStrat
Constrain to a list of elements with optional wrappers, separators, and an end suffix.
- Structural tokens: open, close, wrap (per element), sep, end_with
- Cardinality: min and max number of elements (default: 0..∞)
- Elements: either a single element strategy or a fixed list of element strategies
  - If elements is a list, min and max are set to the list length and the element strategies are evaluated sequentially
- Phases: in_open → await_element → (in_wrap_open) → in_element → (in_wrap_close) → await_sep → in_separator → (in_close) → (in_end_with)
- Completion: when close (and optional end_with) is fully consumed and min elements are satisfied
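One way to picture the strings this structure accepts is as a regular expression. This is a different technique from the phase machine described above and is offered only as a hedged illustration: `list_pattern` and its parameter names (`open_`, `min_items`, `max_items`) are hypothetical, the element is given as a regex fragment, and `max_items` is assumed to be at least 1 when set.

```python
import re
from typing import Optional


def list_pattern(element: str, open_: str = "", close: str = "", sep: str = "",
                 wrap: str = "", end_with: str = "",
                 min_items: int = 0, max_items: Optional[int] = None) -> str:
    """Build a regex for: open, wrapped elements joined by sep, close, end_with."""
    item = f"{re.escape(wrap)}(?:{element}){re.escape(wrap)}"
    # After the first item, allow (min-1)..(max-1) further "sep + item" groups.
    lower = max(min_items - 1, 0)
    upper = "" if max_items is None else str(max_items - 1)
    body = f"{item}(?:{re.escape(sep)}{item}){{{lower},{upper}}}"
    if min_items == 0:
        body = f"(?:{body})?"  # an empty list is also acceptable
    return f"{re.escape(open_)}{body}{re.escape(close)}{re.escape(end_with)}"


pattern = list_pattern("red|green|blue", open_="[", close="]", sep=", ",
                       wrap='"', end_with="\n", min_items=1, max_items=3)
```

Under these assumptions, `re.fullmatch(pattern, '["red", "blue"]\n')` matches, while an empty list, a four-element list, or a string missing the trailing newline does not.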
from quote_mod_sdk.strategies.strategy_constructor import ListStrat, ChoicesStrat
# CSV-like list of choices
ListStrat(
    elements=ChoicesStrat(["red", "green", "blue"]),
    open="[", close="]", sep=", ", wrap='"', end_with="\n",
    min=1, max=3,
)
# Fixed 3-field tuple: <alpha>-<digits>-<alpha>
from quote_mod_sdk.strategies.strategy_constructor import CharsStrat
from quote_mod_sdk.strategies.primitives import CharsMode
ListStrat([
    CharsStrat(CharsMode.ALPHA, stop=1, min=1),
    CharsStrat(CharsMode.NUMERIC, stop=3, min=3),
    CharsStrat(CharsMode.ALPHA, stop=2, min=2),
], sep="-")
Usage
from quote_mod_sdk.strategies.strategy_constructor import (
ChoicesStrat, UntilStrat, CharsStrat, ListStrat
)
from quote_mod_sdk.strategies.primitives import UntilEndType, CharsMode
country = ChoicesStrat(["US", "CA", "GB"]) # classification
digits = CharsStrat(CharsMode.NUMERIC, min=2, stop=5) # 2..5 digits
csv = ListStrat(elements=ChoicesStrat(["A", "B", "C"]), sep=", ")  # CSV-of-choices
These strategies are typically used via SelfPrompt/self_prompt_mod, which compiles the strategy with the runtime tokenizer and applies constraints during decoding.
How Masking Works
At each ForwardPass step, SelfPrompt gathers:
- allowed = strategy.allowed_tokens(state, tokenizer)
- disallowed = strategy.disallowed_tokens(state, tokenizer)
Then it adjusts the logits with a mask value (default -1e9), yielding an AdjustedLogits action. If only disallowed is provided for a step, the helper masks just those tokens; otherwise it masks everything except the allowed set. Set argmax_sampling=True on SelfPrompt to force deterministic sampling (token_temp=0).
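The masking rule can be sketched over a plain list of logits. This is a minimal illustration rather than the helper's actual code: `adjust_logits` is a hypothetical name, and it assumes that a non-empty allowed set takes priority when both sets are supplied.

```python
def adjust_logits(logits, allowed, disallowed, mask_value=-1e9):
    """Return a masked copy of logits following the allowed/disallowed rule."""
    if allowed:
        # Mask everything except the allowed token ids.
        return [x if i in allowed else mask_value for i, x in enumerate(logits)]
    if disallowed:
        # Only a disallowed set was provided: mask just those ids.
        return [mask_value if i in disallowed else x for i, x in enumerate(logits)]
    # Both sets empty: leave the logits unchanged for this step.
    return list(logits)
```

After masking, a softmax over the result assigns effectively zero probability to masked ids, so sampling or argmax can only land on a permitted token.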
Completion Semantics
- ChoicesStrat: complete when exactly one choice is matched and no further edges remain
- UntilStrat: complete when end is observed (TAG) or when any stop char is observed (ANYCHAR)
- CharsStrat: complete when stop is reached (int length or stop token), respecting min
- ListStrat: complete after close (and optional end_with) and min elements satisfied; fixed-list mode completes after the last element
Dynamic Updates
When using SelfPrompt with ChoicesStrat (or ListStrat containing ChoicesStrat elements), you can update options at runtime:
sp.refresh_responses(["Yes", "No"], request_id)
# or update a particular element in a ListStrat by index
sp.refresh_responses(["A", "B", "C"], request_id=request_id, idx=0)
Tips & Pitfalls
- Tokenization matters: a “character” could be a multi-byte token; CharsStrat filters by decoded text
- For UntilStrat TAG mode, provide the full closing tag; for ANYCHAR, include every terminating char
- Use wrap in ListStrat to constrain quotes or brackets around each element
- If allowed/disallowed sets are empty, the helper leaves logits unchanged for that step
- Prefer strategies over post-hoc backtracking when the schema is known; they’re more efficient and predictable
The engine uses these to mask logits each step via AdjustedLogits.