Model-Based Actionline Input Assumptions

This note documents the current developer-facing assumptions for POST /api/fuzzy-actionline-model-based.

It is not a public API contract. It reflects the behavior implemented in ModelActionlineService and the controller as of the current tree.

Endpoint

Route: POST /api/fuzzy-actionline-model-based
Body must be valid JSON.
Invalid JSON returns HTTP 400.
Session-construction failures and invalid actionline requests also return HTTP 400 with success=false.

Supported Request Shapes

The endpoint accepts two shapes.

1. Strategy-act-like request

This is the preferred shape.

{
  "actionline": "F-F-F-F-C-C-/-R-C-/-R-C",
  "protocol_ver": "v2",
  "disable_bet_randomness": false,
  "hand": {
    "players": [
      {"seat_no": 0, "stack": 5325, "hole_cards": "AsAc4h5h"},
      {"seat_no": 1, "stack": 5000}
    ],
    "big_blind": 100,
    "ante": 20,
    "dealer_seat": 2,
    "sb_seat": -1,
    "bb_seat": -1,
    "straddle_seat": -1,
    "game_type": "plo4"
  }
}

Assumptions:

A request is treated as strategy-act-like when top-level hand exists and is an object.
protocol_ver is normalized to "v2" if missing or empty.
hand.actions.entries is normalized to an empty array if missing.
The request is passed through the existing buildGameState(...) path, so compatible /lookup fields such as mtt and squid may also be present.
hand.big_blind is used for response sizes_bb scaling. If absent or invalid, scaling falls back to 100.
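
The normalization rules above can be sketched as follows. This is a minimal illustration, not the code in ModelActionlineService: the function names are hypothetical, and "invalid" big blind is assumed here to mean non-numeric or non-positive.

```python
from copy import deepcopy

DEFAULT_BIG_BLIND = 100  # fallback used for sizes_bb scaling, per the note


def is_strategy_act_like(body: dict) -> bool:
    # Strategy-act-like means: top-level "hand" exists and is an object.
    return isinstance(body.get("hand"), dict)


def normalize_strategy_act_request(body: dict) -> dict:
    # Work on a copy so the caller's request is left untouched.
    req = deepcopy(body)
    if not req.get("protocol_ver"):  # missing or empty
        req["protocol_ver"] = "v2"
    hand = req["hand"]
    actions = hand.get("actions")
    if not isinstance(actions, dict):  # assumption: treat a non-object as missing
        actions = {}
        hand["actions"] = actions
    if "entries" not in actions:
        actions["entries"] = []
    return req


def effective_big_blind(hand: dict):
    bb = hand.get("big_blind")
    # Assumption: "invalid" means not a number, or not strictly positive.
    if isinstance(bb, bool) or not isinstance(bb, (int, float)) or bb <= 0:
        return DEFAULT_BIG_BLIND
    return bb
```

For example, a request that omits protocol_ver and hand.actions comes back with protocol_ver set to "v2" and hand.actions.entries set to an empty list.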

2. Legacy flat request

This is compatibility mode.

{
  "actionline": "R-C-F-F-/-R-C-/-R",
  "num_players": 4,
  "nblinds": 2,
  "ante": 10,
  "stack": 120,
  "game_type": "nlhe",
  "disable_bet_randomness": false
}

Assumptions:

num_players defaults to 2.
nblinds defaults to 2.
ante defaults to 0.
stack defaults to 100.
game_type defaults to "nlhe".
In this path, ante and stack are interpreted in big-blind units unless a setup override is provided.
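
As a rough sketch of the defaults listed above (the helper name is hypothetical, and the service may treat null or wrongly typed values differently):

```python
LEGACY_DEFAULTS = {
    "num_players": 2,
    "nblinds": 2,
    "ante": 0,       # big-blind units in the flat path
    "stack": 100,    # big-blind units in the flat path
    "game_type": "nlhe",
}


def apply_legacy_defaults(body: dict) -> dict:
    # Keys present in the request win; missing keys take the documented default.
    # Assumption: a key that is present but null is NOT replaced here.
    return {**LEGACY_DEFAULTS, **body}
```

With this sketch, a body containing only actionline and stack=120 resolves to 2 players, 2 blinds, ante 0, stack 120, and game_type "nlhe".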

Actionline Parsing Assumptions

Parsing is character-based.
Dashes - are ignored.
R, r, B, b are normalized to raise tokens.
C, c, X, x are normalized to call/check tokens.
F, f are normalized to fold tokens.
/ is treated as a street separator token.
Any other character is ignored.

Implications:

"R-R-F-F-C-/-C-R" and "rrffc/cr" normalize to the same token stream.
The parser does not validate that the token stream is poker-legal before stepping the engine.
/ does not force a street transition on its own. It is preserved in the output token stream, but the underlying game engine alone controls actual street advancement.
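
The character-level rules above amount to a small lookup table. The sketch below is illustrative only; the token names (R, C, F, /) and the function name are assumptions, not the identifiers used by the service.

```python
RAISE, CALL, FOLD, STREET = "R", "C", "F", "/"

# One entry per recognized character; anything else (including "-") is dropped.
_TOKEN_MAP = {
    **dict.fromkeys("RrBb", RAISE),
    **dict.fromkeys("CcXx", CALL),
    **dict.fromkeys("Ff", FOLD),
    "/": STREET,
}


def tokenize_actionline(actionline: str) -> list[str]:
    # Character-based parse: no legality checks, no street logic.
    return [_TOKEN_MAP[ch] for ch in actionline if ch in _TOKEN_MAP]
```

Both "R-R-F-F-C-/-C-R" and "rrffc/cr" produce the same eight tokens under this sketch. Legality is left entirely to the engine, as noted above.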

Session Construction Assumptions

Strategy-act-like path

hand.players is expected to be present and compatible with the game-state builder.
If no player contains explicit hole cards, a default hero hand is injected into players[0].
Default injected hero cards are:
- nlhe: AsKh
- plo4: AsAc4h5h
- plo5: AsAc4h5h9s
- plo6: AsAc4h5h9s2s
The injected hero hand exists only so universal-poker can build a concrete session for a hand-agnostic sizing query.

Legacy flat path

The service builds a strategy-act-style hand object internally and then uses the same game-state machinery as the strategy-act-like path.
dealer_seat is synthesized as the last seat.
For nblinds >= 2, seats are synthesized as:
- sb_seat = 0
- bb_seat = 1
- straddle_seat = 2 when nblinds >= 3 and at least 3 players exist, otherwise -1
For single-blind mode (nblinds < 2):
- sb_seat = -1
- bb_seat = dealer_seat
- straddle_seat = -1
If no setup.players are supplied, all players receive equal stacks of stack * big_blind.
If no explicit hero hole cards are supplied, players[0] gets the same default hero cards listed above.
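
The seat synthesis and hero-card injection described in this section can be sketched as below. Function names are hypothetical, "last seat" is assumed to mean the 0-based index num_players - 1, and game types outside the four listed defaults are not handled.

```python
DEFAULT_HERO_CARDS = {
    "nlhe": "AsKh",
    "plo4": "AsAc4h5h",
    "plo5": "AsAc4h5h9s",
    "plo6": "AsAc4h5h9s2s",
}


def synthesize_seats(num_players: int, nblinds: int) -> dict:
    dealer_seat = num_players - 1  # assumption: seats are 0-based
    if nblinds >= 2:
        straddle = 2 if nblinds >= 3 and num_players >= 3 else -1
        return {"dealer_seat": dealer_seat, "sb_seat": 0,
                "bb_seat": 1, "straddle_seat": straddle}
    # Single-blind mode: the dealer posts the only blind.
    return {"dealer_seat": dealer_seat, "sb_seat": -1,
            "bb_seat": dealer_seat, "straddle_seat": -1}


def inject_default_hero(players: list, game_type: str) -> list:
    # Only inject when no player carries explicit hole cards.
    # Note: mutates players[0] in place.
    if not any(p.get("hole_cards") for p in players):
        players[0]["hole_cards"] = DEFAULT_HERO_CARDS[game_type]
    return players
```

For a 4-player, 2-blind request this yields dealer_seat=3, sb_seat=0, bb_seat=1, straddle_seat=-1.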

`setup` Override Assumptions

The legacy flat path may include a setup object.

Supported effective overrides:

setup.bigBlind: chip-denominated big blind for session construction and sizes_bb scaling.
setup.smallBlind: chip-denominated small blind.
setup.ante: chip-denominated ante. This overrides the flat ante field.
setup.numBlinds: overrides nblinds.
setup.players: if present and it contains at least 2 entries, its length overrides num_players.
setup.players[i].stack: chip-denominated stack override for seat i.
setup.players[i].hole_cards or setup.players[i].hole: explicit hole cards for seat i.
setup.subGameType: may override top-level game_type routing for plo4, plo5, plo6, and shortdeck.

Important unit assumption:

Flat stack and flat ante are in big-blind units.
setup.bigBlind, setup.smallBlind, setup.ante, and setup.players[i].stack are in chips.
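
To make the unit rule concrete, here is a hedged sketch of how flat big-blind values and chip-denominated setup overrides might be combined. The function name is hypothetical, and the default_big_blind parameter is an assumption: this note does not state which big blind the legacy path uses when setup.bigBlind is absent. A setup.players list with fewer than 2 entries is simply ignored here, which is also an assumption.

```python
def resolve_chip_amounts(body: dict, default_big_blind: int = 100) -> dict:
    setup = body.get("setup") or {}
    big_blind = setup.get("bigBlind", default_big_blind)  # chips

    # Flat ante is in big blinds; setup.ante is already in chips and wins.
    if "ante" in setup:
        ante = setup["ante"]
    else:
        ante = body.get("ante", 0) * big_blind

    # Flat stack is in big blinds; convert once for all seats.
    flat_stack = body.get("stack", 100) * big_blind

    overrides = setup.get("players") or []
    if len(overrides) < 2:
        overrides = []  # assumption: too-short lists are ignored entirely
    num_players = len(overrides) if overrides else body.get("num_players", 2)

    # Per-seat chip stacks: explicit override if given, else the flat stack.
    stacks = []
    for i in range(num_players):
        seat = overrides[i] if i < len(overrides) else {}
        stacks.append(seat.get("stack", flat_stack))

    return {"big_blind": big_blind, "ante": ante, "stacks": stacks}
```

Under these assumptions, stack=120 and ante=10 with a 100-chip big blind resolve to 12000-chip stacks and a 1000-chip ante, while setup.ante=5 stays at 5 chips.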

Randomness Assumptions

disable_bet_randomness=true forces deterministic raise selection.
disable_bet_randomness=false enables two kinds of variability:
- engine-side betsize perturbation through betsize_eps
- sampling across decoded legal raise amounts
The endpoint does not invent custom raise amounts.
The endpoint only picks or samples among engine-decoded legal raise-to chip amounts.
Multiple raw raise buckets that decode to the same legal chip amount are collapsed before selection.

Implications:

disable_bet_randomness=true is the switch that forces deterministic output.
disable_bet_randomness=false does not guarantee visible variation on every state.
A request may still look deterministic when the game state only has one effective decoded legal raise amount after engine constraints.

`request_id` Assumptions

request_id is optional.
String and integer request_id values are accepted.
When randomness is enabled, request_id is hashed into the serving hand_id used by the engine-side betsize perturbation path.
When randomness is disabled, the serving hand_id is forced to 0.
request_id does not currently make the full endpoint output deterministic by itself.
Repeating the same request with the same request_id may still produce different outputs when disable_bet_randomness=false, because the final decoded raise amount is still sampled.

Legal Raise Selection Assumptions

For each R token, the service runs model inference on the current observation.
It collects legal sized raise actions from the observation mask.
Each legal sized raise action is decoded through the engine.
Legal actions that decode to the same chip amount are merged.
Deterministic mode picks the merged decoded amount with the highest total model probability.
Random mode samples among merged decoded amounts by total model probability.
All-in is used only as a fallback when no legal sized raise exists.
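
The merge-then-select steps above can be sketched as follows. This is an illustration under stated assumptions, not the service code: the function names, the decode callable, and the all_in_amount parameter are hypothetical stand-ins for the engine and model interfaces, and ties in deterministic mode are broken by first occurrence.

```python
import random
from collections import defaultdict


def merge_by_decoded_amount(probs, legal_raise_actions, decode):
    # Sum model probability over raw actions that decode to the same chips.
    merged = defaultdict(float)
    for action in legal_raise_actions:
        merged[decode(action)] += probs[action]
    return dict(merged)


def select_raise(merged, all_in_amount, deterministic, rng=None):
    if not merged:
        # Fallback: no legal sized raise exists.
        return all_in_amount
    if deterministic:
        # Highest total probability; ties go to the first amount seen.
        return max(merged, key=merged.get)
    # Random mode: sample amounts in proportion to total probability.
    # Assumes at least one merged amount has positive probability.
    rng = rng or random.Random()
    amounts = list(merged)
    weights = [merged[a] for a in amounts]
    return rng.choices(amounts, weights=weights, k=1)[0]
```

Merging before selection matters: two buckets at 0.25 each that decode to the same amount should beat a single bucket at 0.4, which selecting on raw buckets would get wrong.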

Response Assumptions

The endpoint returns:

actionlines: one raw actionline string with chip-denominated raises
strategy-grid-fmt: one grid-format actionline string
sizes_bb: one entry per R token, scaled by the effective big blind
sizes_chips: one chip-denominated raise-to amount per R token
confidence: model support score when available
success: false on the HTTP 400 failure cases described under Endpoint

The response currently returns decoded chip amounts, not raw model bucket ids.
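
The relationship between sizes_chips and sizes_bb can be sketched as a plain division by the effective big blind. The function name is hypothetical, and any rounding the service applies is not documented here, so none is applied.

```python
def chips_to_bb(sizes_chips, big_blind):
    # One sizes_bb entry per R token, in the same order as sizes_chips.
    return [chips / big_blind for chips in sizes_chips]
```

For instance, raise-to amounts of 250 and 900 chips at a 100-chip big blind correspond to 2.5 and 9.0 big blinds.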

Example Fixtures

Current developer fixtures for this endpoint:

integration_tests/data/nlhe_random_action.json
integration_tests/data/plo4_random_action.json
integration_tests/data/plo5_random_action.json
integration_tests/data/plo6_random_action.json

PLO5 example

{
  "actionline": "F-F-F-F-C-C-/-R-C-/-R-C",
  "protocol_ver": "v2",
  "disable_bet_randomness": false,
  "hand": {
    "players": [
      {
        "seat_no": 0,
        "stack": 5325,
        "hole_cards": "AsAc4h5h9s"
      },
      {
        "seat_no": 1,
        "stack": 5000
      },
      {
        "seat_no": 2,
        "stack": 9985
      },
      {
        "seat_no": 3,
        "stack": 35870
      },
      {
        "seat_no": 4,
        "stack": 10420
      },
      {
        "seat_no": 5,
        "stack": 6570
      },
      {
        "seat_no": 6,
        "stack": 28880
      }
    ],
    "big_blind": 100,
    "ante": 20,
    "dealer_seat": 2,
    "sb_seat": -1,
    "bb_seat": -1,
    "straddle_seat": -1,
    "game_type": "plo5"
  }
}
