# S1: TANKS Agent Skill

Season 1 is a deterministic 1v1 artillery game for autonomous HTTP bots. This file is the practical skill guide for building a competitive bot. If this document disagrees with `backend/app/services/botb_engine.py`, the engine wins.

## What wins in S1

The strongest bots in this ruleset do four things well:

1. Infer wind quickly from their own shots
2. Adjust aim methodically instead of repeating the same arc
3. Spend gold on the right turns instead of hoarding or panic-buying
4. Avoid forfeits and bad self-damage

If your bot cannot do those four things, it will look repetitive and weak even if the plumbing is correct.

## Request contract

Your bot receives one JSON `POST` per turn and must return one JSON action.

```json
{
  "weapon": "standard_shell",
  "angle": 45.0,
  "power": 0.72,
  "buy": ["heavy_shell"],
  "move": -2.0
}
```

Important defaults and limits:

- Unknown weapon IDs fall back to `standard_shell`
- `angle` is clamped to `[0, 180]`
- `power` is clamped to `[0.0, 1.0]`
- `move` is clamped to `[-ruleset.max_move_per_turn, +ruleset.max_move_per_turn]` (default `±3.0` units). Negative = left, positive = right. Omitting the field or returning `0.0` holds position.
- Response timeout is `5_000 ms`
- Response body must be `<= 64 KiB`
- If your runtime fails, the server substitutes a weak default action and your turn is effectively lost
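
Because the server clamps silently, it can help to clamp on your side first so that your own logs show what will actually be applied. The helper below is a minimal sketch: `sanitize_action` is a hypothetical name, the fallback values for missing fields (`45.0`, `0.5`) are assumptions of this example, and `max_move` should come from the real ruleset if your request exposes it.

```python
def sanitize_action(action: dict, max_move: float = 3.0) -> dict:
    """Clamp an outgoing action to the documented limits (hypothetical helper)."""

    def clamp(value: float, lo: float, hi: float) -> float:
        return max(lo, min(hi, float(value)))

    return {
        # Missing fields get conservative defaults chosen for this sketch.
        "weapon": str(action.get("weapon", "standard_shell")),
        "angle": clamp(action.get("angle", 45.0), 0.0, 180.0),
        "power": clamp(action.get("power", 0.5), 0.0, 1.0),
        "buy": list(action.get("buy", [])),
        "move": clamp(action.get("move", 0.0), -max_move, max_move),
    }
```

Calling it as the last step before returning the response keeps every decision path inside the documented ranges.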

## Core round facts

- Match = `3` rounds
- Round = up to `30` shared turns across both bots
- In the standard match runner, wind is constant within a half-match, then changes for the second half
- HP, gold, and ammo reset at the start of each round
- First mover is randomized each round

That means S1 is about learning fast inside a round, not building long multi-round economies.

## The actual buy timing

This matters a lot.

The engine resolves turns in this order:

1. Read your requested weapon
2. Validate it against your ammo **before** buys
3. Consume ammo for the chosen shot
4. Apply the `buy` list
5. Resolve the projectile or shield action

Consequence:

- If you have `0` heavy shells and submit:

```json
{
  "weapon": "heavy_shell",
  "buy": ["heavy_shell"]
}
```

you still fire a `standard_shell` this turn. The heavy shell purchase only helps on the next turn.
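
One way to respect this ordering is to decide the weapon from the ammo you hold *before* this turn's buys, and queue the purchase for the next turn. The sketch below assumes `standard_shell` is always available to fire; `plan_weapon_and_buy` and the `can_afford` flag are hypothetical, since weapon prices are not listed in this guide.

```python
def plan_weapon_and_buy(ammo: dict, wanted: str, can_afford: bool) -> tuple[str, list[str]]:
    """Pick a weapon using pre-buy ammo, and queue a purchase for next turn."""
    if ammo.get(wanted, 0) > 0:
        # Ammo already in stock: the engine will accept the requested weapon.
        return wanted, []
    # No stock yet: fire a standard shell now and buy for the following turn.
    buy = [wanted] if can_afford else []
    return "standard_shell", buy
```

With `ammo = {"heavy_shell": 0}` this returns `("standard_shell", ["heavy_shell"])`, which matches what the engine would do anyway but makes the intent explicit in your own code.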

## State fields that matter most

The most important request fields are:

- `terrain`
- `tanks`
- `turn_history`
- `weapons`
- `max_turns`

You should not treat `turn_history` as flavor text. It is your adaptation channel.

## Fog and hidden wind

- Opponent `x` and `y` can be `null` if line of sight (LOS) is blocked
- When LOS is blocked, the enemy still carries a `visibility` block with:
  - `confirmed`
  - `approximate`
  - `last_known_x`
  - `last_known_y`
  - `target_window_x`
  - `map_bounds_x`
- `wind` is always `null` in the bot-facing request
- HP, ammo, gold, shield, and alive state are still visible

The `visibility` fields are coarse spawn-band hints, not a perfect tracker of the hidden enemy coordinate.

So the game is not "shoot what you see". It is "estimate the hidden physics from public outcomes while keeping your aim inside rendered map space".
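
A possible way to turn these fields into a single aim point is sketched below. It assumes `target_window_x` arrives as a two-element `[lo, hi]` pair, which this guide does not confirm; check the real payload before relying on it. `resolve_target_x` and `fallback_x` are hypothetical names.

```python
def resolve_target_x(enemy: dict, fallback_x: float) -> float:
    """Best-effort aim point: exact x, then window midpoint, then last known x."""
    if enemy.get("x") is not None:
        return float(enemy["x"])  # LOS confirmed, use the real coordinate
    vis = enemy.get("visibility") or {}
    window = vis.get("target_window_x")
    # ASSUMPTION: the window is a [lo, hi] pair; aim at its midpoint.
    if isinstance(window, (list, tuple)) and len(window) == 2:
        return (float(window[0]) + float(window[1])) / 2.0
    if vis.get("last_known_x") is not None:
        return float(vis["last_known_x"])
    return fallback_x
```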

## Turn history is where adaptation starts

Each turn history entry contains public structured data from prior turns in the round, including:

- `weapon`
- `action.requested_weapon`
- `action.angle`
- `action.power`
- `action.buy`
- `action.move_requested`
- `action.move_applied`
- `impact.x`
- `impact.y`
- `impact.radius`
- `damage_dealt`
- `hit_bot_id`
- `was_shield_absorbed`
- `raised_terrain`
- `projectile_path_tail`
- `result_summary`

Use that to build a tiny internal memory model per round.
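
As an illustration, the fields above can be reduced to a compact list of your own shots. This sketch assumes each entry carries a `bot_id` key (as the reference skeleton in this guide does) and that turns without a projectile, such as a shield, have a missing or `null` `impact`; `own_shots` is a hypothetical helper.

```python
def own_shots(turn_history: list[dict], my_bot_id: str) -> list[dict]:
    """Extract angle, power, and impact x for every projectile this bot fired."""
    shots = []
    for entry in turn_history:
        if entry.get("bot_id") != my_bot_id:
            continue
        action = entry.get("action") or {}
        impact = entry.get("impact") or {}
        if impact.get("x") is None:
            continue  # assumed: no projectile landed this turn (e.g. shield)
        shots.append({
            "angle": action.get("angle"),
            "power": action.get("power"),
            "impact_x": impact["x"],
            "damage": entry.get("damage_dealt", 0),
        })
    return shots
```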

## Minimal state you should track

For each round, maintain:

- last known opponent position
- current `target_window_x` when LOS is blocked
- last `N` of your own shots:
  - angle
  - power
  - intended target x
  - observed impact x
- estimated wind sign and magnitude
- whether the opponent appears to prefer shielding, lobbing, or direct fire
- whether the terrain near either tank has become meaningfully worse or better

Reset this memory when the round changes. Recalibrate aggressively when the standard match runner crosses into the second wind phase.
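
One possible shape for that memory is a small dataclass keyed by `(match_id, round_number)`, so that a new round automatically starts from a blank slate. The class and field names below are illustrative, and the window of five shots is an arbitrary choice for this sketch.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RoundMemory:
    last_opponent_x: Optional[float] = None
    target_window_x: Optional[tuple] = None
    wind_estimate: float = 0.0
    opponent_style: str = "unknown"  # e.g. "shield", "lob", "direct"
    # Keep only the last N shots; N = 5 is an arbitrary choice here.
    shots: deque = field(default_factory=lambda: deque(maxlen=5))

    def record_shot(self, angle: float, power: float, target_x: float, impact_x: float) -> None:
        self.shots.append({"angle": angle, "power": power,
                           "target_x": target_x, "impact_x": impact_x})


_MEMORY: dict = {}


def memory_for(match_id: str, round_number: int) -> RoundMemory:
    """Return the memory for this round, creating a fresh one on first use."""
    return _MEMORY.setdefault((match_id, round_number), RoundMemory())
```

Keying by round means no explicit reset call is needed; old rounds can be evicted whenever convenient.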

## Good opening behavior

Do not repeat the same opening shot every round.

Better round openings:

1. If enemy LOS exists, fire a controlled calibration shot near center-mass range, not an all-in max-power guess
2. If enemy LOS is blocked, fire a probe inside `target_window_x` or use a dirt bomb to reshape the ridge
3. If you are under immediate threat and already have a strong read, shield can be correct, but shielding blind on turn 1 is usually too passive

## Ballistics quick reference

You are solving projectile motion. Muzzle velocity is `v = power * ruleset.max_power_ms` (`max_power_ms = 120.0` by default). Ignoring wind and terrain, the flat-ground horizontal range of a shot is:

```
R = v^2 * sin(2 * angle_rad) / gravity
```

On `DEFAULT_RULESET` (`max_power_ms = 120`, `gravity = 9.8`), a full-power 45° shot has a theoretical range of `120^2 / 9.8 ≈ 1470` units, roughly seven times the width of the 200-wide map. **You will almost always use low-to-medium power.** A 90-unit horizontal gap at 45° needs `v ≈ sqrt(90 * 9.8) ≈ 30 m/s`, i.e., `power ≈ 30 / 120 = 0.25`. Getting this scale wrong is the single most common reason bots look random under confirmed LOS.

Starting point when LOS is confirmed:

- `dx = them.x - me.x`
- `angle = 45.0` if `dx >= 0` else `135.0`
- `power ≈ sqrt(|dx| * gravity) / max_power_ms`, clamped to `[0.20, 0.95]`

Because the engine is deterministic, two shells fired from the same position with identical `angle` and `power` land at essentially the same spot unless wind or terrain has changed. After a miss, adjust *one* variable by a small amount and fire again; do not reset to a canned arc.
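
The starting point above can be sketched directly from the range formula. The constants are the default ruleset values quoted in this section; `flat_range` and `opening_shot` are hypothetical helper names, and the model ignores wind and terrain entirely.

```python
import math

GRAVITY = 9.8         # default ruleset value quoted in this guide
MAX_POWER_MS = 120.0  # default ruleset value quoted in this guide


def flat_range(angle_deg: float, power: float) -> float:
    """Signed flat-ground displacement; negative means the shell travels left."""
    v = power * MAX_POWER_MS
    return v * v * math.sin(2.0 * math.radians(angle_deg)) / GRAVITY


def opening_shot(me_x: float, them_x: float) -> tuple[float, float]:
    """First-guess (angle, power) for a confirmed-LOS target."""
    dx = them_x - me_x
    angle = 45.0 if dx >= 0 else 135.0
    power = math.sqrt(abs(dx) * GRAVITY) / MAX_POWER_MS
    return angle, max(0.20, min(0.95, power))
```

For a 90-unit gap this yields `power` of about `0.25`, matching the arithmetic above. Note that the `0.20` floor means very close targets will be overshot at 45°, so a steeper angle may be needed there.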

## Wind inference loop

The simplest useful loop is:

1. Simulate your planned shot with `wind = 0`
2. Compare predicted landing x to actual `impact.x`
3. Update your wind estimate
4. Correct the next shot in the opposite direction

You do not need perfect physics inversion to improve quickly. Even a coarse left/right adjustment beats repeating identical arcs.

Useful heuristics:

- If the shell lands consistently right of target, true wind is likely pushing right or your estimate is too negative
- If it lands consistently short, your issue may be power, angle, or terrain clipping rather than wind alone
- High arcs amplify wind sensitivity; flatter shots reduce wind exposure but increase terrain collision risk
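
The loop can be sketched under a strong simplifying assumption: wind acts as a constant horizontal acceleration over flat ground. The engine's real wind model may differ, so treat `wind_accel` as a fitted nuisance parameter rather than the true hidden value. All function names and the `gain` of `0.5` are choices made for this example.

```python
import math

GRAVITY = 9.8         # default ruleset value quoted in this guide
MAX_POWER_MS = 120.0  # default ruleset value quoted in this guide


def flight_time(angle_deg: float, power: float) -> float:
    """Flat-ground time of flight."""
    v = power * MAX_POWER_MS
    return 2.0 * v * math.sin(math.radians(angle_deg)) / GRAVITY


def predicted_dx(angle_deg: float, power: float, wind_accel: float = 0.0) -> float:
    """Signed horizontal displacement; wind is ASSUMED to be a constant acceleration."""
    v = power * MAX_POWER_MS
    t = flight_time(angle_deg, power)
    return v * math.cos(math.radians(angle_deg)) * t + 0.5 * wind_accel * t * t


def update_wind(estimate: float, angle_deg: float, power: float,
                start_x: float, impact_x: float, gain: float = 0.5) -> float:
    """Move the wind estimate part of the way toward the value that explains the miss."""
    t = flight_time(angle_deg, power)
    if t <= 1e-6:
        return estimate  # degenerate shot carries no wind information
    residual = (impact_x - start_x) - predicted_dx(angle_deg, power, estimate)
    return estimate + gain * (2.0 * residual / (t * t))
```

A gain below `1.0` damps the update, so a single terrain-clipped shot does not wreck the estimate. Feed the result back into `predicted_dx` when planning the next shot.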

## Shot adjustment heuristics

When a shot misses:

- change one variable decisively, not both randomly
- if miss is mostly horizontal, adjust `power` first on flatter shots and `angle` first on high arcs
- if you clipped terrain early, increase angle before increasing power
- if your splash is close, bracket around the previous impact instead of resetting to a canned angle

Good bots should look like they are converging, not guessing from scratch each turn.

Never aim outside `map_bounds_x`. If your internal simulator wants an off-map landing point, clamp the target back into the visible arena and search again.
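
Bracketing and target clamping can be sketched as two small helpers. The example assumes `map_bounds_x` is a `(lo, hi)` pair, which this guide does not spell out, and it brackets on `power` only; `clamp_target` and `bracket_power` are hypothetical names.

```python
def clamp_target(target_x: float, map_bounds_x: tuple) -> float:
    """Pull an aim point back inside the map (bounds ASSUMED to be a (lo, hi) pair)."""
    lo, hi = map_bounds_x
    return max(lo, min(hi, target_x))


def bracket_power(lo: float, hi: float, last_power: float, fell_short: bool) -> tuple:
    """Tighten the [lo, hi] power bracket around the last shot and pick the midpoint."""
    if fell_short:
        lo = max(lo, last_power)  # need more power than last time
    else:
        hi = min(hi, last_power)  # need less power than last time
    return lo, hi, (lo + hi) / 2.0
```

Starting from the documented `[0.20, 0.95]` clamp, each miss halves the remaining interval, which is what makes the bot look like it is converging. Reset the bracket when you move or when wind changes.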

## Movement

Tanks can reposition up to `ruleset.max_move_per_turn` units (default `3.0`) per turn, via the `move` field in your action. Movement resolves **before** the shot, so your shell launches from the new position, and the new position is what your opponent sees (or approximates) on their next turn.

Key facts:

- Positive `move` goes right (`+x`), negative goes left
- Movement is free — no gold, no ammo, no turn cost; it happens alongside your shot
- You cannot walk through or into the opponent; attempts clamp to a minimum separation
- You cannot walk off the map; attempts clamp to `[0, map_width - 1]`
- After moving, your `y` snaps to the terrain height at the new `x`
- `turn_history[].action.move_requested` and `move_applied` tell you what each bot asked for and what the engine actually granted (they differ when you hit the ruleset cap, a wall, or the opponent)

Why this matters:

- Movement is your main defensive tool. When the opponent has converged on your position (two near-hits in a row), a `move: ±3` sideways shuffle often makes their next shot miss entirely — especially because their fire-for-effect assumes you are static
- Offense also benefits: if you keep drifting even one or two units per turn, you force the opponent to re-bracket every time
- Repositioning changes the terrain profile between you and the opponent — a high ridge blocking your arc may be bypassable after a few turns of walking

Rules of thumb:

- If the most recent incoming shot landed within `4.0` units of you, move `±3.0` next turn
- Do not move while you are mid-bracket on the enemy — your own range math assumes your origin `x`
- When you move, expect your next shot to miss by the amount you moved; correct immediately
- Mix movement with dirt bombs to actively shape the battlefield rather than just trading shells
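
These rules of thumb can be combined into one decision function. The sketch below makes a judgment call that this guide does not: when you are mid-bracket, it holds position even if a shot landed close. It also assumes a 200-wide map; `choose_move` and its parameters are hypothetical.

```python
from typing import Optional


def choose_move(me_x: float, incoming_impact_x: Optional[float], mid_bracket: bool,
                map_width: float = 200.0, max_move: float = 3.0,
                danger_radius: float = 4.0) -> float:
    """Return a `move` value: dodge close impacts unless we are mid-bracket."""
    if incoming_impact_x is None or mid_bracket:
        return 0.0  # nothing to dodge, or moving would spoil our own range math
    if abs(incoming_impact_x - me_x) >= danger_radius:
        return 0.0  # last incoming shot was not close enough to worry about
    # Step away from the impact point.
    step = max_move if incoming_impact_x <= me_x else -max_move
    # If the map edge would swallow the move, go the other way instead.
    if not (0.0 <= me_x + step <= map_width - 1.0):
        step = -step
    return step
```

After a non-zero move, remember to shift your own origin `x` by `move_applied` (not `move_requested`) before computing the next shot.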

## Shooting over terrain

When a ridge or hill sits between you and the enemy, change the arc, not just the power.

- High-angle shots (70°–85° when the enemy is to your right, 95°–110° when the enemy is to your left, on the 0–180 scale) spend more time climbing and descending, which lets them clear ridges that a 45° shot would clip
- Trading a flat trajectory for a steep one costs you horizontal reach per unit of power, so compensate by raising `power` by roughly 0.1–0.2
- High arcs amplify wind drift — expect larger horizontal errors and correct aggressively on the next shot
- If LOS is blocked and you cannot see the enemy at all, a high-arc speculative shot into `target_window_x` is usually more informative than another flat shell burying itself into the hill in front of you
- `dirt_bomb` can *raise* terrain between you and the enemy — use it to force the opponent into a high-arc problem of their own while you stay flat

If two flat shots in a row clip terrain short of the ridge (`impact.x` between you and the target with `raised_terrain: true`), stop repeating that arc. Steepen the angle by 20° or more (toward 90°) and commit to a lob.
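
The power cost of a steeper arc follows from the flat-ground range formula, and the apex height gives a rough idea of how much ridge a lob can clear. This is a sketch using the default ruleset constants and ignoring wind and terrain; `power_for_range` and `apex_height` are hypothetical helper names.

```python
import math
from typing import Optional

GRAVITY = 9.8         # default ruleset value quoted in this guide
MAX_POWER_MS = 120.0  # default ruleset value quoted in this guide


def power_for_range(dx: float, angle_deg: float) -> Optional[float]:
    """Power needed to cover |dx| on flat ground at the given angle (may exceed 1.0)."""
    s = abs(math.sin(2.0 * math.radians(angle_deg)))
    if s < 1e-6:
        return None  # vertical or horizontal launch has no flat-ground range
    return math.sqrt(abs(dx) * GRAVITY / s) / MAX_POWER_MS


def apex_height(angle_deg: float, power: float) -> float:
    """Peak height above the launch point."""
    v_up = power * MAX_POWER_MS * math.sin(math.radians(angle_deg))
    return v_up * v_up / (2.0 * GRAVITY)
```

For a 90-unit gap, going from 45° to 75° raises the required power from about `0.25` to `0.35`, and the apex climbs from about 22.5 to about 84 units. Clamp the returned power to `[0.0, 1.0]` before sending it.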

## Weapon strategy

### `standard_shell`
- default probing tool
- good for calibration and cheap pressure

### `heavy_shell`
- best finisher when your line is dialed in
- avoid wasting it before you have a decent wind estimate unless the opponent is trapped or low HP

### `dirt_bomb`
- underrated in S1
- use it to:
  - break line of sight between the tanks
  - bury a low-position enemy deeper
  - create terrain that worsens their firing geometry

### `shield`
- strongest when you have reason to expect a near-hit soon
- weak when used by habit or fear
- remember: shield consumes your turn

### `mirv`
- currently overpriced because it behaves like a single shell in S1
- usually not worth prioritizing

## Economy rules of thumb

- open with standard shells unless you already have a high-confidence shot
- buy heavy shells for future turns, not panic turns
- spend gold on dirt bombs when terrain control improves your next 2-3 turns
- do not overbuy fancy weapons if your bot cannot already land standard shots reliably

## Anti-stalemate rules for your bot

If your bot has produced two near-identical misses in a row, force a policy change:

- larger angle delta
- larger power delta
- switch to dirt bomb
- shield if opponent has clearly converged on you

Never allow the bot to loop the exact same low-information shot unless the previous one nearly connected and conditions have not changed.
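
A simple guard for this rule compares the last two shots and forces a larger change when they are effectively the same miss. The shot records are assumed to be dicts with `impact_x` and `damage` keys that your bot builds itself; the `2.0`-unit tolerance and the `0.08` power delta are illustrative values, not engine constants.

```python
def is_stuck(shots: list, tol_x: float = 2.0) -> bool:
    """True when the last two shots both missed and landed within tol_x of each other."""
    if len(shots) < 2:
        return False
    prev, last = shots[-2], shots[-1]
    both_missed = prev.get("damage", 0) == 0 and last.get("damage", 0) == 0
    same_spot = abs(prev["impact_x"] - last["impact_x"]) <= tol_x
    return both_missed and same_spot


def escalate(angle: float, power: float, fell_short: bool) -> tuple:
    """Apply a deliberately larger power delta to break the loop (delta is arbitrary)."""
    delta = 0.08 if fell_short else -0.08
    return angle, max(0.0, min(1.0, power + delta))
```

When `is_stuck` fires and a power change has already been tried, switching weapon (for example to `dirt_bomb`) is the next escalation step listed above.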

## Reference bot quality bar

Your first competitive bot should be able to:

- respond in under `2s` p99
- keep round-local memory
- estimate wind direction from its own impacts
- adjust aim after misses
- buy heavy shells intentionally
- avoid buying ammo it cannot exploit
- use dirt bomb or shield situationally

If it cannot do those things, it is still a plumbing bot.

## Stronger reference skeleton

```python
from fastapi import FastAPI, Request

app = FastAPI()
ROUND_MEMORY: dict[tuple[str, int], dict] = {}


def memory_for(match_id: str, round_number: int) -> dict:
    key = (match_id, round_number)
    return ROUND_MEMORY.setdefault(key, {
        "wind_estimate": 0.0,
        "last_impacts": [],
    })


@app.post("/")
async def play(req: Request):
    state = await req.json()
    me = next(t for t in state["tanks"] if t["bot_id"] == state["active_bot_id"])
    them = next(t for t in state["tanks"] if t["bot_id"] != state["active_bot_id"])
    mem = memory_for(state["match_id"], state["round_number"])

    history = state.get("turn_history", [])
    my_history = [h for h in history if h.get("bot_id") == state["active_bot_id"]]
    if my_history:
        last = my_history[-1]
        impact = (last.get("impact") or {}).get("x")
        if impact is not None and them.get("x") is not None:
            error = impact - them["x"]
            mem["wind_estimate"] -= max(-1.5, min(1.5, error / 40.0))

    target_x = them["x"] if them.get("x") is not None else (160.0 if me["x"] < 100 else 40.0)
    dx = target_x - me["x"]
    base_angle = 45.0 if dx >= 0 else 135.0
    angle = base_angle - mem["wind_estimate"] * (1 if dx >= 0 else -1)
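    # Flat-ground range formula at 45°/135°, using the default gravity (9.8) and max_power_ms (120).
    power = min(0.95, max(0.20, (abs(dx) * 9.8) ** 0.5 / 120.0))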

    buy = []
    if me["gold"] >= 200 and me["ammo"].get("heavy_shell", 0) < 2:
        buy.append("heavy_shell")

    weapon = "heavy_shell" if me["ammo"].get("heavy_shell", 0) > 0 and abs(dx) < 70 else "standard_shell"

    # Dodge: if the opponent's last shot landed close, shuffle sideways.
    move = 0.0
    their_history = [h for h in history if h.get("bot_id") != state["active_bot_id"]]
    if their_history:
        impact = (their_history[-1].get("impact") or {}).get("x")
        if impact is not None and abs(impact - me["x"]) < 4.0:
            move = 3.0 if impact < me["x"] else -3.0

    return {"weapon": weapon, "angle": angle, "power": power, "buy": buy, "move": move}
```

This is still not elite, but it behaves like a learner instead of a metronome.

## Failure modes to design around

- timeout -> forfeit turn
- malformed JSON -> forfeit turn
- repeating identical misses -> soft self-forfeit through bad policy
- overusing shield -> tempo loss
- overspending on heavy shells -> no tactical flexibility
- ignoring terrain changes -> stale aim model

## Practical target

For S1, your bot should aim to look like this to a viewer:

- turn 1: probe
- turn 2: corrected shot
- turn 3: pressure or terrain control
- later turns: clear adaptation, not repetition

If the viewer cannot tell that the bot is learning from its own results, the bot is not strong enough yet.
