Inputs, Ticks, Encoding, and Signing
January 6, 2026 by Farms
In the last post I sketched out the shape of the state layer - a schema-first ECS-shaped binary buffer designed to skip serialization costs when crossing boundaries. But state is only half the story. The other half is getting player intents into the system in the first place.
The relay is meant to be game-agnostic. It just shuffles bytes around without understanding them. But clients need to encode meaningful actions - button presses, aim directions, RTS commands - into those bytes. And the system needs to know who sent them.
So what actually is an "input"? And how do we get it from a player's gamepad to everyone's simulation?
What is an Input?
An input is a player's intent for a given moment in time. It could be as simple as "W key pressed" or as complex as "move unit 42 to position (100, 200)".
I find it useful to split inputs into two categories: controls and commands.
Controls are real-time inputs sampled every tick. Things like button states, analog stick positions, aim directions. These are inherently "continuous" - the system wants to know the current state of the player's controller every single tick.
Commands are discrete one-shot actions. Think RTS games: select units, issue move orders, cast abilities. These happen at specific moments and might not occur every tick. A tick might have zero commands, or it might have several.
A schema for a game's inputs might look something like:
{
  "controls": [
    { "name": "buttons", "type": "flags8", "options": ["w", "a", "s", "d", "jump", "fire"] },
    { "name": "aim", "type": "vec2" }
  ],
  "commands": [
    { "name": "move", "args": [
      { "name": "unitId", "type": "uint16" },
      { "name": "pos", "type": "vec2" }
    ]}
  ]
}
Sound familiar? It's the same schema-first approach we used for state. Define the shape upfront, generate accessors for each language, everyone agrees on the binary layout.
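To make the flags8 idea concrete, here is a minimal sketch of packing named options into a single byte. The helper names (`packFlags8`, `unpackFlags8`) are hypothetical, and the bit order is an assumption: option i in the schema's "options" list maps to bit i, counting from the least significant bit.

```typescript
// Options for the "buttons" control, in schema order.
const BUTTON_OPTIONS: readonly string[] = ["w", "a", "s", "d", "jump", "fire"];

// Hypothetical helper: pack pressed option names into one flags8 byte.
// Assumes option i maps to bit i (least significant bit first).
function packFlags8(options: readonly string[], pressed: readonly string[]): number {
  if (options.length > 8) throw new Error("flags8 holds at most 8 options");
  let bits = 0;
  for (const name of pressed) {
    const bit = options.indexOf(name);
    if (bit < 0) throw new Error(`unknown option: ${name}`);
    bits |= 1 << bit;
  }
  return bits;
}

// Inverse of packFlags8: list the option names whose bits are set.
function unpackFlags8(options: readonly string[], bits: number): string[] {
  return options.filter((_, i) => (bits & (1 << i)) !== 0);
}
```

With this bit order, W plus jump packs to 0b00010001.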
Encoding Inputs on the Wire
With the state format, we just accepted that allocating a massive fixed slab of bytes up front is a reasonable trade-off for simplicity. Inputs are different: our input format is going to be sent over the network, so every byte counts.
An input message where a player just holds W and nothing else should be tiny. A tick where they issue three RTS commands should be larger. We need a sparse encoding - only transmit fields that are actually set.
My desire to support command-like inputs with a variable number of commands also means any kind of fixed-size encoding is going to be impractical anyway.
The encoding is TLV (Type-Length-Value), where each field is:
[index][length][...length bytes of value...]
The index identifies which field from the schema this is. The length says how many bytes follow. Simple, variable length, and only present if the field is set.
An input with W+jump pressed and an aim direction might encode as:
[0x00][0x01][0b00010001] // index 0 (buttons): 1 byte
[0x01][0x08][...8 bytes...] // index 1 (aim): 8 bytes (vec2)
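That's 13 bytes total: 3 for the buttons field (index, length, 1 byte of value) and 10 for the aim field (index, length, 8 bytes of value). An empty input - no buttons, no aim, no commands - encodes as zero bytes. Nice and compact.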
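A minimal sketch of an encoder and decoder for this layout might look like the following. It assumes a 1-byte index and a 1-byte length, as in the example bytes above, and it assumes vec2 is two little-endian f32 values. The names `TlvField`, `encodeTlv`, `decodeTlv` and `encodeVec2` are made up for illustration.

```typescript
// One field that is actually set: its schema index plus its encoded value.
interface TlvField {
  index: number;     // position of the field in the schema
  value: Uint8Array; // already-encoded value bytes (little-endian)
}

// Encode set fields as [index][length][value...].
// Assumes 1-byte index and 1-byte length.
function encodeTlv(fields: TlvField[]): Uint8Array {
  let size = 0;
  for (const f of fields) {
    if (f.index < 0 || f.index > 0xff || f.value.length > 0xff) {
      throw new Error("field does not fit a 1-byte index/length");
    }
    size += 2 + f.value.length;
  }
  const out = new Uint8Array(size);
  let off = 0;
  for (const f of fields) {
    out[off++] = f.index;
    out[off++] = f.value.length;
    out.set(f.value, off);
    off += f.value.length;
  }
  return out;
}

// Walk the buffer field by field, rejecting truncated input.
function decodeTlv(bytes: Uint8Array): TlvField[] {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const fields: TlvField[] = [];
  let off = 0;
  while (off < bytes.length) {
    if (off + 2 > bytes.length) throw new Error("truncated field header");
    const index = view.getUint8(off);
    const length = view.getUint8(off + 1);
    off += 2;
    if (off + length > bytes.length) throw new Error("truncated field value");
    fields.push({ index, value: bytes.slice(off, off + length) });
    off += length;
  }
  return fields;
}

// Assumption: vec2 is two little-endian f32 values (8 bytes).
function encodeVec2(x: number, y: number): Uint8Array {
  const buf = new Uint8Array(8);
  const view = new DataView(buf.buffer);
  view.setFloat32(0, x, true);
  view.setFloat32(4, y, true);
  return buf;
}
```

Because only set fields are passed in, an empty field list naturally encodes to zero bytes.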
The type system mirrors what we have for state. Primitives like bool, uint8, f32. Flag types like flags8 where each named option maps to a bit. Compound types like vec2 and quat for 3D-ish games. All multi-byte values are little-endian for consistency.
Semantic Hints
There's one thing inputs will need that state doesn't: hints about meaning.
When the network lags and we need to predict what a player will do next, it matters whether a button is "momentary" (held while pressed, likely to continue) or "toggle" (pressed once to flip state, don't predict the next press).
So controls can declare hints:
{ "name": "buttons", "type": "flags8", "hint": "momentary" },
{ "name": "crouch", "type": "bool", "hint": "toggle" },
{ "name": "aim", "type": "vec2", "hint": "axis" }
The prediction system can use these to make smarter guesses. A momentary input that was held last tick probably continues. A toggle input that fired last tick probably doesn't repeat. An axis value probably stays roughly similar.
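As a rough sketch of how a predictor could branch on these hints, assuming it only looks at the previous tick's value: the types and the function name `predictControl` are hypothetical, and so is the assumption that a toggle control is true only on the tick where the press happened.

```typescript
type Hint = "momentary" | "toggle" | "axis";

// Value of one control: flags/number, bool, or a vec2.
type ControlValue = number | boolean | [number, number];

// Guess this tick's value for a control from last tick's value.
function predictControl(hint: Hint, last: ControlValue): ControlValue {
  switch (hint) {
    case "momentary":
      // Held buttons probably stay held.
      return last;
    case "axis":
      // Sticks and aim probably stay roughly where they were.
      return last;
    case "toggle":
      // Assumes true means "pressed this tick"; a press probably doesn't repeat.
      return typeof last === "boolean" ? false : last;
  }
}
```

A smarter predictor could extrapolate axis values from more than one past tick, but repeating the last value is the simplest guess that matches the rules above.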
Who Sent This Input?
It's not very useful if we don't know who sent the input. In our attempt to keep the server-side as simple as possible, we want to avoid building any kind of stateful player authentication / login system.
So we need to know whose input is whose. And we need to prevent players from spoofing each other's inputs. In my earlier post I hand-waved about signing inputs and using public keys as identity. Let's make that concrete.
Each input message includes a cryptographic signature. The player signs the input with their private key, and the signature is appended to the message.
The signature covers a hash of the session ID, target tick, and payload. Using secp256k1 (the same curve as Bitcoin/Ethereum) gives us 65-byte recoverable signatures.
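For illustration, the hash being signed could be computed along these lines. This is a sketch under stated assumptions: SHA-256 as the hash, the session ID as raw bytes, and the target tick as a little-endian uint32. The exact field widths and order are not specified here, and `inputMessageHash` is a hypothetical name. The secp256k1 signing step itself is left out because Node's standard library does not produce recoverable signatures.

```typescript
import { createHash } from "node:crypto";

// Hash that the player would sign.
// Assumptions: SHA-256, sessionId as raw bytes, targetTick as little-endian uint32.
function inputMessageHash(
  sessionId: Uint8Array,
  targetTick: number,
  payload: Uint8Array,
): Uint8Array {
  const tick = new Uint8Array(4);
  new DataView(tick.buffer).setUint32(0, targetTick, true);
  return createHash("sha256")
    .update(sessionId)
    .update(tick)
    .update(payload)
    .digest();
}
```

Binding the session ID and target tick into the hash means a signed input cannot be replayed into a different session or a different tick without invalidating the signature.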
The fun part of this setup is that the player's identity is not transmitted explicitly. Instead, we recover the public key from the signature, and treat that public key as the player's id.
pubKey, err := RecoverPubKey(messageHash, signature)
The mathematics of elliptic curve signatures let us extract the signer's public key from the signature itself. No need for a separate identity field. No need for a central authentication server. The player's 33-byte compressed public key is their identity, and we can derive it from any message they sign.
This is "self-certifying identity". A player generates a keypair, and that keypair is their identity. If they sign a message, we know it came from them. And since nobody else holds their private key, nobody else can impersonate them.
It does mean players need to hold onto their private key if they want a persistent identity. But for quick throwaway .io games that's fine - generate a fresh keypair each session or stick the key in localStorage.
What is a Tick?
A tick is a bundle of all player input messages for a fixed timestep. The simulation advances one tick at a time, and for each tick it needs to know what every player intended to do.
The relay collects inputs from all players, stamps them with a tick number to commit them to the immutable log, and broadcasts the complete set back to everyone. Once a client has received a tick, it has everything it needs to compute the next state.
The tick is the unit of consensus. Everyone agrees: "for tick 47, these were the inputs". From there, deterministic simulation takes over.
On the wire, a tick will be a list of the verbatim InputMessages (to preserve the input signatures), and the list will always include an input message for every joined player (even if that is an empty InputMessage).
By maintaining a stable order for the player inputs within the tick, we get the nice property that players have a stable "index" within the tick, something that we will need to easily map inputs to players.
However, one downside of this decision is that it is going to be very easy to blow past the network MTU size, which effectively puts a cap on the max number of players that can realistically participate in a game session!
Some rough back-of-envelope math suggests that we will probably be fine for a wide range of input types for around 16 players. And like I don't actually have anyone to play the games, sooo probably fine for my purposes :)
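The envelope math can be sketched as a small function. Everything beyond the 65-byte signature is an assumption here: the byte budget of 1472 comes from a typical 1500-byte Ethernet MTU minus 20 bytes of IPv4 header and 8 bytes of UDP header (the transport is not specified above), and `maxPlayersPerPacket` and its parameters are hypothetical.

```typescript
const SIGNATURE_BYTES = 65; // recoverable secp256k1 signature size

// How many signed inputs fit in one packet.
// budgetBytes: usable bytes per packet (assumed, depends on transport).
// payloadBytes: average encoded input size per player.
// tickOverheadBytes: per-tick extras such as a relay signature (assumed).
function maxPlayersPerPacket(
  budgetBytes: number,
  payloadBytes: number,
  tickOverheadBytes = 0,
): number {
  const perInput = SIGNATURE_BYTES + payloadBytes;
  return Math.floor((budgetBytes - tickOverheadBytes) / perInput);
}

// 13-byte payloads, no tick overhead: 1472 / 78 -> 18 players.
const small = maxPlayersPerPacket(1472, 13);
// 24-byte payloads plus a 65-byte relay signature: 1407 / 89 -> 15 players.
const larger = maxPlayersPerPacket(1472, 24, 65);
```

Under these assumptions the signature dominates the per-player cost, and the result lands in the same ballpark as the rough 16-player figure.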
The Signature Chain
Individual input signatures prevent spoofing. But what about the tick log itself?
The relay maintains the canonical ordered log of all ticks. If someone tampered with that log - modified an old input, deleted a tick, reordered things - we'd have no way to detect it with just input signatures.
So the relay itself will also sign each tick, and the signatures form a chain.
Each tick's signature hash includes the previous tick's signature. Tick 0 uses zeros as the "previous" signature. This creates a hash chain anchored at genesis.
tickHash = SHA256(sessionId || tickNum || inputsHash || prevRelaySig)
relaySig = sign(relayKey, tickHash)
The security properties fall out naturally. Modifying any tick invalidates its signature. Deleting a tick breaks the chain. Reordering ticks puts the wrong tick number in the hash. Splicing ticks from another session fails because the session ID is in the hash.
It's not bulletproof - the relay itself could still lie. But it gives us some nice tamper resistance and, more importantly, means clients can verify the integrity of the log they receive, confirming they have a complete, undamaged replica.
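Client-side verification of the chain could be sketched as follows. This assumes SHA-256 with the tick number encoded as a little-endian uint32, and it takes signature verification as a callback (`verifySig`) instead of doing real secp256k1 checks against the relay's key. `SignedTick`, `tickHash` and `verifyChain` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

interface SignedTick {
  tickNum: number;
  inputsHash: Uint8Array; // hash over the tick's input messages
  relaySig: Uint8Array;   // relay's signature over tickHash
}

// Tick 0 uses zeros as its "previous" signature.
const ZERO_SIG = new Uint8Array(65);

// tickHash = SHA256(sessionId || tickNum || inputsHash || prevRelaySig)
// Assumption: tickNum is a little-endian uint32.
function tickHash(
  sessionId: Uint8Array,
  tickNum: number,
  inputsHash: Uint8Array,
  prevRelaySig: Uint8Array,
): Uint8Array {
  const num = new Uint8Array(4);
  new DataView(num.buffer).setUint32(0, tickNum, true);
  return createHash("sha256")
    .update(sessionId)
    .update(num)
    .update(inputsHash)
    .update(prevRelaySig)
    .digest();
}

// Walk the log from genesis. verifySig stands in for real
// secp256k1 verification against the relay's public key.
function verifyChain(
  sessionId: Uint8Array,
  ticks: SignedTick[],
  verifySig: (hash: Uint8Array, sig: Uint8Array) => boolean,
): boolean {
  let prevSig: Uint8Array = ZERO_SIG;
  return ticks.every((t, i) => {
    if (t.tickNum !== i) return false; // deleted or reordered tick
    const hash = tickHash(sessionId, t.tickNum, t.inputsHash, prevSig);
    if (!verifySig(hash, t.relaySig)) return false; // modified tick or broken link
    prevSig = t.relaySig;
    return true;
  });
}
```

Each tick's hash depends on the previous signature, so a single bad tick causes every check from that point on to fail.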
Putting It Together
Sorry, these notes got simultaneously too technical and somehow not technical enough. But here's the gist of the decisions:
- Player samples their controls, maybe queues a command
- Encode the input using a TLV format based on the input schema
- Sign the encoded payload with the player's private key
- Send to the relay with the target tick number
- Relay collects all inputs for the tick, signs the tick
- Relay broadcasts the tick to all clients
- Clients decode, verify, and feed into their simulation
The schema-first approach means the same input definition generates TypeScript code for the client and (eventually) Rust code for the simulation. Everyone agrees on the bytes.