
Determinism

Given the same SimulationManifest (including the same seed), you get the same event sequence whether you:

  • generate a batch dataset, or
  • consume a stream of the same simulation.

This is the foundation for reproducible experiments.

Determinism pipeline: same manifest and seed produce identical batch and stream outputs.


Given:

  • Config C (with Seed S)
  • Generation Function G(C), run in batch mode as G_batch(C) or in stream mode as G_stream(C)

Then:

  • Events_Batch = G_batch(C)
  • Events_Stream = G_stream(C)

Assert: Events_Batch == Events_Stream

This property is useful when you want a model, strategy, or pipeline to see the same conditions across environments.


The HyperSynthReactor initializes its internal random number generators (NumPy RandomState or Python random) using the provided seed at instantiation.
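As a rough illustration of seeding at instantiation, the sketch below uses a hypothetical SeededReactor class, not the real HyperSynthReactor, whose internals are not shown here. It relies only on Python's standard random module; the same pattern would apply to a numpy.random.RandomState(seed) instance.

```python
import random


class SeededReactor:
    """Hypothetical stand-in for a reactor that seeds its RNG at construction."""

    def __init__(self, seed: int) -> None:
        # A private generator instance: no other code in the process can
        # advance it, unlike the shared module-level random state.
        self.rng = random.Random(seed)

    def draw(self, n: int) -> list[float]:
        # Every draw goes through the instance-owned generator.
        return [self.rng.gauss(0.0, 1.0) for _ in range(n)]


a = SeededReactor(seed=42).draw(5)
b = SeededReactor(seed=42).draw(5)
c = SeededReactor(seed=43).draw(5)

assert a == b  # same seed, same sequence
assert a != c  # different seed, different sequence
```

Keeping the generator on the instance, rather than calling random.seed() globally, means unrelated code that draws random numbers cannot shift the simulation's sequence.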

The Batch and Stream drivers both iterate over the same underlying generator (market.stream()).

  • Batch consumes the generator greedily.
  • Stream consumes the generator lazily, inserting asyncio.sleep() calls between yields.

Because the underlying math (GBM steps, orderbook logic) happens inside market.stream() before the driver decides to sleep or append, the sequence of internal states remains identical.
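The greedy and lazy consumption patterns described above can be sketched as follows. Everything here is a simplified assumption: price_stream is a hypothetical stand-in for market.stream() that produces a seeded GBM path, and run_batch and run_stream are illustrative drivers, not the project's actual ones.

```python
import asyncio
import math
import random
from typing import AsyncIterator, Iterator

Event = tuple[int, float]  # (timestamp_ms, price)


def price_stream(seed: int, steps: int, s0: float = 100.0) -> Iterator[Event]:
    """Hypothetical stand-in for market.stream(): a seeded GBM price path."""
    rng = random.Random(seed)
    mu, sigma, dt = 0.05, 0.2, 1.0 / 252.0
    price = s0
    for i in range(steps):
        z = rng.gauss(0.0, 1.0)
        # All state changes happen here, before any driver sees the event.
        price *= math.exp((mu - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * z)
        yield i * 1000, price


def run_batch(seed: int, steps: int) -> list[Event]:
    # Greedy: drain the generator in one go.
    return list(price_stream(seed, steps))


async def run_stream(seed: int, steps: int, delay_s: float = 0.001) -> AsyncIterator[Event]:
    # Lazy: hand out one event at a time and pause between yields.
    for event in price_stream(seed, steps):
        yield event
        await asyncio.sleep(delay_s)


async def collect_stream(seed: int, steps: int) -> list[Event]:
    return [event async for event in run_stream(seed, steps)]


batch_events = run_batch(seed=7, steps=50)
stream_events = asyncio.run(collect_stream(seed=7, steps=50))

assert batch_events == stream_events
```

The sleep only changes when an event is handed to the consumer. It never touches the random number generator or the price state, so both drivers observe the same sequence.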

We prefer integer-based timestamps (timestamp_ms) and keep floating-point operations consistent within a single Python process. This avoids the non-deterministic behavior often seen in multi-threaded or distributed simulations.
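A small illustration of why integer timestamps and a fixed order of operations matter: floating-point addition is not associative, so the same values summed in a different grouping (as can happen when work is split across threads or machines) may give a different result. The values below are arbitrary examples.

```python
# Float addition is not associative: the grouping changes the result.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
assert left != right

# Integer milliseconds are exact regardless of grouping.
left_ms = (100 + 200) + 300
right_ms = 100 + (200 + 300)
assert left_ms == right_ms == 600
```

Within a single process the operations always run in the same order, so the float results are repeatable; integer timestamps remove the concern entirely for event ordering and equality checks.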


The guarantee is enforced by the test suite in tests/test_determinism.py.

async def test_batch_stream_parity():
    # 1. Generate Batch
    batch_file = await unified_driver(config, mode="batch", ...)
    batch_events = read_parquet(batch_file)
    # 2. Generate Stream
    stream_gen = await unified_driver(config, mode="stream", ...)
    stream_events = [json.loads(e) async for e in stream_gen]
    # 3. Assert Equality
    assert len(batch_events) == len(stream_events)
    for b, s in zip(batch_events, stream_events):
        assert b == s

This test is part of the CI/CD pipeline and must pass for any release.