Architecture
The Aleatoric Engine System Architecture
Complete Architecture Diagram
Full component graph (Mermaid)
graph TB
    subgraph "Client Layer"
        USER[User/Client Application]
        CLI[CLI Interface]
        SDK[Python SDK]
    end

    subgraph "API Layer (FastAPI)"
        API[main.py - FastAPI App]
        SCHEMAS[schemas.py - Pydantic Models]
        SSE[sse.py - Server-Sent Events]
        API --> SCHEMAS
        API --> SSE
    end

    subgraph "Driver Layer"
        DRIVER[drivers.py - Unified Driver]
        DRIVER_BATCH[Batch Mode]
        DRIVER_STREAM[Stream Mode]
        DRIVER --> DRIVER_BATCH
        DRIVER --> DRIVER_STREAM
    end

    subgraph "Generation Layer (Core)"
        HSM[single_asset.py - HyperSynthReactor]
        SF[feed.py - SyntheticTelemetryUplink]
        EF[event_feed.py - EventMarketFeed]
        UM[unified_market.py - MarketOrchestrationCore]
        RUNNER[runner.py - Batch Runner]
        COMPONENTS[components.py - Market Components]

        HSM --> COMPONENTS
        SF --> HSM
        EF --> HSM
        UM --> HSM
        RUNNER --> HSM
    end

    subgraph "Core Math & Infrastructure"
        MATH[math.py - Stochastic Processes]
        INFRA[infrastructure.py - Chronometer, Latency]
        CONFIG[config.py - SimulationManifest]

        GBM[GeometricBrownianMotion]
        MJD[MertonJumpDiffusion]
        OU[OrnsteinUhlenbeck]

        SIMCLOCK[Chronometer]
        LATENCY[LatencyModel]

        MATH --> GBM
        MATH --> MJD
        MATH --> OU

        INFRA --> SIMCLOCK
        INFRA --> LATENCY
    end

    subgraph "Venue Layer (KEY DIFFERENTIATOR)"
        VENUE_BASE[base.py - VenueProtocol Interface]
        VENUE_CONFIG[config.py - Venue Configurations]

        BINANCE[binance.py - BinanceFundingModel]
        HYPERLIQUID[hyperliquid.py - HyperLiquidFundingModel]
        OKX[okx.py - OKXFundingModel]
        BYBIT[bybit.py - BybitFundingModel]

        BINANCE_FEED[binance/feed.py - Binance Format]
        HL_FEED[hyperliquid/feed.py - HyperLiquid Format]

        VENUE_BASE --> BINANCE
        VENUE_BASE --> HYPERLIQUID
        VENUE_BASE --> OKX
        VENUE_BASE --> BYBIT

        BINANCE --> BINANCE_FEED
        HYPERLIQUID --> HL_FEED
    end

    subgraph "Processing Layer"
        NORMALIZER[normalizer.py - CanonicalizationEngine]
        CACHE[ArtifactCacheController - LZ4 Compression]
        LOADER[hyperliquid_loader.py - Historical Data]

        NORMALIZER --> CACHE
        NORMALIZER --> LOADER
    end

    subgraph "Presets & Configuration"
        PRESETS[presets.py - Exchange Presets]
        BINANCE_PRESET[BinancePreset]
        HL_PRESET[HyperLiquidPreset]
        OKX_PRESET[OKXPreset]
        BYBIT_PRESET[BybitPreset]

        PRESETS --> BINANCE_PRESET
        PRESETS --> HL_PRESET
        PRESETS --> OKX_PRESET
        PRESETS --> BYBIT_PRESET
    end

    subgraph "Utility Layer"
        CONV[conversions.py - Unit Conversions]
        LOGGER[logger.py - Logging]
        ENUMS[enums.py - Enumerations]
        STUBS[stubs.py - Type Stubs]
    end

    subgraph "Data Output"
        PARQUET[Parquet Files]
        LZ4_CACHE[LZ4 Compressed Cache]
        SSE_STREAM[SSE Event Stream]
    end

    %% Client Connections
    USER --> API
    USER --> SDK
    CLI --> SDK
    SDK --> HSM
    SDK --> PRESETS

    %% API Flow
    API --> DRIVER

    %% Driver to Generator
    DRIVER --> HSM
    DRIVER_BATCH --> PARQUET
    DRIVER_STREAM --> SSE_STREAM

    %% Generator Dependencies
    HSM --> MATH
    HSM --> INFRA
    HSM --> CONFIG
    HSM --> CONV
    HSM --> VENUE_CONFIG

    %% Venue Integration
    HSM --> BINANCE
    HSM --> HYPERLIQUID
    HSM --> OKX
    HSM --> BYBIT

    %% Presets to Config
    PRESETS --> CONFIG
    PRESETS --> VENUE_CONFIG

    %% Processing Flow
    HSM --> NORMALIZER
    NORMALIZER --> LZ4_CACHE

    %% Logging
    HSM --> LOGGER
    API --> LOGGER
    NORMALIZER --> LOGGER

    style VENUE_BASE fill:#ff6b6b
    style BINANCE fill:#ff6b6b
    style HYPERLIQUID fill:#ff6b6b
    style OKX fill:#ff6b6b
    style BYBIT fill:#ff6b6b
    style VENUE_CONFIG fill:#ff6b6b

Component Details
1. Client Layer
- Entry Points: Direct Python SDK usage, CLI commands, HTTP API clients
- Use Cases: ML training, backtesting, infrastructure testing
2. API Layer (FastAPI)
- main.py: HTTP endpoints (/data/generate, /stream/live)
- schemas.py: Request/response validation (Pydantic)
- sse.py: Server-Sent Events for real-time streaming
- Authentication: API key-based security
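To make the streaming path more concrete, here is a minimal sketch of how one event could be framed for Server-Sent Events. The wire format (`event:`, `id:`, and `data:` fields terminated by a blank line) comes from the SSE specification; the `format_sse` helper and the payload fields are hypothetical and are not taken from sse.py.

```python
import json
from typing import Optional


def format_sse(data: dict, event: Optional[str] = None,
               event_id: Optional[int] = None) -> str:
    """Serialize one payload as a Server-Sent Events frame (hypothetical helper)."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    if event_id is not None:
        lines.append(f"id: {event_id}")
    # Compact JSON contains no raw newlines, so it fits on a single data: line.
    lines.append("data: " + json.dumps(data, separators=(",", ":")))
    # A blank line terminates the frame.
    return "\n".join(lines) + "\n\n"


frame = format_sse({"price": 100.5, "side": "buy"}, event="trade", event_id=7)
```

Sending an `id:` with each frame lets a reconnecting client report the last event it saw, which fits a generator that already assigns monotonic sequence numbers.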
3. Driver Layer
- Unified Driver: Single entry point for batch/stream modes
- Batch Mode: Generates Parquet files (optimized throughput)
- Stream Mode: Async SSE with real-time delays (realistic simulation)
- Determinism: Same seed = identical output across modes
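The determinism guarantee can be illustrated with a small sketch. It assumes both modes consume one generator that owns its random state; every name here (`generate_events`, `run_batch`, `run_stream`) is hypothetical and does not mirror drivers.py.

```python
import random
from typing import Iterator, List


def generate_events(seed: int, n: int) -> Iterator[dict]:
    """Yield n synthetic price events from a generator-local RNG."""
    rng = random.Random(seed)  # local RNG, so no shared global state
    price = 100.0
    for i in range(n):
        price *= 1.0 + rng.gauss(0.0, 0.001)
        yield {"seq": i, "price": price}


def run_batch(seed: int, n: int) -> List[dict]:
    """Batch mode: materialize all events at once."""
    return list(generate_events(seed, n))


def run_stream(seed: int, n: int) -> Iterator[dict]:
    """Stream mode: emit events one at a time.

    A real driver would wait between events; the delay is omitted here
    because it must not influence the generated values.
    """
    for event in generate_events(seed, n):
        yield event


batch = run_batch(seed=42, n=5)
stream = list(run_stream(seed=42, n=5))
```

The design point is that pacing lives in the driver and randomness lives in the generator, so adding real-time delays cannot change the event values.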
4. Generation Layer
- HyperSynthReactor: Core L2 orderbook + trade generator
  - Price process: Merton Jump Diffusion (GBM + Poisson jumps)
  - Orderbook: Dynamic depth, spread elasticity, adverse selection
  - Trades: Poisson arrival, power-law sizing, clustering
  - Microstructure: Burst mode, staleness, queue dynamics
- SyntheticTelemetryUplink: Spot/Perp/Funding triangular model
- EventMarketFeed: Multi-frequency event streams
- MarketOrchestrationCore: Consolidated market generator
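The price process named above, Merton jump diffusion, can be sketched with the standard library alone. This is a textbook discretization, not the HyperSynthReactor implementation; the function names and every parameter value are assumptions chosen for illustration.

```python
import math
import random
from typing import List


def poisson(rng: random.Random, lam: float) -> int:
    """Draw a Poisson(lam) count with Knuth's method (suitable for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def merton_path(s0: float, mu: float, sigma: float, jump_rate: float,
                jump_mean: float, jump_std: float, dt: float,
                steps: int, seed: int) -> List[float]:
    """Simulate a price path: GBM diffusion plus Poisson-timed lognormal jumps."""
    rng = random.Random(seed)
    # Expected relative jump size, used to compensate the drift.
    kappa = math.exp(jump_mean + 0.5 * jump_std ** 2) - 1.0
    drift = (mu - 0.5 * sigma ** 2 - jump_rate * kappa) * dt
    prices = [s0]
    for _ in range(steps):
        diffusion = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        n_jumps = poisson(rng, jump_rate * dt)
        jumps = sum(rng.gauss(jump_mean, jump_std) for _ in range(n_jumps))
        prices.append(prices[-1] * math.exp(drift + diffusion + jumps))
    return prices


# Illustrative parameters only: time is measured in years, one step per day.
path = merton_path(s0=100.0, mu=0.05, sigma=0.6, jump_rate=20.0,
                   jump_mean=-0.01, jump_std=0.03, dt=1 / 365,
                   steps=250, seed=7)
```

Setting `jump_rate=0` reduces the path to plain geometric Brownian motion, which is a convenient sanity check when tuning the jump parameters.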
5. Core Math & Infrastructure
- math.py:
  - GeometricBrownianMotion: Standard diffusion
  - MertonJumpDiffusion: Fat-tailed returns
  - OrnsteinUhlenbeck: Mean-reverting processes (funding rates)
- infrastructure.py:
  - Chronometer: Deterministic time progression
  - LatencyModel: Network/processing delays (fixed, jittery)
- config.py:
  - SimulationManifest: 40+ parameters for full reproducibility
  - Hash-based caching, JSON serialization
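As an illustration of the mean-reverting process listed above, the following sketch simulates an Ornstein-Uhlenbeck path with the exact one-step transition. The `ou_path` function and its parameter values are assumptions and are not taken from math.py.

```python
import math
import random
from typing import List


def ou_path(x0: float, mean: float, speed: float, vol: float,
            dt: float, steps: int, seed: int) -> List[float]:
    """Simulate dX = speed * (mean - X) dt + vol dW with the exact transition."""
    rng = random.Random(seed)
    decay = math.exp(-speed * dt)
    # Standard deviation of one exact step of length dt.
    step_std = vol * math.sqrt((1.0 - decay ** 2) / (2.0 * speed))
    path = [x0]
    for _ in range(steps):
        shock = step_std * rng.gauss(0.0, 1.0)
        path.append(mean + (path[-1] - mean) * decay + shock)
    return path


# Illustrative values: a funding rate starting at 0.05% and reverting to 0.01%.
rates = ou_path(x0=0.0005, mean=0.0001, speed=3.0, vol=0.0002,
                dt=1 / 24, steps=48, seed=11)
```

With `vol=0` the path decays geometrically toward `mean`, which makes the reversion speed easy to inspect in isolation.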
6. Venue Layer ⭐ KEY DIFFERENTIATOR
Base Interface:
- VenueProtocol: Abstract class for venue-specific feeds
- BookQuote, TradeTick: Generic containers
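One way to picture the base interface is an abstract class with generic containers, as sketched below. Only the names `VenueProtocol`, `BookQuote`, and `TradeTick` come from this page; the fields, method names, and the `ExampleVenue` class are assumptions and do not reflect base.py or any real exchange schema.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class BookQuote:
    # Field names are assumptions made for this sketch.
    timestamp_ns: int
    bid_price: float
    bid_size: float
    ask_price: float
    ask_size: float


@dataclass(frozen=True)
class TradeTick:
    timestamp_ns: int
    price: float
    size: float
    side: str  # "buy" or "sell"


class VenueProtocol(ABC):
    """Hypothetical interface: each venue renders generic events in its own format."""

    @abstractmethod
    def format_quote(self, quote: BookQuote) -> dict:
        ...

    @abstractmethod
    def format_trade(self, trade: TradeTick) -> dict:
        ...


class ExampleVenue(VenueProtocol):
    """Toy venue that only exercises the interface."""

    def format_quote(self, quote: BookQuote) -> dict:
        return {"type": "quote", "ts": quote.timestamp_ns,
                "bid": [quote.bid_price, quote.bid_size],
                "ask": [quote.ask_price, quote.ask_size]}

    def format_trade(self, trade: TradeTick) -> dict:
        return {"type": "trade", "ts": trade.timestamp_ns,
                "px": trade.price, "sz": trade.size, "side": trade.side}


message = ExampleVenue().format_trade(TradeTick(1, 100.0, 0.5, "buy"))
```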
Funding Models (Exchange-specific):
- Binance: 8h intervals, interest rate (0.01%) + premium, ±0.75/2% caps
- HyperLiquid: 1h intervals, velocity coefficients (1.0-2.5x), ±4% cap
- OKX: 8h intervals, threshold damping, ±0.5/0.75% caps
- Bybit: 8h intervals, insurance fund effects, ±1/2.5% caps
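The common shape behind these models, a premium plus an interest component clamped to a venue cap, can be sketched as follows. This is a deliberate simplification: it omits damping, velocity coefficients, and insurance fund effects, and it is not any exchange's actual formula. The 0.01% interest rate comes from the Binance entry above; the 0.75% default cap is one reading of the listed caps and is an assumption, as are the function names.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def simple_funding_rate(premium: float, interest_rate: float = 0.0001,
                        cap: float = 0.0075) -> float:
    """Simplified funding rate: premium plus interest, clamped to +/- cap.

    All rates are decimals per funding interval (0.0001 == 0.01%).
    """
    return clamp(premium + interest_rate, -cap, cap)


normal = simple_funding_rate(premium=0.0005)        # inside the cap
capped = simple_funding_rate(premium=0.05)          # hits the upper cap
hourly = simple_funding_rate(premium=-0.10, interest_rate=0.0, cap=0.04)
```

Keeping the cap and interest rate as parameters lets one function stand in for several venues in tests, with the venue-specific terms layered on top.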
Venue Feeds:
- Format converters to exact exchange JSON schemas
- Binance WebSocket format, HyperLiquid L2 format
7. Processing Layer
- CanonicalizationEngine:
  - Converts all sources to canonical schema
  - NormalizedBookEvent, NormalizedTradeEvent
  - Monotonic IDs, nanosecond timestamps
  - Venue metadata preservation
- ArtifactCacheController:
  - LZ4 compression (15-20x reduction)
  - MD5-based cache keys
  - Metadata sidecars (JSON)
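A minimal sketch of MD5-based cache keys with a JSON sidecar might look like this. It assumes the key is derived from a canonical JSON dump of the configuration; the helper names and sidecar fields are hypothetical, and the LZ4 compression step is left out to keep the example within the standard library.

```python
import hashlib
import json


def cache_key(config: dict) -> str:
    """Hash a configuration into a stable 32-character hex key."""
    # sort_keys makes the key independent of dict insertion order.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def sidecar_metadata(config: dict, row_count: int) -> str:
    """Build the JSON text that would sit next to a cached artifact."""
    payload = {
        "cache_key": cache_key(config),
        "row_count": row_count,
        "config": config,
    }
    return json.dumps(payload, sort_keys=True, indent=2)


config = {"symbol": "BTC", "seed": 42, "duration_s": 60}
key = cache_key(config)
sidecar = sidecar_metadata(config, row_count=1000)
```

MD5 is used here only as a fast fingerprint for lookups, not for security. Canonicalizing the JSON first is what makes two equal configurations map to the same artifact.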
8. Presets & Configuration
7 Exchange Profiles:
- binance_spot_btc: 100ms updates, tight spreads
- binance_spot_eth: High activity, moderate spreads
- binance_futures_btc: Leverage volatility, 8h funding
- hyperliquid_perp_sol: 200ms updates, 1h funding, on-chain
- hyperliquid_perp_btc: Velocity coefficients
- okx_spot_btc: Competitive spreads
- bybit_futures_btc: 50ms updates, extreme bursts
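A preset registry of this kind can be sketched as a dictionary of named, immutable profiles. The preset names and the interval figures are taken from the list above; `PresetSketch`, `lookup_preset`, and the field names are hypothetical stand-ins, since the real entry point (`get_preset`, which yields a `SimulationManifest`) is not shown on this page. Only four of the seven profiles are included, and fields the list does not specify are left as `None`.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PresetSketch:
    """Illustrative stand-in for a preset; not the real SimulationManifest."""
    venue: str
    symbol: str
    update_interval_ms: Optional[int] = None
    funding_interval_hours: Optional[int] = None


_PRESETS: Dict[str, PresetSketch] = {
    "binance_spot_btc": PresetSketch("binance", "BTC", update_interval_ms=100),
    "binance_futures_btc": PresetSketch("binance", "BTC",
                                        funding_interval_hours=8),
    "hyperliquid_perp_sol": PresetSketch("hyperliquid", "SOL",
                                         update_interval_ms=200,
                                         funding_interval_hours=1),
    "bybit_futures_btc": PresetSketch("bybit", "BTC", update_interval_ms=50),
}


def lookup_preset(name: str) -> PresetSketch:
    """Return the named preset or fail with the list of known names."""
    try:
        return _PRESETS[name]
    except KeyError:
        known = ", ".join(sorted(_PRESETS))
        raise KeyError(f"unknown preset {name!r}; known presets: {known}") from None


sol = lookup_preset("hyperliquid_perp_sol")
```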
9. Utility Layer
- conversions.py: BPS ↔ decimal ↔ percentage points, quote ↔ base
- logger.py: Structured logging
- enums.py: TradeSide, etc.
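The unit conversions are simple but easy to get wrong by a factor of 100, so a short sketch may help. The function names are hypothetical and are not taken from conversions.py; the arithmetic itself is standard (1 basis point = 0.0001 = 0.01%).

```python
def bps_to_decimal(bps: float) -> float:
    """25 bps -> 0.0025."""
    return bps / 10_000.0


def decimal_to_bps(value: float) -> float:
    """0.0025 -> 25 bps."""
    return value * 10_000.0


def decimal_to_percent(value: float) -> float:
    """0.0025 -> 0.25 (percent)."""
    return value * 100.0


def percent_to_decimal(percent: float) -> float:
    """0.25 (percent) -> 0.0025."""
    return percent / 100.0


def quote_to_base(quote_amount: float, price: float) -> float:
    """Convert a quote-currency notional to base units at the given price."""
    return quote_amount / price


def base_to_quote(base_amount: float, price: float) -> float:
    """Convert base units to a quote-currency notional at the given price."""
    return base_amount * price


spread = bps_to_decimal(25.0)
size_btc = quote_to_base(50_000.0, 25_000.0)
```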
10. Data Flow
Batch Generation:
User → API → Driver (batch) → HyperSynthReactor → [Venue Model] → Events → Normalizer → Parquet + LZ4 Cache
Stream Generation:
User → API → Driver (stream) → HyperSynthReactor → [Venue Model] → Events → SSE → Client
Preset Usage:
get_preset("binance_futures_btc") → SimulationManifest → HyperSynthReactor → BinanceFundingModel → Events
Technology Stack
Core Dependencies
- pandas: Data manipulation (legacy compatibility)
- polars: High-performance DataFrames
- numpy: Numerical computations
- lz4: Fast compression
- fastapi: HTTP API framework
- uvicorn: ASGI server
Dev Dependencies
- pytest: Testing framework
- pytest-cov: Coverage reporting
- black: Code formatting
- isort: Import sorting
- flake8: Linting
- mypy: Type checking
Python Version
- Minimum: 3.9+
- Target: 3.9-3.12