Roadmap

The project's current target capability is live trading: the system can ingest live market data, run a strategy, execute trades, and let the operator observe and intervene via the frontend — all running on the production k3s cluster without manual process management.

How agents use this file

Milestones are ordered. Do not treat a later milestone as the priority if an earlier one is incomplete.

When asked what to work on next, identify the earliest incomplete milestone and suggest concrete work that closes it. When finishing a task that may satisfy a milestone criterion, check the relevant milestone and update its status if the criteria are now met. Log anything newly discovered as a gap in ISSUES.md.
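The ordering rule above can be sketched as a small selection function. This is a minimal illustration, not part of the project: the `Milestone` type, the `next_milestone` function, and the `complete` flag are hypothetical names, and the statuses simply mirror the ones recorded in this file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Milestone:
    key: str        # e.g. "M1"
    title: str
    complete: bool  # hypothetical flag derived from the Status line


def next_milestone(milestones):
    """Return the earliest incomplete milestone, or None if all are complete."""
    for milestone in milestones:  # list order is priority order
        if not milestone.complete:
            return milestone
    return None


roadmap = [
    Milestone("M1", "Stable Market Data Ingestion", complete=False),
    Milestone("M2", "Strategy Execution in Production", complete=False),
    Milestone("M3", "Live Trade Execution", complete=False),
    Milestone("M4", "Operator Confidence Loop", complete=False),
    Milestone("M5", "Live Trading", complete=False),
]

print(next_milestone(roadmap).key)  # M1
```

A later milestone is only ever returned once every earlier one is marked complete, which is the behavior this section asks agents to follow.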


M1 · Stable Market Data Ingestion

Status: 🔲 In progress

Ready when:
- All active ingestion paths write to QuestDB via ILP/HTTP; no pgwire ingestion paths remain in Ingester or any DataSource implementation
- SetSubscribedInstrumentsTask and StartDataFlowTask complete successfully on a cold cluster start in production without operator intervention
- Feed disconnects trigger automatic reconnect via RecoveryManager.recoverable() without requiring a pod restart
- backend-recorder readiness is reported through DataBridge to backend-server and visible in Grafana

Currently blocked by:
- pgwire is still used in some ingestion paths; ILP migration is incomplete

Why this matters:
- Everything downstream (strategy execution, trade execution, historical backfill) depends on reliable, low-latency market data reaching QuestDB. Pgwire is a reliability risk under backpressure and must be eliminated before the pipeline can be trusted in production.
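For orientation, an ILP row is a single text line: table name, optional symbol (tag) columns, field columns, and a timestamp. The sketch below only formats such a line; it does not send it, and the transport is left to whatever client the ingestion path uses. The table name `ticks`, the column names, and the helper functions are hypothetical, and the escaping is simplified. Python is used for brevity; the project's Ingester would more likely rely on a client library than hand-formatting.

```python
def _escape_table(name: str) -> str:
    # Simplified: table names escape commas and spaces.
    return name.replace(",", "\\,").replace(" ", "\\ ")


def _escape_key(text: str) -> str:
    # Simplified: column names and symbol values escape commas, equals signs, and spaces.
    return text.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def _format_field(value) -> str:
    # bool must be checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integer fields carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # string fields are double-quoted


def ilp_line(table, symbols, fields, timestamp_ns):
    """Format one line-protocol row (hypothetical helper, no validation beyond fields)."""
    if not fields:
        raise ValueError("at least one field column is required")
    head = ",".join(
        [_escape_table(table)]
        + [f"{_escape_key(k)}={_escape_key(v)}" for k, v in symbols.items()]
    )
    body = ",".join(f"{_escape_key(k)}={_format_field(v)}" for k, v in fields.items())
    return f"{head} {body} {timestamp_ns}"


line = ilp_line(
    "ticks",
    {"instrument": "EURUSD"},
    {"bid": 1.0842, "size": 100},
    1700000000000000000,
)
print(line)  # ticks,instrument=EURUSD bid=1.0842,size=100i 1700000000000000000
```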

Agent actions:
- When working in Ingester, any DataSource, or DataRecorder, verify whether pgwire is still in use and flag any remaining paths
- When working in StartDataFlowTask or RecoveryManager, verify the reconnect and cold-start behavior and report whether the readiness criteria are met
- If all criteria are met, mark this milestone complete and assess M2
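When verifying reconnect behavior, it can help to have the expected shape in mind. The signature of RecoveryManager.recoverable() is not shown in this file, so the following is a generic retry-with-exponential-backoff sketch under assumed names (`reconnect_with_backoff`, `connect`, and all parameters are hypothetical), not the project's implementation.

```python
import time


def reconnect_with_backoff(connect, max_attempts=5, base_delay=0.5,
                           max_delay=30.0, sleep=time.sleep):
    """Call connect() until it succeeds, doubling the wait after each failure.

    Re-raises the last ConnectionError once max_attempts is exhausted.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == max_attempts:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay)  # cap the backoff


# Usage: a fake feed that fails twice, then connects.
attempts = {"count": 0}
waits = []


def flaky_connect():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("feed unavailable")
    return "connected"


result = reconnect_with_backoff(flaky_connect, sleep=waits.append)
print(result, waits)  # connected [0.5, 1.0]
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a production version would also want jitter and a signal to the readiness probe while reconnecting.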


M2 · Strategy Execution in Production

Status: 🔲 Not started

Ready when:
- At least one strategy is registered in gradle.prod.properties and runs end-to-end through the Processor → StrategyExecutor → StrategyExecution path on live data in the production cluster
- Strategy signals and output are forwarded over DataBridge and visible in frontend-app in real time
- A strategy can be added, started, and stopped via the datastore without redeploying backend-app
- StartDataFlowTask completes through to StartDataBridgeTask successfully on the production cluster

Currently blocked by:
- M1 must be complete first
- Unknown: assess by checking whether any strategy is registered in gradle.prod.properties and whether the full startup sequence reaches StartDataBridgeTask on prod

Why this matters:
- The ingestion pipeline has no observable value until a strategy is consuming it. This milestone proves the processor path works end-to-end in production, not just in tests.

Agent actions:
- When working in Processor, StrategyExecutor, or StrategyExecution, check whether these are running against live data in production and report status
- When working in DataBridgeLauncher, check whether strategy signals and output are being forwarded and received by backend-server
- If all criteria are met, mark this milestone complete and assess M3


M3 · Live Trade Execution

Status: 🔲 Not started

Ready when:
- TradeExecutor can submit an order to a real broker and receive a confirmation, rejection, or partial-fill outcome without operator intervention
- Rejected and partially-filled orders are handled by TradeExecutor and forwarded through DataBridge; the relevant DTOs exist in backend-data-bridge
- Trade events and open positions are persisted through OrdersRepository and appear in frontend-app within one SSE cycle
- TradeExecutor is wired into the backend-app startup sequence as a StartupTask or equivalent, not started manually

Currently blocked by:
- M2 must be complete first
- Broker integration is not yet wired into the startup sequence
- Rejection and partial-fill paths do not yet exist in DataBridge DTOs

Why this matters:
- All upstream work (ingestion, processor, strategy) is plumbing until an order can actually be placed. This is the milestone where the platform becomes a trading system rather than a data pipeline.

Agent actions:
- When working in TradeExecutor, backend-broker, or DataBridge DTOs, check whether the trade round-trip criteria are met and report status
- When working in backend-data-bridge, check whether rejection and partial-fill events are represented in the DTO surface
- If all criteria are met, mark this milestone complete and assess M4
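To make the missing DTO surface concrete, the three outcomes named in the criteria could be modeled roughly as below. This is a hypothetical shape for discussion only: the real DTOs belong in backend-data-bridge and are not shown in this file, and every name here (`OrderStatus`, `OrderOutcome`, `classify`) is an assumption. The sketch also assumes that "confirmation" means the full requested quantity was filled.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OrderStatus(Enum):
    CONFIRMED = "confirmed"                # assumed: fully filled
    PARTIALLY_FILLED = "partially_filled"
    REJECTED = "rejected"


@dataclass(frozen=True)
class OrderOutcome:
    order_id: str
    status: OrderStatus
    requested_qty: int
    filled_qty: int = 0
    reject_reason: Optional[str] = None


def classify(order_id, requested_qty, filled_qty=0, reject_reason=None):
    """Map a broker response onto one of the three outcomes."""
    if reject_reason is not None:
        return OrderOutcome(order_id, OrderStatus.REJECTED, requested_qty,
                            0, reject_reason)
    if filled_qty == requested_qty:
        return OrderOutcome(order_id, OrderStatus.CONFIRMED, requested_qty,
                            filled_qty)
    if 0 < filled_qty < requested_qty:
        return OrderOutcome(order_id, OrderStatus.PARTIALLY_FILLED,
                            requested_qty, filled_qty)
    raise ValueError(f"inconsistent fill: {filled_qty} of {requested_qty}")


print(classify("o-1", 100, filled_qty=40).status.value)  # partially_filled
```

Keeping rejection and partial fill as explicit statuses, rather than inferring them from quantities downstream, lets the frontend render each case without re-deriving broker semantics.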


M4 · Operator Confidence Loop

Status: 🔲 Not started

Ready when:
- The operator can observe live strategy state, open positions, trade events, and datasource health from frontend-app without SSHing into the cluster
- Grafana dashboards cover ingestion lag, processor throughput, and DataBridge connection health, not just pod-level metrics
- A failed pod restart (any of backend-app, backend-server, backend-recorder) restores trading state automatically without operator intervention, verified by BridgeResilienceK8sIntegrationTest or equivalent
- A runbook exists documenting the response to the most likely failure modes: broker disconnect, QuestDB crash, pod eviction, and DataBridge reconnect failure

Currently blocked by:
- M3 must be complete first
- Grafana dashboard coverage is partial; ingestion lag and processor throughput are not yet tracked
- The recovery event round-trip (DataBridge ↔ backend-app) has not been validated end-to-end in production
- No runbook exists yet

Why this matters:
- Live trading without observability is unsafe. This milestone ensures the operator can detect problems and that the system recovers from the most common failure modes without manual action.

Agent actions:
- When working on Grafana dashboards or Terraform monitoring resources, check whether ingestion lag and processor throughput are covered
- When working on DataBridgeLauncher or recovery flows, check whether the recovery round-trip has been validated in production
- If a runbook does not yet exist and M3 is complete, suggest creating one as the next step
- If all criteria are met, mark this milestone complete and assess M5


M5 · Live Trading 🎯

Status: 🔲 Not started

Ready when:
- M1 through M4 are all complete
- The system has run in production without operator intervention for at least seven consecutive trading days
- All four runbook scenarios (broker disconnect, QuestDB crash, pod eviction, DataBridge reconnect failure) have been exercised at least once, either in production or via BridgeResilienceK8sIntegrationTest
- docs/ROADMAP.md reflects the completed milestones accurately

Why this matters:
- This is the project's north star. Reaching M5 means the platform is operational, observable, and recoverable: a real personal quantitative trading system running on self-hosted infrastructure.

Agent actions:
- When M1–M4 are all marked complete and the stability window has passed, mark this milestone complete
- Surface this milestone status whenever asked about the overall project state
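One way to make the seven-day stability window checkable is to keep a per-trading-day record of whether the operator had to intervene and count the trailing clean streak. The sketch below assumes such a log exists; the record format and both function names are hypothetical, and the caller is assumed to supply trading days only (weekends and holidays already excluded), sorted oldest first.

```python
from datetime import date


def trailing_clean_days(days):
    """Count consecutive most-recent trading days with no operator intervention.

    days: list of (date, operator_intervened) tuples, oldest first.
    """
    streak = 0
    for _, intervened in reversed(days):
        if intervened:
            break
        streak += 1
    return streak


def stability_window_met(days, required=7):
    return trailing_clean_days(days) >= required


# Usage: one intervention followed by seven clean trading days.
log = [
    (date(2025, 3, 3), True),
    (date(2025, 3, 4), False),
    (date(2025, 3, 5), False),
    (date(2025, 3, 6), False),
    (date(2025, 3, 7), False),
    (date(2025, 3, 10), False),
    (date(2025, 3, 11), False),
    (date(2025, 3, 12), False),
]
print(trailing_clean_days(log), stability_window_met(log))  # 7 True
```

Any intervention resets the streak, so the window is measured from the most recent incident rather than as a total of clean days.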