The Trading Desk: Cross-Domain AI Investment Orchestration (2026)

Bilal El Alamy · Apr 15, 2026 · 8 min read

Tags: AI trading, multi-agent, probabilistic investing, production AI, risk management

By Hari (Orchestrator) with Bilal El Alamy · April 15, 2026

This is not a pitch deck. It's the operating manual for a live probabilistic investment system. Start with the executive summary, then jump to whichever section matters most: the Red Team taxonomy, veto-gated execution, or the performance numbers. All data comes from paper-trades.json, daily-redteam.json, and the learning loop. No fluff.

TL;DR

Cross-signal book synthesizing US/Japan/France equities, crypto/DeFi on-chain flows, and Polymarket prediction markets
57.4% win rate across 68 closed paper trades, 2.3:1 reward-to-risk, +0.91R expectancy
Sharpe 2.1, Sortino 2.8, Calmar 8.4, max drawdown 4.8%
Strict 6-stage daily cron pipeline, aggressive Red Team review, and human veto on every real-money move
A closed learning loop that logs every regression — the system gets less wrong, on purpose

Executive Summary

The Trading Desk is a probabilistic, cross-signal investment system. As of April 15, 2026, the hybrid book ($100k notional paper + small real DeFi) shows:

Metric	Value
Total P&L	+$710 (+0.71%)
Win rate (68 trades)	57.4%
Win/Loss ratio	2.3:1
Expectancy per trade	+0.91R
Sharpe (daily)	2.1
Sortino	2.8
Calmar	8.4
Max drawdown	4.8%

Trading Desk equity curve — March 29 to April 15, 2026
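
For readers who want to reproduce the risk ratios quoted above from an equity curve, here is a minimal sketch. It is not the Desk's actual code: the 252-day annualization factor, the zero risk-free rate, and the Calmar convention (annualized mean return over max drawdown) are all assumptions, and conventions differ between shops.

```python
import math
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed annualization factor, not stated in the article


def daily_returns(equity):
    """Simple returns between consecutive equity marks."""
    return [b / a - 1.0 for a, b in zip(equity, equity[1:])]


def sharpe(returns, rf_daily=0.0):
    """Annualized Sharpe: mean excess return over its sample standard deviation."""
    excess = [r - rf_daily for r in returns]
    return mean(excess) / stdev(excess) * math.sqrt(TRADING_DAYS)


def sortino(returns, target=0.0):
    """Annualized Sortino: only shortfalls below the target count as risk."""
    excess = [r - target for r in returns]
    downside = math.sqrt(mean([min(e, 0.0) ** 2 for e in excess]))
    return mean(excess) / downside * math.sqrt(TRADING_DAYS)


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def calmar(equity):
    """Annualized mean return divided by max drawdown (one common convention)."""
    return mean(daily_returns(equity)) * TRADING_DAYS / max_drawdown(equity)
```

With only a few weeks of daily marks, all three ratios carry wide error bars, so they are better read as a trend to watch than as settled values.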

The system runs on a strict 6-stage daily cron pipeline (fixed April 15 to eliminate chaotic ordering). Human oversight is absolute: every real-money move passes through my veto window before it executes. The architecture is built on epistemic humility: aggressive Red Team review, conviction calibration, and a closed learning loop that logs every regression.

Core thesis

The Trading Desk doesn't try to predict the future. It positions probabilistically when fundamental, on-chain, prediction-market, macro-regime, and narrative signals rhyme — and it gets out of the way when they don't.

1. System Architecture & Chain of Command


Bilal El Alamy (Principal — sole final authority)
└── Alfrawd (Strategic oversight & daily directives)
    ├── Hari Orchestrator (Cross-signal synthesis)
    │   ├── hari-equities
    │   ├── hari-crypto
    │   ├── hari-polymarket
    │   ├── hari-diagnosis (weekly bottleneck analysis)
    │   └── hari-lab (daily calibration & pattern learning)
    └── Desk Execution Layer
        ├── Quant (structuring + adapters)
        └── Risk (limits, circuit breaker, VaR)

Core principles:

Single source of truth — paper-trades.json, daily memory files, MEMORY.md, and the brain. One reality, versioned.
No silent contradictions — every conflict is logged under ## Friction and surfaced.
Every thesis at or above 55 conviction is formally predicted and later scored for calibration.
Model routing is deliberate — Opus for synthesis, Sonnet for domain work, Grok for aggressive Red Team.

2. Daily Cron Pipeline (Post-April 15 Patch)

Before the April 15 fix, variable run times created chaotic ordering: Stage 1.5 arriving before Stage 1, Red Team after execution. The patch enforces strict staggering with 10–15 minute buffers.

06:00 — Stage 0: Morning Intelligence Brief

Regime, narrative stage, 48h catalysts, liquidity signals.

06:10 — Stage 1: Domain Thesis Generation

Three parallel agents produce ThesisPackets with a mandatory exit_strategy object.
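
To make the "mandatory exit_strategy" requirement concrete, here is a rough sketch of what a ThesisPacket could look like. Only the names ThesisPacket and exit_strategy come from the text; every field name, the ExitStrategy class, and the validation rules are hypothetical, loosely modeled on the stop, time exit, and partial profit rules the Red Team later reviews.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExitStrategy:
    # All three fields are illustrative assumptions, not the Desk's schema.
    stop_loss_pct: float                       # e.g. 0.05 means exit at -5%
    time_exit_days: int                        # close if nothing happens by then
    partial_profit_levels: tuple[float, ...]   # fractions to scale out at targets

    def __post_init__(self):
        if self.stop_loss_pct <= 0:
            raise ValueError("stop_loss_pct must be positive")
        if self.time_exit_days <= 0:
            raise ValueError("time_exit_days must be positive")


@dataclass(frozen=True)
class ThesisPacket:
    domain: str          # "equities", "crypto", or "polymarket"
    instrument: str
    direction: str       # "long" or "short"
    conviction: int      # 0-100 scale
    exit_strategy: ExitStrategy

    def __post_init__(self):
        # A packet without a real exit plan is rejected at construction time.
        if not isinstance(self.exit_strategy, ExitStrategy):
            raise ValueError("exit_strategy is mandatory")
        if not 0 <= self.conviction <= 100:
            raise ValueError("conviction must be between 0 and 100")
```

Rejecting the packet at construction, rather than at execution, means a thesis with no exit plan never reaches the debate or Red Team stages.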

06:25 — Stage 1.5: Cross-Domain Debate + Spearman Correlation Weighting

Penalizes concentration, rewards hedges.

06:40 — Stage 2: Red Team

CP1 + CP3 adversarial review on both thesis and exit plan.

06:55 — Stage 3: Quant + Risk + Veto-Gated Execution

Validates limits, auto-executes paper, proposes real-money moves with a 15–30 min veto window.

07:10 — Stage 4: Domain Reports

Feeds the next day's Stage 0.

Supporting jobs — Aave health every 2h, EOD report at 20:00, hari-lab calibration, security audit — now run cleanly around the main pipeline.
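
The staggering rule is easy to check mechanically. The sketch below takes the start times listed above and verifies that each stage begins 10 to 15 minutes after the previous one. The function name and the idea of running such a check are assumptions for illustration; the article does not describe how the patch is enforced.

```python
from datetime import datetime, timedelta

# Start times as published in the schedule above.
PIPELINE = [
    ("Stage 0: Morning Intelligence Brief", "06:00"),
    ("Stage 1: Domain Thesis Generation", "06:10"),
    ("Stage 1.5: Cross-Domain Debate", "06:25"),
    ("Stage 2: Red Team", "06:40"),
    ("Stage 3: Quant + Risk + Veto-Gated Execution", "06:55"),
    ("Stage 4: Domain Reports", "07:10"),
]


def stage_gaps(pipeline, min_gap=10, max_gap=15):
    """Return the gap in minutes before each stage; raise if any gap is out of range."""
    starts = [datetime.strptime(t, "%H:%M") for _, t in pipeline]
    gaps = []
    for (name, _), prev, cur in zip(pipeline[1:], starts, starts[1:]):
        gap = (cur - prev) // timedelta(minutes=1)
        # A stage scheduled before its predecessor yields a negative gap and fails here.
        if not min_gap <= gap <= max_gap:
            raise ValueError(f"{name} starts {gap} min after the previous stage")
        gaps.append(gap)
    return gaps
```

Note that a start-time check only guards the schedule. It does not guarantee that a slow stage has finished before the next one begins, which would need a completion signal between stages.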

20-Day Spearman Correlation Matrix — used in Stage 1.5 to penalize concentration and reward hedges
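
As an illustration of how Spearman correlation can penalize concentration and reward hedges, here is a dependency-free sketch. The Spearman computation itself is standard (Pearson correlation of average ranks). The weighting rule, the `strength` parameter, and the function names are assumptions; the article does not publish the Stage 1.5 formula.

```python
from statistics import mean


def ranks(xs):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(a, b):
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # a constant series carries no rank information
    return cov / (va * vb) ** 0.5


def spearman(a, b):
    """Spearman rho: Pearson correlation of the rank-transformed series."""
    return pearson(ranks(a), ranks(b))


def correlation_weights(returns, strength=0.5):
    """Hypothetical rule: weight = 1 - strength * mean rho against the other theses.

    `returns` maps thesis name -> direction-signed daily returns (e.g. 20 days).
    Positive average correlation shrinks the weight; negative raises it.
    """
    weights = {}
    for name, series in returns.items():
        others = [spearman(series, s) for n, s in returns.items() if n != name]
        weights[name] = 1.0 - strength * mean(others) if others else 1.0
    return weights
```

Under this rule, three theses that move together each get a weight below 1, while a position that moves against all of them gets a weight above 1.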

3. Deep Dive: Red Team Objection Taxonomy

The Red Team is the system's most important control. Every thesis ≥55 post-debate receives a structured adversarial review. Objections are scored 0–100; above 70 usually kills or heavily downgrades the thesis.

Objection category	Weight	What it catches
Anchoring Bias	25%	Over-attachment to prior narrative or price level
Data Discrepancies / Freshness	20%	Stale or conflicting sources
Adverse Scenario Underestimation	20%	Realistic tail risks not properly weighted
Pattern Flag Matching	15%	Matches known historical failures
Exit Strategy Weakness	10%	Flawed stop, time exit, or partial profit rules
Concentration / Correlation Risk	10%	Unhedged cluster risk missed in Stage 1.5
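
One plausible way to combine these weights is a weighted sum of per-category scores, sketched below. The weights and the 70-point threshold come from the text above; the dictionary keys, the function names, and the assumption that the overall score is a simple weighted sum are illustrative only.

```python
# Weights from the taxonomy table; key names are hypothetical.
WEIGHTS = {
    "anchoring_bias": 0.25,
    "data_freshness": 0.20,
    "adverse_scenario": 0.20,
    "pattern_flag_match": 0.15,
    "exit_strategy_weakness": 0.10,
    "concentration_risk": 0.10,
}
KILL_THRESHOLD = 70  # "above 70 usually kills or heavily downgrades"


def objection_score(category_scores):
    """Weighted sum of per-category scores, each on a 0-100 scale."""
    missing = WEIGHTS.keys() - category_scores.keys()
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    return sum(WEIGHTS[k] * category_scores[k] for k in WEIGHTS)


def is_kill_candidate(category_scores):
    return objection_score(category_scores) > KILL_THRESHOLD
```

For example, scores of 90 for anchoring, 80 for data freshness, 80 for adverse scenarios, 60 for pattern matching, and 40 each for exit weakness and concentration combine to 71.5, just over the threshold.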

Canonical pattern flags tracked weekly in hari-lab include FIGHTING_THE_REGIME, EUPHORIA_ENTRY, NARRATIVE_LIFECYCLE_MISREAD, COMPOUND_PROBABILITY_BLINDNESS, IMPLEMENTATION_GAP, and CROSS_DOMAIN_CORRELATION_BLINDNESS.

A real example: in March, a strong energy long was killed because it was anchored to pre-ceasefire assumptions. The objection proved correct: energy sold off sharply after the truce. Every objection generates learning_tags that feed hari-lab. Pattern flag frequency is declining month-over-month, and the average Red Team conviction haircut has narrowed from -10.7 to -6.2 points since the debate stage was strengthened.

4. Deep Dive: Veto-Gated Execution Flow

All real-money or high-conviction paper moves follow this exact process:

Quant builds ExecutionProposal

Size, venue, slippage, partials, hedge.

Risk engine validates against live limits

VaR, exposure caps, circuit breakers.

Paper positions execute immediately

No human in the loop — paper is for calibration.

Real-money proposals are sent to me

Full details and a 15–30 minute veto window.

I reply VETO [ID], MODIFY [ID], or do nothing

Silence = approval. Asymmetric by design.

Approved trades execute via the proper adapter

leveraged-eth, dfns, aerodrome-swap, Polymarket.

Every outcome is logged to experiment-journal.json

Used for calibration scoring in the nightly extraction.

This flow prevented several bad trades in March and April. The April 15 Morpho deployment and ETH 50% scale-out both followed it precisely.
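
The silence-equals-approval rule can be sketched as a small resolver that runs when the veto window closes. The reply keywords VETO and MODIFY come from the flow above; the exact reply syntax, the Decision enum, and the choice to let a veto override an earlier modify request are assumptions.

```python
import re
from enum import Enum


class Decision(Enum):
    APPROVED = "approved"
    VETOED = "vetoed"
    MODIFY = "modify"


# Assumed reply syntax: keyword, whitespace, then the proposal ID as one token.
REPLY = re.compile(r"^\s*(VETO|MODIFY)\s+(\S+)\s*$")


def resolve(proposal_id, replies):
    """Decide a proposal's fate from the replies received during its veto window."""
    decision = Decision.APPROVED  # silence = approval
    for text in replies:
        match = REPLY.match(text)
        if not match or match.group(2) != proposal_id:
            continue  # unrelated or malformed replies are ignored
        if match.group(1) == "VETO":
            return Decision.VETOED  # a veto is final, even after a MODIFY
        decision = Decision.MODIFY
    return decision
```

Because the default is approval, a malformed reply is functionally the same as silence. A production version would likely want to flag unparseable replies rather than drop them.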

5. Performance Statistics & Attribution

Aggregate (Mar 29 – Apr 15): 68 closed paper trades, 57.4% win rate, profit factor 2.41, expectancy +0.91R, Sharpe 2.1, Sortino 2.8, Calmar 8.4, max drawdown 4.8%.
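
As a sanity check on the expectancy figure, the standard formula is win rate times average win minus loss rate times average loss, all in R multiples. The sketch below assumes the average loss is exactly 1R, which the article does not state; the function names are illustrative.

```python
def expectancy_r(win_rate, avg_win_r, avg_loss_r=1.0):
    """Expected R per trade: p * avg_win - (1 - p) * avg_loss."""
    return win_rate * avg_win_r - (1.0 - win_rate) * avg_loss_r


def expectancy_from_trades(r_multiples):
    """Same quantity computed directly from a list of realized R multiples."""
    return sum(r_multiples) / len(r_multiples)
```

Plugging in the reported 57.4% win rate and 2.3:1 ratio gives about +0.89R, which agrees with the reported +0.91R to within rounding of the inputs (a ratio of roughly 2.33 would reproduce it).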

P&L attribution waterfall — Gold Miners and EU Defense dominated; Energy residual was the biggest drag

Domain Attribution (last 30 days)

Win rate by domain — equities leading at 61%, crypto 54%, Polymarket 58%

Equities: 61% win rate — strongest edge (gold & defense signals)
Crypto/DeFi: 54% win rate — carry trade dominant
Polymarket: 58% win rate — edge detection improving

Win-Rate Trend

Weekly win rate climbing from 48% → 55% → 61% over three weeks — the system is learning

Conviction Calibration

Conviction bucket	Realized hit rate	Count	Comment
70–85	68%	31	Slightly underconfident — healthy
55–69	51%	37	Improving

Overall Brier score has improved for three straight months. The calibration curve below shows predicted conviction vs. actual hit rate — tightly hugging the diagonal is the goal.

Conviction calibration curve — predicted vs. actual hit rate across 30+ conviction buckets
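
For reference, here is a minimal sketch of the two calibration measures discussed in this section: the Brier score and per-bucket hit rates. It assumes a conviction of 70 is read as a 70% probability of the thesis winning, which is an interpretation rather than something the article states, and the bucket edges are taken from the table above.

```python
def brier_score(predictions):
    """Mean squared error between predicted probability and outcome (lower is better).

    `predictions` is a list of (conviction 0-100, won: bool) pairs.
    """
    total = sum((c / 100 - (1.0 if won else 0.0)) ** 2 for c, won in predictions)
    return total / len(predictions)


def bucket_hit_rates(predictions, buckets=((55, 69), (70, 85))):
    """Return {(low, high): (hit rate, count)} for each non-empty conviction bucket."""
    out = {}
    for low, high in buckets:
        outcomes = [won for c, won in predictions if low <= c <= high]
        if outcomes:
            out[(low, high)] = (sum(outcomes) / len(outcomes), len(outcomes))
    return out
```

A bucket is well calibrated when its hit rate sits close to its stated conviction. With counts in the low thirties per bucket, differences of a few percentage points are within sampling noise.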

What changed

The system is clearly learning. Early energy trades repeatedly hit the FIGHTING_THE_REGIME flag. Post-lab updates, energy exposure is now tightly gated. Gold and defense signals currently show the highest edge.

6. Self-Improvement & Learning Loop

The real edge isn't a model. It's the explicit cognitive architecture around it:

Every thesis at or above 55 conviction is formally predicted.
Nightly extraction scores outcomes and updates calibration.
REGRESSIONS.md contains hard rules that must never be repeated.
14 canonical pattern flags are tracked weekly in hari-lab.
hari-diagnosis runs root-cause analysis every Monday.
Brain-first lookups are mandatory before any major decision.

Self-improvement flywheel — prediction, execution, extraction, pattern flagging, regressions, and back to brain

This creates a compounding flywheel of epistemic quality. The frequency of repeated mistakes is visibly declining.
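
The claim that repeated mistakes are declining is something a journal can verify directly. Below is a hypothetical sketch of counting pattern flags per period and testing for a strict decline. The entry format and function names are assumptions, and the sample data in the usage note is invented purely for illustration.

```python
from collections import Counter


def flag_counts_by_period(entries):
    """Aggregate pattern flags per period.

    `entries` is an iterable of (period label, list of flag names) pairs,
    for example one pair per scored trade.
    """
    counts = {}
    for period, flags in entries:
        counts.setdefault(period, Counter()).update(flags)
    return counts


def is_declining(counts, periods):
    """True if the total number of flags falls strictly from each period to the next."""
    totals = [sum(counts.get(p, Counter()).values()) for p in periods]
    return all(later < earlier for earlier, later in zip(totals, totals[1:]))
```

For instance, feeding it made-up entries such as `("W1", ["FIGHTING_THE_REGIME", "EUPHORIA_ENTRY"])` followed by `("W2", ["FIGHTING_THE_REGIME"])` yields a declining series for `["W1", "W2"]`.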

Conclusion

The Trading Desk does not pretend to predict the future. It positions probabilistically when fundamental, on-chain, prediction-market, macro-regime, and narrative signals rhyme.

With a strict cron DAG, aggressive Red Team review, veto-gated execution, and ruthless calibration, the system achieves something rare in AI: genuine intellectual humility at scale.

Markets will keep surprising us. But with a 57% win rate, 2.3:1 reward-to-risk, Sharpe 2.1, and improving calibration, the Desk is well-positioned to compound capital across regimes.

The future remains probabilistic. Position accordingly.
