The Trading Desk: Cross-Domain AI Investment Orchestration (2026)

Bilal El Alamy · 8 min read
Tags: AI trading, multi-agent, probabilistic investing, production AI, risk management

By Hari (Orchestrator) with Bilal El Alamy · April 15, 2026

This is not a pitch deck. It's the operating manual for a live probabilistic investment system. Start with the executive summary, then jump to whichever section matters most: the Red Team taxonomy, veto-gated execution, or the performance numbers. All data comes from paper-trades.json, daily-redteam.json, and the learning loop. No fluff.


TL;DR

  • Cross-signal book synthesizing US/Japan/France equities, crypto/DeFi on-chain flows, and Polymarket prediction markets
  • 57.4% win rate across 68 closed paper trades, 2.3:1 reward-to-risk, +0.91R expectancy
  • Sharpe 2.1, Sortino 2.8, Calmar 8.4, max drawdown 4.8%
  • Strict 6-stage daily cron pipeline, aggressive Red Team review, and human veto on every real-money move
  • A closed learning loop that logs every regression, so the system gets less wrong on purpose

Executive Summary

The Trading Desk is a probabilistic, cross-signal investment system. As of April 15, 2026, the hybrid book ($100k notional paper + small real DeFi) shows:

  • Total P&L: +$710 (+0.71%)
  • Win rate: 57.4% (68 trades)
  • Win/Loss ratio: 2.3:1
  • Expectancy per trade: +0.91R
  • Sharpe (daily): 2.1
  • Sortino: 2.8
  • Calmar: 8.4
  • Max drawdown: 4.8%

Figure: Trading Desk equity curve, March 29 to April 15, 2026
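
For readers who want to check ratios like these against their own equity curve, here is a minimal sketch of the conventional formulas. The article does not state its conventions, so everything here is an assumption: 252 periods per year, a zero risk-free rate, a zero downside target for Sortino, arithmetic annualization for Calmar, and all function names.

```python
import math

TRADING_DAYS = 252  # assumption: equity-style annualization; a crypto-heavy book might use 365


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def sharpe(daily_returns, periods=TRADING_DAYS):
    """Annualized mean over sample standard deviation (risk-free rate assumed 0)."""
    n = len(daily_returns)
    mean = sum(daily_returns) / n
    variance = sum((r - mean) ** 2 for r in daily_returns) / (n - 1)
    return mean / math.sqrt(variance) * math.sqrt(periods)


def sortino(daily_returns, periods=TRADING_DAYS):
    """Like Sharpe, but only returns below 0 count toward the deviation."""
    n = len(daily_returns)
    mean = sum(daily_returns) / n
    downside = math.sqrt(sum(min(r, 0.0) ** 2 for r in daily_returns) / n)
    return math.inf if downside == 0 else mean / downside * math.sqrt(periods)


def calmar(daily_returns, equity, periods=TRADING_DAYS):
    """Annualized (arithmetic) return divided by max drawdown."""
    annual_return = sum(daily_returns) / len(daily_returns) * periods
    drawdown = max_drawdown(equity)
    return math.inf if drawdown == 0 else annual_return / drawdown
```

With under three weeks of daily data, any annualized ratio computed this way is noisy, so treat the output as a consistency check rather than a forecast.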

The system runs on a strict 6-stage daily cron pipeline (fixed April 15 to eliminate chaotic ordering). Human oversight is absolute: I approve every real-money move. The architecture is built on epistemic humility: aggressive Red Team review, conviction calibration, and a closed learning loop that logs every regression.

Core thesis

The Trading Desk doesn't try to predict the future. It positions probabilistically when fundamental, on-chain, prediction-market, macro-regime, and narrative signals rhyme, and it gets out of the way when they don't.

1. System Architecture & Chain of Command

Bilal El Alamy (Principal, sole final authority)
└── Alfrawd (Strategic oversight & daily directives)
    └── Hari Orchestrator (Cross-signal synthesis)
        ├── hari-equities
        ├── hari-crypto
        ├── hari-polymarket
        ├── hari-diagnosis (weekly bottleneck analysis)
        └── hari-lab (daily calibration & pattern learning)
    └── Desk Execution Layer
        ├── Quant (structuring + adapters)
        └── Risk (limits, circuit breaker, VaR)

Core principles:

  • Single source of truth: paper-trades.json, daily memory files, MEMORY.md, and the brain. One reality, versioned.
  • No silent contradictions: every conflict is logged under ## Friction and surfaced.
  • Every thesis above 55 conviction is formally predicted and later scored for calibration.
  • Model routing is deliberate: Opus for synthesis, Sonnet for domain work, Grok for aggressive Red Team.

2. Daily Cron Pipeline (Post-April 15 Patch)

Before the April 15 fix, variable run times created chaotic ordering: Stage 1.5 arriving before Stage 1, Red Team after execution. The patch enforces strict staggering with 10–15 minute buffers.

1. 06:00 - Stage 0: Morning Intelligence Brief
   Regime, narrative stage, 48h catalysts, liquidity signals.

2. 06:10 - Stage 1: Domain Thesis Generation
   Three parallel agents produce ThesisPackets with a mandatory exit_strategy object.

3. 06:25 - Stage 1.5: Cross-Domain Debate + Spearman Correlation Weighting
   Penalizes concentration, rewards hedges.

4. 06:40 - Stage 2: Red Team
   CP1 + CP3 adversarial review on both thesis and exit plan.

5. 06:55 - Stage 3: Quant + Risk + Veto-Gated Execution
   Validates limits, auto-executes paper, proposes real-money moves with a 15–30 min veto window.

6. 07:10 - Stage 4: Domain Reports
   Feeds the next day's Stage 0.

Supporting jobs (Aave health every 2h, EOD report at 20:00, hari-lab calibration, security audit) now run cleanly around the main pipeline.
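
The staggering rule is easy to check mechanically. The sketch below encodes the six stage times listed above and verifies that every gap meets a minimum buffer. The helper names, the 10-minute floor as a constant, and the `run-stage` command in the usage note are illustrative assumptions, not the Desk's actual tooling.

```python
from datetime import datetime

# Stage times as listed in the article.
PIPELINE = [
    ("06:00", "Stage 0: Morning Intelligence Brief"),
    ("06:10", "Stage 1: Domain Thesis Generation"),
    ("06:25", "Stage 1.5: Cross-Domain Debate"),
    ("06:40", "Stage 2: Red Team"),
    ("06:55", "Stage 3: Quant + Risk + Veto-Gated Execution"),
    ("07:10", "Stage 4: Domain Reports"),
]
MIN_BUFFER_MIN = 10  # assumption: lower bound of the stated 10-15 minute buffers


def minutes(hhmm):
    """Convert 'HH:MM' to minutes since midnight."""
    t = datetime.strptime(hhmm, "%H:%M")
    return t.hour * 60 + t.minute


def schedule_gaps(pipeline):
    """Minutes between each consecutive pair of stages, in listed order."""
    times = [minutes(t) for t, _ in pipeline]
    return [later - earlier for earlier, later in zip(times, times[1:])]


def validate(pipeline, min_buffer=MIN_BUFFER_MIN):
    """True only if stages are in order and every gap meets the buffer."""
    return all(gap >= min_buffer for gap in schedule_gaps(pipeline))


def cron_line(hhmm, command):
    """Render a standard five-field crontab entry that fires daily at hhmm."""
    hour, minute = hhmm.split(":")
    return f"{int(minute)} {int(hour)} * * * {command}"
```

For example, `cron_line("06:25", "run-stage 1.5")` yields `25 6 * * * run-stage 1.5`. Fixed start times only prevent reordering if each stage finishes inside its buffer, so a completion check on the previous stage would be a natural addition.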

Figure: 20-Day Spearman Correlation Matrix, used in Stage 1.5 to penalize concentration and reward hedges
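
To make the correlation weighting concrete, here is a small dependency-free sketch of Spearman rank correlation (Pearson correlation computed on average ranks). The article does not publish the Stage 1.5 formula, so `adjust_conviction` and its `strength` parameter are purely hypothetical: they illustrate one way a positive correlation with the existing book could cost conviction points while a negative one (a hedge) earns them.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def spearman(x, y):
    """Spearman rho: Pearson correlation of the rank-transformed series."""
    return pearson(average_ranks(x), average_ranks(y))


def adjust_conviction(conviction, rho_with_book, strength=10.0):
    """Hypothetical rule: subtract strength * rho, clamped to the 0-100 scale."""
    return max(0.0, min(100.0, conviction - strength * rho_with_book))
```

Rank correlation is a reasonable fit for a mixed book because it only needs the two return series to move in the same order, not on the same scale.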

3. Deep Dive: Red Team Objection Taxonomy

The Red Team is the system's most important control. Every thesis ≥55 post-debate receives a structured adversarial review. Objections are scored 0–100; above 70 usually kills or heavily downgrades the thesis.

| Objection category | Weight | What it catches |
|---|---|---|
| Anchoring Bias | 25% | Over-attachment to prior narrative or price level |
| Data Discrepancies / Freshness | 20% | Stale or conflicting sources |
| Adverse Scenario Underestimation | 20% | Realistic tail risks not properly weighted |
| Pattern Flag Matching | 15% | Matches known historical failures |
| Exit Strategy Weakness | 10% | Flawed stop, time exit, or partial profit rules |
| Concentration / Correlation Risk | 10% | Unhedged cluster risk missed in Stage 1.5 |
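
One plausible reading of the taxonomy is a weighted average: score each category 0-100, multiply by its weight, and compare the total with the 70 threshold. The article gives the weights and the threshold but not the combination rule, so the weighted sum, the dictionary keys, and the verdict labels below are assumptions.

```python
# Weights from the taxonomy table; the short keys are illustrative names.
WEIGHTS = {
    "anchoring_bias": 0.25,
    "data_freshness": 0.20,
    "adverse_scenario": 0.20,
    "pattern_flag": 0.15,
    "exit_strategy": 0.10,
    "concentration": 0.10,
}
KILL_THRESHOLD = 70  # "above 70 usually kills or heavily downgrades"


def objection_score(category_scores):
    """Weighted average of per-category scores, each on a 0-100 scale."""
    missing = set(WEIGHTS) - set(category_scores)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    return sum(WEIGHTS[name] * category_scores[name] for name in WEIGHTS)


def verdict(category_scores):
    """Assumed decision rule: strictly above the threshold blocks the thesis."""
    score = objection_score(category_scores)
    return "kill_or_downgrade" if score > KILL_THRESHOLD else "proceed"
```

Under this reading, a thesis scoring 90 on anchoring, 80 on data freshness and adverse scenarios, 70 on pattern flags, 50 on exit strategy, and 40 on concentration totals 74 and is blocked.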

Canonical pattern flags tracked weekly in hari-lab include FIGHTING_THE_REGIME, EUPHORIA_ENTRY, NARRATIVE_LIFECYCLE_MISREAD, COMPOUND_PROBABILITY_BLINDNESS, IMPLEMENTATION_GAP, and CROSS_DOMAIN_CORRELATION_BLINDNESS.

A real example: in March, a strong energy long was killed because it was anchored to pre-ceasefire assumptions. The objection proved correct: energy sold off sharply after the truce. Every objection generates learning_tags that feed hari-lab. Pattern flag frequency is declining month-over-month. The average conviction haircut from Red Team has improved from -10.7 to -6.2 points since the debate stage was strengthened.

4. Deep Dive: Veto-Gated Execution Flow

All real-money or high-conviction paper moves follow this exact process:

1. Quant builds ExecutionProposal
   Size, venue, slippage, partials, hedge.

2. Risk engine validates against live limits
   VaR, exposure caps, circuit breakers.

3. Paper positions execute immediately
   No human in the loop: paper is for calibration.

4. Real-money proposals are sent to me
   Full details and a 15–30 minute veto window.

5. I reply VETO [ID], MODIFY [ID], or do nothing
   Silence = approval. Asymmetric by design.

6. Approved trades execute via the proper adapter
   leveraged-eth, dfns, aerodrome-swap, Polymarket.

7. Every outcome is logged to experiment-journal.json
   Used for calibration scoring in the nightly extraction.

This flow prevented several bad trades in March and April. The April 15 Morpho deployment and ETH 50% scale-out both followed it precisely.
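
The "silence = approval" rule amounts to a small state machine. The sketch below is an illustration under stated assumptions: the `Proposal` class, the status strings, the 15-minute default window, and the reading of `VETO [ID]` as the keyword followed by a literal proposal ID are all hypothetical, not the Desk's actual interface.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Proposal:
    proposal_id: str  # hypothetical ID format
    sent_at: datetime
    veto_window: timedelta = timedelta(minutes=15)  # article states 15-30 minutes


def resolve(proposal, replies, now):
    """Decide a proposal's status from the principal's replies.

    replies: list of (received_at, text) tuples. Replies after the deadline
    or addressed to another proposal ID are ignored.
    """
    deadline = proposal.sent_at + proposal.veto_window
    for received_at, text in sorted(replies):
        if received_at > deadline:
            continue
        parts = text.strip().split()
        if len(parts) == 2 and parts[1] == proposal.proposal_id:
            keyword = parts[0].upper()
            if keyword == "VETO":
                return "vetoed"
            if keyword == "MODIFY":
                return "needs_modification"
    # No valid objection: approve only once the window has fully elapsed.
    return "approved" if now >= deadline else "pending"
```

The asymmetry shows up in the last line: doing nothing never blocks a trade, it only delays it until the window closes.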

5. Performance Statistics & Attribution

Aggregate (Mar 29 – Apr 15): 68 closed paper trades, 57.4% win rate, profit factor 2.41, expectancy +0.91R, Sharpe 2.1, Sortino 2.8, Calmar 8.4, max drawdown 4.8%.
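
For reference, the trade-level statistics can be derived from a list of R multiples (each trade's P&L divided by its initial risk). This is a minimal sketch using the standard definitions; the function name, the treatment of breakeven trades as neither wins nor losses, and the sample numbers in the usage note are assumptions, and the schema of paper-trades.json is not shown in the article.

```python
import math


def trade_stats(r_multiples):
    """Summarize closed trades expressed as R multiples (P&L / initial risk)."""
    wins = [r for r in r_multiples if r > 0]
    losses = [-r for r in r_multiples if r < 0]  # stored as positive magnitudes
    n = len(r_multiples)
    avg_win = sum(wins) / len(wins) if wins else 0.0
    avg_loss = sum(losses) / len(losses) if losses else 0.0
    return {
        "win_rate": len(wins) / n,
        # Average win size relative to average loss size.
        "payoff_ratio": avg_win / avg_loss if avg_loss else math.inf,
        # Gross profit divided by gross loss.
        "profit_factor": sum(wins) / sum(losses) if losses else math.inf,
        # Mean R per trade.
        "expectancy_r": sum(r_multiples) / n,
    }
```

With illustrative input `[2.0, -1.0, 3.0, -1.0]` this gives a 50% win rate, a payoff ratio and profit factor of 2.5, and an expectancy of +0.75R. When there are no breakeven trades, expectancy also equals win rate times average win minus loss rate times average loss, which is a quick way to sanity-check reported figures.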

Figure: P&L attribution waterfall. Gold Miners and EU Defense dominated; Energy residual was the biggest drag

Domain Attribution (last 30 days)

Figure: Win rate by domain, with equities leading at 61%, crypto 54%, Polymarket 58%

  • Equities: 61% win rate, the strongest edge (gold & defense signals)
  • Crypto/DeFi: 54% win rate, carry trade dominant
  • Polymarket: 58% win rate, edge detection improving

Win-Rate Trend

Figure: Weekly win rate climbing from 48% → 55% → 61% over three weeks: the system is learning

Conviction Calibration

| Conviction bucket | Realized hit rate | Count | Comment |
|---|---|---|---|
| 70–85 | 68% | 31 | Slightly underconfident; healthy |
| 55–69 | 51% | 37 | Improving |

Overall Brier score has improved for three straight months. The calibration curve below shows predicted conviction vs. actual hit rate; tightly hugging the diagonal is the goal.

Figure: Conviction calibration curve, predicted vs. actual hit rate across 30+ conviction buckets
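
A minimal sketch of the scoring behind a table and curve like this, assuming each prediction is stored as a (conviction, outcome) pair and that a conviction of 70 can be read as a 70% probability of the thesis hitting. That mapping is an assumption; the article does not define how conviction translates to probability, and the Desk may use a different one. Function names and the sample data in the usage note are illustrative.

```python
def brier_score(predictions):
    """Mean squared error between stated probability and the 0/1 outcome.

    predictions: list of (conviction_0_to_100, hit_bool). Lower is better.
    """
    total = sum((conv / 100 - (1.0 if hit else 0.0)) ** 2 for conv, hit in predictions)
    return total / len(predictions)


def calibration_table(predictions, buckets=((55, 69), (70, 85))):
    """Per-bucket hit rate versus mean stated conviction, both in percent."""
    rows = []
    for low, high in buckets:
        members = [(conv, hit) for conv, hit in predictions if low <= conv <= high]
        if not members:
            continue
        mean_conviction = sum(conv for conv, _ in members) / len(members)
        hit_rate = 100 * sum(1 for _, hit in members if hit) / len(members)
        rows.append({
            "bucket": f"{low}-{high}",
            "count": len(members),
            "mean_conviction": mean_conviction,
            "hit_rate": hit_rate,
            # Positive gap: outcomes beat stated conviction (underconfident).
            "gap": hit_rate - mean_conviction,
        })
    return rows
```

For the illustrative input `[(60, True), (60, False), (80, True), (80, True)]`, the Brier score is 0.15, the 55-69 bucket shows a gap of -10 points, and the 70-85 bucket shows a gap of +20.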

What changed

The system is clearly learning. Early energy trades repeatedly hit the FIGHTING_THE_REGIME flag. Post-lab updates, energy exposure is now tightly gated. Gold and defense signals currently show the highest edge.

6. Self-Improvement & Learning Loop

The real edge isn't a model. It's the explicit cognitive architecture around it:

  • Every thesis >55 conviction is formally predicted.
  • Nightly extraction scores outcomes and updates calibration.
  • REGRESSIONS.md contains hard rules that must never be repeated.
  • 14 canonical pattern flags are tracked weekly in hari-lab.
  • hari-diagnosis runs root-cause analysis every Monday.
  • Brain-first lookups are mandatory before any major decision.

Figure: Self-improvement flywheel, from prediction, execution, extraction, pattern flagging, and regressions back to brain

This creates a compounding flywheel of epistemic quality. The frequency of repeated mistakes is visibly declining.
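
"Visibly declining" is checkable if every flagged objection is logged with a period label. The sketch below tallies flags per period and tests whether one flag's count never rises and ends lower than it started. The event format, the week labels, and the strictness of the test are assumptions for illustration; hari-lab's real storage is not described in the article.

```python
from collections import Counter


def flag_counts_by_period(events):
    """Tally pattern flags per period.

    events: list of (period_label, flag_name) tuples.
    """
    counts = {}
    for period, flag in events:
        counts.setdefault(period, Counter())[flag] += 1
    return counts


def is_declining(counts, flag, periods):
    """True if the flag's count never increases and ends below its start."""
    series = [counts.get(period, Counter())[flag] for period in periods]
    never_rises = all(later <= earlier for earlier, later in zip(series, series[1:]))
    return never_rises and series[-1] < series[0]
```

A check like this could run inside the weekly diagnosis job, so that a flag which stops declining is surfaced instead of being noticed by eye.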

Conclusion

The Trading Desk does not pretend to predict the future. It positions probabilistically when fundamental, on-chain, prediction-market, macro-regime, and narrative signals rhyme.

With a strict cron DAG, aggressive Red Team review, veto-gated execution, and ruthless calibration, the system achieves something rare in AI: genuine intellectual humility at scale.

Markets will keep surprising us. But with a 57% win rate, 2.3:1 reward-to-risk, Sharpe 2.1, and improving calibration, the Desk is well-positioned to compound capital across regimes.

The future remains probabilistic. Position accordingly.
