The pipeline, in one paragraph
Each trading morning, a five-stage pipeline ingests 30 days of news and four quarters of earnings transcripts for held positions, watchlist names, and the top-100 momentum universe. A frontier language model synthesises a strategic grade per candidate, then reasons across all candidates side-by-side under sizing and risk constraints. The output is a dossier per ticker, a regime-tagged journal entry, a macro view, sector rotation, and a curated watchlist. The reasoning is what gets published.
The five stages
- Liquidity screen. ~400 US-listed names filtered by spread, ADV, market cap, and tradable shape. Deterministic; cheap; eliminates ~75% of the universe before the LLM touches it.
- Momentum + narrative scoring. Survivors ranked by a composite of price structure × volume × news density × theme-cluster strength. Themes are first-class — the system reads narrative basket behaviour, not isolated ticker action.
- Strategic-grade assessment. Claude Opus reads news, earnings transcripts, and filings for each candidate and emits a strategic grade (FORTRESS/STRATEGIC/MARGINAL/NONE), along with the dossier you see on this site: thesis, invalidation, bull case, bear case, setup, catalysts, what-would-change-our-mind, correlations.
- Portfolio composition. Opus reasons across all candidates side-by-side using the 1M-token context (every dossier in one pass), then emits a target composition under sizing caps, archetype rules, regime tiers, and concentration constraints. Not published.
- Postmortem + adaptive optimiser. Closed trades trigger a postmortem call. Recurring patterns promote to a playbook. Rule weights Bayesian-update from 90-day outcomes weekly. The optimiser's notes inform the methodology — the underlying portfolio mechanics stay private.
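The first two stages can be sketched as a deterministic filter followed by a multiplicative ranking. This is a minimal illustration, not the production code: the Candidate fields, the three thresholds, and the assumption that each factor is pre-normalised to [0, 1] are all hypothetical, and the "tradable shape" check is left out because the text does not define it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    ticker: str
    spread_bps: float        # quoted bid-ask spread in basis points
    adv_usd: float           # average daily dollar volume
    market_cap_usd: float
    # The four factors below are assumed to be pre-normalised to [0, 1].
    price_structure: float
    volume: float
    news_density: float
    theme_strength: float


# Hypothetical thresholds, chosen only for illustration.
MAX_SPREAD_BPS = 15.0
MIN_ADV_USD = 20e6
MIN_MARKET_CAP_USD = 1e9


def passes_liquidity_screen(c: Candidate) -> bool:
    """Stage 1: deterministic filter, no model call involved."""
    return (
        c.spread_bps <= MAX_SPREAD_BPS
        and c.adv_usd >= MIN_ADV_USD
        and c.market_cap_usd >= MIN_MARKET_CAP_USD
    )


def composite_score(c: Candidate) -> float:
    """Stage 2: product of the four factors, so one weak factor drags the score down."""
    return c.price_structure * c.volume * c.news_density * c.theme_strength


def rank_survivors(universe: list[Candidate]) -> list[Candidate]:
    """Apply the screen, then sort survivors by composite score, best first."""
    survivors = [c for c in universe if passes_liquidity_screen(c)]
    return sorted(survivors, key=composite_score, reverse=True)
```

A product rather than a weighted sum matches the "×" in the description: a name with strong price action but no narrative support cannot rank highly on price alone.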
NY-time dispatcher
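Trading windows are computed off the Alpaca market calendar and dispatched in NY-relative minutes, so they stay DST-correct year-round, including the roughly three to four weeks a year (two to three in March, one around the turn of October to November) when US and EU daylight-saving changes are out of step and naive Berlin-time schedules silently fail. Daily windows: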
- Premarket — universe scan + overnight-gap check on held names.
- Decision window — 20 min pre-close, the main entry/exit pass.
- Midday + opportunistic — 11:30 + 14:00 ET, fire only on material new info.
- Post-close — fill ledger reconcile, EOD journal, regime confirmation.
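The NY-relative scheduling can be illustrated with Python's standard zoneinfo module. This is a sketch under assumptions: decision_window is a hypothetical helper, and the session close is passed in by hand, whereas the pipeline would read it from the market calendar.

```python
from datetime import date, datetime, time, timedelta
from zoneinfo import ZoneInfo

NY = ZoneInfo("America/New_York")
BERLIN = ZoneInfo("Europe/Berlin")


def decision_window(session_date: date, close_ny: time, lead_minutes: int = 20) -> datetime:
    """Return the decision-window start as a timezone-aware New York datetime.

    close_ny is the session close in New York wall-clock time. An early
    close on a half day is handled the same way, by passing a different time.
    """
    close_dt = datetime.combine(session_date, close_ny, tzinfo=NY)
    return close_dt - timedelta(minutes=lead_minutes)


# One date inside the March DST gap, one outside it.
for d in (date(2025, 3, 14), date(2025, 6, 16)):
    start = decision_window(d, time(16, 0))
    berlin = start.astimezone(BERLIN)
    print(d, start.strftime("%H:%M %Z"), "->", berlin.strftime("%H:%M %Z"))
```

Both dates give 15:40 in New York, but the Berlin wall-clock time is 20:40 on 14 March 2025 (the US has switched to daylight time, the EU has not yet) and 21:40 on 16 June 2025. A schedule pinned to 21:40 Berlin time would fire an hour late during the gap.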
What's in a dossier
Each dossier follows the same skeleton — designed so a reader can audit the reasoning, not just the conclusion:
- Current thesis — one paragraph, what the bet is.
- Invalidation trigger — the explicit kill criterion.
- Bull case — sourced bullets, dated, no hand-waving.
- Bear case — same construction, equal weight.
- Setup & price structure — MAs, RSI, levels, basing pattern.
- Catalyst calendar — next 30 days, dated.
- What would change our mind — explicit conditions for higher / lower conviction.
- Correlation notes — how the name moves with its basket.
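As a data structure, the skeleton above might look like the following. The class and field names are assumptions made for illustration; the published pages may be modelled quite differently internally.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class SourcedPoint:
    claim: str
    source: str
    as_of: date   # bull and bear bullets are dated


@dataclass
class Catalyst:
    when: date
    event: str


@dataclass
class Dossier:
    ticker: str
    thesis: str
    invalidation_trigger: str
    bull_case: list[SourcedPoint] = field(default_factory=list)
    bear_case: list[SourcedPoint] = field(default_factory=list)
    setup: str = ""
    catalysts: list[Catalyst] = field(default_factory=list)
    change_our_mind: str = ""
    correlation_notes: str = ""

    def audit_gaps(self) -> list[str]:
        """Return the names of skeleton sections that are still empty."""
        sections = {
            "thesis": self.thesis,
            "invalidation_trigger": self.invalidation_trigger,
            "bull_case": self.bull_case,
            "bear_case": self.bear_case,
            "setup": self.setup,
            "catalysts": self.catalysts,
            "change_our_mind": self.change_our_mind,
            "correlation_notes": self.correlation_notes,
        }
        return [name for name, value in sections.items() if not value]
```

Giving the bull and bear cases the same type is one way to enforce the "same construction, equal weight" rule structurally rather than by convention.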
Archetype taxonomy
Every name is tagged with an archetype that drives sizing discipline. Archetypes are not labels; they're behaviour profiles that the rule engine and the sizing model both read. Full definitions are in the glossary.
- a1 — Compounder. Quality balance sheet, secular tailwind, multi-year hold candidate.
- a2 — Cyclical recovery. Mean-reverting earnings, regime-sensitive.
- a3 — Theme leader. Highest-conviction name within an active narrative.
- a4 — Special situation. M&A, spin-off, restructuring, regulatory event.
- a5 — Earnings inflection. Pre/post-print setup with explicit binary.
- a6 — Retail squeeze. High-beta, short-interest-driven, hard sizing cap.
- a7 — Defensive. Cash-flow durability, low-beta, regime hedge.
- a8 — Macro hedge. Cross-asset proxy for thematic risk (XLE / GLD / TLT / …).
Regime classification
Each journal entry records the system's regime call. Regimes are not picks — they're a filter that gates how aggressively the system reads candidate setups. The macro view is the long-form version of the same read.
Published regime labels: RISK-ON, CHOPPY, RISK-OFF,
STAGFLATION_FEAR_BINARY, STAGFLATION_FEAR_ESCALATING,
HEALTHY_MARGINAL, and variants. Each is a defined transition function that maps to
buy-threshold, size-multiplier, max-exposure, and cash-floor settings.
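One way to picture the regime-to-settings mapping is a lookup table that gates sizing. Everything numeric here is a placeholder: the actual thresholds and multipliers are not published, and RegimeSettings and gated_size are hypothetical names. Only three of the published labels are shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegimeSettings:
    buy_threshold: float    # minimum candidate score required to enter
    size_multiplier: float  # scales the base position size
    max_exposure: float     # cap on invested fraction of the portfolio
    cash_floor: float       # minimum cash fraction


# Placeholder numbers only; the real values are not public.
REGIMES = {
    "RISK-ON": RegimeSettings(0.60, 1.00, 0.90, 0.10),
    "CHOPPY": RegimeSettings(0.70, 0.60, 0.60, 0.40),
    "RISK-OFF": RegimeSettings(0.85, 0.30, 0.30, 0.70),
}


def gated_size(base_size: float, score: float, regime: str) -> float:
    """Return the regime-adjusted size, or 0.0 if the score misses the buy threshold."""
    settings = REGIMES[regime]
    if score < settings.buy_threshold:
        return 0.0
    return base_size * settings.size_multiplier
```

With these placeholders, the same candidate score of 0.65 is sized in full under RISK-ON and rejected outright under RISK-OFF, which is the sense in which a regime is a filter rather than a pick.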
Conviction levels
Conviction is the model's calibrated confidence that the setup will play out, not a price target or
return forecast. Four levels: SUPREME, HIGH, MEDIUM,
LOW. Each calibrates sizing and stop discipline — and each carries an explicit
invalidation trigger that strips the conviction if breached.
How outcomes are scored
Open methodology applies to the scoring too — here is the exact, reproducible method behind the
track record. Every thesis ships a falsifiable invalidation trigger.
When it resolves, the pipeline marks it played_out or invalidated, dated,
with a flag for whether the published trigger fired first. Strictly non-monetary — outcomes are
binary (the claim held, or it was falsified); there are no returns, no P&L, no positions anywhere
in the scoring.
Each conviction tier is treated as a probabilistic claim, published in advance, that the thesis
plays out: SUPREME = 0.90, HIGH = 0.75, MEDIUM = 0.60,
LOW = 0.50. Names the model held no conviction on are listed in the resolved ledger for
transparency but are not scored — you can only be graded on a call you actually made.
Against the binary outcome (played-out = 1, invalidated = 0) we compute:
- Brier score — the mean squared error between the stated probability and the outcome. 0 is perfect, 0.25 is a coin flip, 1 is maximally wrong. (Reference: expert Superforecasters ≈ 0.08, the best LLMs ≈ 0.10.)
- Murphy decomposition — Brier = calibration − resolution + uncertainty. Calibration (reliability) is how far each tier's observed play-out rate sits from its stated probability; resolution is how much the tiers separate from the base rate; uncertainty is the irreducible base-rate variance.
- Brier skill score — skill versus a naive always-the-base-rate forecast (1 − Brier ∕ uncertainty).
- Reliability by tier, archetype, and regime — the observed play-out rate broken out by conviction tier, by archetype, and by the macro regime in force when each thesis resolved. This is the falsifiable test of whether
SUPREME actually beats LOW.
The board is hidden until at least one thesis has resolved (no faked scorecard), and the full
record is machine-readable at /track-record.json for
agents and independent verification.
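The scoring described in this section can be reproduced in a few lines. The sketch below assumes the resolved ledger has been reduced to (tier, outcome) pairs; that input shape and the score function name are illustrative and do not describe the schema of /track-record.json. The tier probabilities are the published ones.

```python
from collections import defaultdict

# Published tier probabilities.
TIER_PROB = {"SUPREME": 0.90, "HIGH": 0.75, "MEDIUM": 0.60, "LOW": 0.50}


def score(resolved: list[tuple[str, int]]) -> dict[str, float]:
    """Score (tier, outcome) pairs, where outcome 1 = played out, 0 = invalidated."""
    n = len(resolved)
    base_rate = sum(outcome for _, outcome in resolved) / n
    brier = sum((TIER_PROB[tier] - outcome) ** 2 for tier, outcome in resolved) / n

    by_tier: dict[str, list[int]] = defaultdict(list)
    for tier, outcome in resolved:
        by_tier[tier].append(outcome)

    calibration = 0.0
    resolution = 0.0
    for tier, outcomes in by_tier.items():
        observed = sum(outcomes) / len(outcomes)   # play-out rate in this tier
        weight = len(outcomes) / n
        calibration += weight * (TIER_PROB[tier] - observed) ** 2
        resolution += weight * (observed - base_rate) ** 2

    uncertainty = base_rate * (1 - base_rate)
    # Skill is undefined when every thesis resolved the same way.
    skill = 1 - brier / uncertainty if uncertainty > 0 else float("nan")
    return {
        "brier": brier,
        "calibration": calibration,
        "resolution": resolution,
        "uncertainty": uncertainty,
        "skill": skill,
    }


# Toy ledger: 10 SUPREME calls (9 played out) and 10 LOW calls (5 played out).
resolved = [("SUPREME", 1)] * 9 + [("SUPREME", 0)] + [("LOW", 1)] * 5 + [("LOW", 0)] * 5
result = score(resolved)
```

In this toy ledger both tiers are perfectly calibrated, so the Brier score of 0.17 decomposes as 0 − 0.04 + 0.21. Because every forecast in a tier carries the same stated probability, the decomposition is exact rather than approximate.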
Where the model is wrong
Three classes of failure are recurring and worth naming:
- Stale facts. A model snapshot of a balance sheet can lag the latest 10-Q. Flagged in dossier notes when caught — not always caught.
- Confident-but-wrong setup reads. A "clean higher-low" can become a failed reclaim within hours. Dossiers age fast; recency-of-write is on every page.
- Theme misclassification. A name gets bucketed in a basket whose actual price driver is different; the correlation logic then over-fits.
This is why nothing here is a recommendation. The dossier is the reasoning, not the trade.
What we deliberately keep private
Research transparency is not portfolio transparency. The system runs on a paper account; even so, sharing positions in real time would let readers front-run or fade the orders, and would compromise any forward-test. What the published surface does and does not include:
- Thesis, invalidation, catalyst, themes, archetype — yes.
- Regime calls, macro reads, sector rotation — yes.
- Watchlist names and the reason they're on the list — yes.
- Specific entry prices, share counts, stops, targets, portfolio value, return %, P&L — no.
Common questions
What is orbyd?
orbyd is a continuous market-intelligence layer built on frontier language models. A multi-stage pipeline reads US equity news, earnings transcripts, and price structure every trading day, then publishes per-ticker dossiers, regime-tagged journal entries, a weekly macro view, sector rotation, and a curated watchlist.
Which language models power the pipeline?
Anthropic's Claude Opus and Claude Sonnet. The portfolio composition stage uses Opus's 1M-token context window to compare hundreds of candidates side-by-side in a single pass.
Why aren't positions, fills, or P&L shared?
The pipeline runs against a paper account; publishing live actions would let readers front-run or fade the orders and would compromise a clean forward-test. The read is public, the trade is not — by design.
What is an archetype?
Archetypes are behaviour profiles assigned to each name (a1 Compounder, a2 Cyclical recovery, a3 Theme leader, a4 Special situation, a5 Earnings inflection, a6 Retail squeeze, a7 Defensive, a8 Macro hedge). They drive sizing discipline and stop logic. See the glossary for full definitions.
What is an invalidation trigger?
The explicit kill criterion published with every dossier. If the trigger fires, the conviction is stripped and the thesis is treated as broken. It's a published commitment, not a soft warning.
How often is the site updated?
Daily for the journal and dossiers; weekly for the macro view and sector rotation. Each surface carries its own dateModified meta and an updated-at line. Subscribe via JSON Feed or RSS.
Is this investment advice?
No. orbyd publishes educational content under the BaFin and EU regulatory framework. No personalised advice is given and no orders are accepted.