Livermore-Style Tactical Allocation for Agent Fleet Inference Spend

1. Overview

Jesse Livermore's early-20th-century tactical playbook (pivot points, concentration in market leaders, pyramiding into proven positions, and disciplined stop-losses) was developed for equity trading, but it maps with surprising precision onto a problem facing autonomous agent fleets in 2025-26: how to allocate a finite inference and research budget across a rapidly shifting menu of LLM providers, embedding APIs, search vendors, and compute markets. The frontier model market behaves like a momentum-driven equity tape: capability leadership rotates (GPT-4 → Claude 3.5 → GPT-4o → Claude 3.5 Sonnet new → o1 → Claude 3.7 → GPT-4.5 → o3 → Gemini 2.5), pricing resets are discontinuous, and quality-per-dollar curves move on weekly timescales. This note translates Livermore's four core disciplines into concrete agent-fleet allocation rules, then argues that Empirica's structured research notes function as the agent-economy equivalent of Livermore's tape-reading apparatus, that is, the signal source that triggers rotation.

2. Key Findings

  • Frontier model price cuts are step functions, not glides. OpenAI's input price fell from $30/M tokens (GPT-4, March 2023) to $2.50/M (GPT-4o, which launched in May 2024 at $5/M and was cut later that year), and the small tier reached $0.15/M (GPT-4o-mini): roughly a 12× compression at the flagship tier and 200× from GPT-4 to GPT-4o-mini, in under 18 months (OpenAI pricing — https://openai.com/api/pricing/). Anthropic's Claude 3.5 Haiku launched at $0.80/M input, Claude 3.5 Sonnet at $3/M (Anthropic pricing — https://www.anthropic.com/pricing). Google's Gemini 1.5 Flash sits at $0.075/M (Google AI pricing — https://ai.google.dev/pricing). These are pivot-point-like discontinuities: prices stay flat, then drop in a single step.
  • Capability leadership rotates on a 60–120 day cadence. Public leaderboards (LMSYS Chatbot Arena — https://lmarena.ai, Artificial Analysis — https://artificialanalysis.ai) show the #1 slot changing hands repeatedly across 2024-25 between OpenAI, Anthropic, and Google, with xAI Grok and DeepSeek entering the top tier in late 2024 / early 2025. [EMPIRICA ANALYSIS] The half-life of "best model on this benchmark" is roughly one fiscal quarter.
  • Cost variance across providers for roughly comparable quality is currently 10–50×. DeepSeek-V3 at ~$0.27/M input vs Claude 3.5 Sonnet at $3/M is an 11× gap (DeepSeek pricing — https://api-docs.deepseek.com/quick_start/pricing). The list-price extreme is wider still: o1-preview at $15/M input vs Gemini Flash at $0.075/M is a 200× gap, although those two models are not quality-equivalent. Quality-adjusted, the spread narrows but remains material.
  • Routing techniques exist but are decision-quality bound. Approaches like RouteLLM, FrugalGPT, and Martian's model router demonstrate 40–85% cost reduction at matched quality on benchmark tasks, but their effectiveness depends on accurate, current capability assessments — which decay quickly as new models ship.
  • Stop-loss equivalents are underused. Most agent fleets do not have automated rotation triggers; observation of customer infrastructure suggests provider lock-in via prompt-engineering, fine-tunes, and feature-coupling (structured outputs, prompt caching, vision) creates switching costs that violate Livermore's "cut losses fast" principle. [SPECULATIVE]
  • Pyramiding into proven providers is rational but rarely formalised. When a provider demonstrates sustained quality+price leadership, scaling allocation toward them captures compounding cost-efficiency and discount tiers (enterprise contracts, committed-spend discounts on AWS Bedrock and Azure OpenAI). [EMPIRICA ANALYSIS]
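
The price ratios quoted in the findings above can be reproduced with a few lines of arithmetic. The sketch below uses only the input list prices cited in this section. The dictionary keys are informal labels chosen for illustration, not API model identifiers, and the figures will go stale as providers reprice.

```python
# Input list prices in USD per million tokens, copied from the findings above.
# Keys are informal labels for illustration, not API model identifiers.
INPUT_PRICE_PER_M = {
    "gpt-4": 30.00,
    "gpt-4o": 2.50,
    "gpt-4o-mini": 0.15,
    "claude-3.5-sonnet": 3.00,
    "claude-3.5-haiku": 0.80,
    "gemini-1.5-flash": 0.075,
    "deepseek-v3": 0.27,
    "o1-preview": 15.00,
}


def price_ratio(expensive: str, cheap: str) -> float:
    """Return how many times more the first model costs per input token."""
    return INPUT_PRICE_PER_M[expensive] / INPUT_PRICE_PER_M[cheap]


# Pairs discussed in the findings: (more expensive, cheaper).
PAIRS = [
    ("gpt-4", "gpt-4o"),
    ("gpt-4", "gpt-4o-mini"),
    ("claude-3.5-sonnet", "deepseek-v3"),
    ("o1-preview", "gemini-1.5-flash"),
]

if __name__ == "__main__":
    for expensive, cheap in PAIRS:
        print(f"{expensive} vs {cheap}: {price_ratio(expensive, cheap):.1f}x")
```

These are raw list-price ratios only (roughly 12×, 200×, 11×, and 200× for the four pairs). They say nothing about quality, output-token pricing, or caching discounts, all of which change the effective spread.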

3. Agent Service Patterns — Livermore Translated

3.1 Pivot Points → Provider Regime Changes

Livermore defined a pivot point as the price level at which a stock's character changed — a breakout or breakdown that signalled a new trend phase. The agent-economy analogue is a provider regime change: a model release, price cut, latency improvement, or context-window expansion that re-ranks the cost/quality frontier.
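
One way to make "re-ranks the cost/quality frontier" operational is to compute the set of non-dominated offers before and after an event and flag a pivot when that set changes. The following is a minimal sketch under stated assumptions: the `Offer` type, the provider names, the prices, and the quality scores are all hypothetical, and a single scalar quality score stands in for whatever evaluation a fleet actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    provider: str        # hypothetical provider label
    price_per_m: float   # USD per million input tokens
    quality: float       # hypothetical score, higher is better


def frontier(offers):
    """Return offers that no other offer beats on both price and quality."""
    kept = []
    for o in offers:
        dominated = any(
            p.price_per_m <= o.price_per_m
            and p.quality >= o.quality
            and (p.price_per_m < o.price_per_m or p.quality > o.quality)
            for p in offers
        )
        if not dominated:
            kept.append(o)
    return sorted(kept, key=lambda o: o.price_per_m)


def frontier_change(before, after):
    """Return (entered, exited): providers that joined or left the frontier."""
    old = {o.provider for o in frontier(before)}
    new = {o.provider for o in frontier(after)}
    return new - old, old - new


# Hypothetical snapshots: provider_c ships a cheaper, better model.
before = [
    Offer("provider_a", 3.00, 0.90),
    Offer("provider_b", 0.80, 0.75),
    Offer("provider_c", 1.50, 0.70),
]
after = [
    Offer("provider_a", 3.00, 0.90),
    Offer("provider_b", 0.80, 0.75),
    Offer("provider_c", 0.30, 0.80),
]

if __name__ == "__main__":
    entered, exited = frontier_change(before, after)
    if entered or exited:
        print("pivot: entered", sorted(entered), "exited", sorted(exited))
```

In this toy example provider_c enters the frontier and provider_b is pushed off it, which is the kind of event the text treats as a pivot. Comparing only frontier membership is a deliberate simplification: a price cut by a provider that is already on the frontier and displaces nobody would not be flagged, so a real trigger would likely also track how far the frontier moved.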