2026-W18

Big 4 earnings, spirit, dinosaurs, Taiwan — plus Skills go public

2026-04-27 to 2026-05-03

Benmore · Cattle Logic · Skills · FDE · Stripe · Tempo · Agentic Commerce · AI Replatforming · Solopreneurs · Stablecoin Payments · Network Effects · Solow Paradox · Globally-Native Startups · METR · Time Horizons · Capability Doubling · Benchmark Gap · Baptist-and-Bootlegger · Taiwan · China · Silicon Shield · TSMC Dependence · KMT vs DPP · Avalanche Decoupling · Economic Sanctions · Dinosaurs · Alternative Assets · Fossil Market · Pre-Buy Strategy · Odd Lots · Pershing Square · Closed-End Funds · Permanent Capital · Insurance Float · Concentrated Investing

Texas ranch pilot, Foundry v0.4, 71 Claude Code skills published, and a deep podcast week: Stripe economy index, METR time horizons, dinosaurs as an asset class, Taiwan silicon shield, and Pershing Square PSUS IPO.

What actually shipped

The big one: every Claude Code skill we use daily at Benmore is now public at /skills with a /raw endpoint per skill so any LLM can pull the full SKILL.md as plain text. 71 skills across payments, compliance, design eng, AI SEO, CLI, frontend, and tooling.
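A minimal sketch of how a script or agent might pull one skill as plain text. The URL shape `/skills/<slug>/raw`, the base URL, and the slug are assumptions for illustration; only the existence of a per-skill /raw endpoint comes from the note above.

```python
from urllib.parse import quote
from urllib.request import urlopen


def skill_raw_url(base_url: str, slug: str) -> str:
    """Build the raw-text URL for one skill.

    Assumption: the endpoint lives at <base>/skills/<slug>/raw.
    """
    return f"{base_url.rstrip('/')}/skills/{quote(slug, safe='')}/raw"


def fetch_skill_markdown(base_url: str, slug: str, timeout: float = 10.0) -> str:
    """Download a skill's SKILL.md as text. This performs a network request."""
    with urlopen(skill_raw_url(base_url, slug), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Usage would look like `fetch_skill_markdown("https://example.com", "some-skill")`, with the real site and a real skill slug swapped in for those placeholders.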

Cattle Logic crossed a real threshold this week: the first in-person ranch walkthrough in Amarillo. The onboarding flow is the kind of thing that only gets better when you watch someone with sun-cracked hands try to use it on a phone in 102°F heat. Three changes shipped the next day.

Reading

Andreessen's piece on agentic infra reframes "AI infrastructure" as agent substrate, not just GPU rental. The thing I keep coming back to: the unit of deployment is shifting from container to agent, and the orchestration primitives we have (k8s, Airflow) don't fit. New layer coming.


Stripe Sessions — AI Replatforming Global Trade

Watch on YouTube ↗ · John Collison & Emily Sands

Economic Dynamism and Business Formation

Solopreneurs scaling. There has been a significant surge in non-employer firm formations. AI lets solopreneurs achieve real scale — nearly 5 million Americans now earn their living this way, with a growing cohort crossing $100k in revenue. The class of 2026 is beating historical numbers.

Globally-native startups. Modern startups launch in multiple markets from day one. Among AI startups on Stripe, the median firm often earns most of its revenue internationally and sells into dozens of countries within its first year of operation. The old playbook of "go domestic first, expand later" is dead.

Agentic Commerce and Infrastructure

Levels of autonomy. Collison defines the progression from Level 1 (software filling forms / in-app checkout) to Level 3–5 (agents making fully independent purchasing decisions without human approval in the loop).

Machine-to-machine payments. The critical technical shift: autonomous payments. The keynote demos an agent using a stablecoin wallet via the Tempo CLI to execute micro-purchases for data — bypassing the latency and cost of fiat-based payment rails entirely. This is what "agentic commerce" looks like at the infrastructure layer: programmable money that settles in milliseconds with no interchange fee drag.

Orchestration. Tools like Stripe Projects and API-based integrations now handle deployment, API key management, and procurement autonomously. The grunt work is being automated away.

Value Complements in the Age of AI

The complement rule. When intelligence becomes abundant and cheap, its complements increase in value. Key complements Collison identifies:

  • Proprietary data — Data becomes a high-value training and reasoning asset. Companies are increasingly walling off data from AI crawlers to monetize it directly rather than giving it away.
  • Network effects — As AI simplifies discovery, marketplaces that aggregate buyers and sellers command higher take rates. The aggregator advantage compounds.
  • Real-world integration — John Deere is the example: enduring moats come from combining AI software with physical, hardware-heavy operations (GPS arrays, sensor networks, global physical assets). Software-only moats erode fast; physical-plus-software moats don't.

Productivity and the Solow Paradox

Drawing on the history of electrification (first commercial plant 1882, productivity dividend arrived in the 1920s) and early computing, Collison and Sands address the Solow Paradox: the phenomenon where technological breakthroughs show up everywhere except in official productivity statistics. Their argument: the lag exists because the economy must undergo structural replatforming and workflow redesign before the productivity dividends appear in measured output. We are still in the replatforming phase. The numbers will come.


Odd Lots — KMT, China, Taiwan, and Why a Land Invasion Can't Win Yet

Watch on YouTube ↗ · Guest: Eyck Freymann

The Silicon Shield and Economic Order

A blockade or conflict involving Taiwan isn't a regional security story; it's a catastrophic threat to the global economy. Because TSMC makes the high-end semiconductors that AI depends on, a shutdown would be a "hard reset" of the post-1989 global economic order. The chips raise the stakes of a conflict, but they are not the primary cause of the conflict risk.

Xi's Real Motivation

China's interest in Taiwan is primarily about political legitimacy and the unfinished business of the Chinese Civil War, not chip strategy. The party views national "rejuvenation" as incomplete without sovereignty over the island. Xi has staked his personal legacy on this — it predates and supersedes the semiconductor story.

Economic Shock Absorbers and the Ruble Trade

China has built significant stockpiles — oil, cotton, semiconductors — designed to outlast a Russia-style sanctions shock. The ruble playbook: Russia imposed capital controls, pushed interest rates to 20%, state enterprises paid in rubles, imports collapsed, exports surged on high oil prices, and the ruble recovered. China has studied every step.

Freymann's prescription: "avalanche decoupling" — gradual, strategic de-risking in specific critical sectors rather than attempting an impossible total decoupling overnight. The goal is to remove leverage before a crisis, not scramble during one.

Military Dynamics

China has numerically superior forces, but the US may hold a structural edge in naval warfare specifically. Unlike land attrition (where mass matters), sea warfare involves specialized platforms where early losses can cascade into structural collapse of capability. The Strait of Hormuz lesson: military capability alone is insufficient if political and economic systems can't sustain long-term disruption. Deterrence requires bipartisan consensus on economic resilience and tighter allied coordination.

Taiwan's Domestic Politics

  • DPP (Democratic Progressive Party) — emphasizes a distinct Taiwanese identity, strong US alignment
  • KMT (Kuomintang) — maintains traditional one-China framing, favors cautious diplomatic engagement with Beijing

The domestic split matters enormously for how Taiwan would respond under pressure. KMT alignment with the "one China" principle creates real tension with DPP governance when cross-strait relations deteriorate.


Odd Lots — METR: Measuring AI Autonomy with Time Horizon Charts

Watch on YouTube ↗ · Guests: Joel Becker & Chris Painter (METR)

The Core Metric

METR defines "time horizon" as the length of a task — measured by how long a skilled human takes to complete it — that an AI model can solve at a 50% success rate. They focus on complex autonomous engineering and ML research tasks to track the point at which AI could engage in recursive self-improvement or automate its own R&D process.

Why 50%?

Estimating high-reliability thresholds (e.g., 99%) requires vastly larger sample sizes and is highly sensitive to label noise. 50% is statistically stable and produces the clearest trend line. When you look at the 80% reliability curve instead, the progress appears less dramatic — which is the correct signal that many investment narratives are ignoring.
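A rough sketch of why the 80% horizon comes out shorter than the 50% one. It assumes success probability follows a logistic curve in the log of human task length; the intercept and slope below are made-up numbers for illustration, not METR's fitted values.

```python
import math


def success_probability(task_minutes: float, intercept: float, slope: float) -> float:
    """Logistic success curve over log2 of the human time a task takes."""
    z = intercept - slope * math.log2(task_minutes)
    return 1.0 / (1.0 + math.exp(-z))


def time_horizon(reliability: float, intercept: float, slope: float) -> float:
    """Task length in minutes at which the curve equals the given success rate."""
    logit = math.log(reliability / (1.0 - reliability))
    return 2.0 ** ((intercept - logit) / slope)


# Hypothetical fit parameters, chosen only to make the arithmetic easy to follow.
INTERCEPT, SLOPE = 3.0, 0.5

h50 = time_horizon(0.5, INTERCEPT, SLOPE)  # horizon at 50% success
h80 = time_horizon(0.8, INTERCEPT, SLOPE)  # horizon at 80% success, always shorter
print(f"50% horizon: {h50:.1f} min, 80% horizon: {h80:.1f} min")
```

On the same fitted curve, asking for higher reliability always moves the horizon to shorter tasks, which is why the 80% chart looks less dramatic than the 50% one.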

The Exponential

Measured time horizons have been doubling roughly every 4 months. Frontier-lab spending on compute and R&D has been tracking this growth, so the trend is heavily capital-supported. Given the infrastructure already committed for the coming years, significant deceleration is unlikely even if the industry slows at the model layer.
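To make the doubling claim concrete, here is a small sketch of the arithmetic. It assumes a clean exponential with a fixed 4-month doubling time; the 60-minute starting horizon is a hypothetical number, not a figure from the episode.

```python
import math


def projected_horizon(current_minutes: float, months_ahead: float,
                      doubling_months: float = 4.0) -> float:
    """Extrapolate a time horizon forward under a constant doubling time."""
    return current_minutes * 2.0 ** (months_ahead / doubling_months)


def months_to_reach(current_minutes: float, target_minutes: float,
                    doubling_months: float = 4.0) -> float:
    """Months needed to grow from the current horizon to a target horizon."""
    return doubling_months * math.log2(target_minutes / current_minutes)


# Hypothetical starting point: a 60-minute horizon today.
one_year_out = projected_horizon(60.0, 12.0)   # three doublings in 12 months
wait_for_8h = months_to_reach(60.0, 8 * 60.0)  # months until an 8-hour horizon
```

Three doublings a year is an 8x change, so small errors in the doubling time compound quickly in any multi-year projection.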

Common Misreading of the Charts

Most viewers assume the charts show a model working autonomously for a set duration (e.g., "can work for 12 hours"). Wrong. They measure task difficulty — how long it takes a skilled human to complete the equivalent work — not elapsed wall-clock time. Joel Becker specifically flags this as the most widespread misinterpretation.

Real-World Gaps

Benchmark success doesn't map cleanly to productivity. Real tasks involve large messy codebases, collaboration, and human verification in non-deterministic workflows. The charts don't capture that friction: they measure capability ceiling, not deployment reality.

The Baptist and Bootlegger Dynamic

The same companies building cutting-edge models are the most vocal about existential risks. METR operates as a nonprofit specifically to maintain evaluator neutrality — though they depend on access to proprietary models from those same labs. Public awareness of capability trends is a prerequisite for effective governance, even if it distorts investor behavior as a side effect.


Odd Lots — Dinosaurs as an Alternative Asset Class

Watch on YouTube ↗ · Guest: Salomon Aaron (David Aaron Gallery)

Market Structure

High-value fossil discovery is labor-intensive, seasonal, and geologically constrained. The pricing paradigm shifted dramatically: the Stan T-Rex ($31M) and Apex Stegosaurus ($45M) sales reset expectations from historical low-millions to tens-of-millions. The market still lacks the historical pricing data that anchors traditional art valuation — comps are thin and recent.

Due Diligence

Because fossil legal status varies by country, the gallery focuses exclusively on American specimens found on private land where title is clear. Minimum documentation required:

  • GPS coordinates of the discovery site
  • Legal land deeds proving private ownership
  • Comprehensive photographic and video documentation of the piece in situ — proving it wasn't assembled from disparate sources after the fact

Bone Maps

The critical technical challenge: distinguishing real bone mass from resin or 3D-printed reconstruction. Professional paleontologists and conservators produce precise "bone maps" quantifying the percentage of original material in each specimen. Misrepresenting a composite specimen as majority-original is the primary fraud vector in this market. This is the equivalent of provenance authentication in the art market, but harder to fake because the science is deterministic.

The Pre-Buy Strategy

Because sovereign wealth funds and institutional buyers compete intensely for top specimens, sophisticated collectors now fund excavation and preparation before the piece is fully uncovered or assembled. You're betting on what's in the ground before you know how complete it is — high-risk, but the only way to access the best specimens before they go to auction.

Private Capital and Public Science

A key tension: private ownership vs. public scientific access. Aaron argues that the influx of private capital makes large-scale exploration economically feasible, which can benefit science when specimens are loaned or donated to museums. The Colchester Museum arrangement is cited as a model. Without private money, many of the most significant discoveries would stay in the ground indefinitely.

Investment Risks

High maintenance costs, regulatory risk (legal status could shift), low liquidity vs. traditional asset classes, and no historical return data. Aaron's position: the primary driver should be historical interest, not financial speculation. It's a collectible first, an investment second.


Bill Ackman — Pershing Square PSUS: Closed-End Fund IPO

Watch on YouTube ↗ · Bill Ackman, CEO of Pershing Square

Structural and Strategic Innovation

Why a closed-end fund? PSUS adopts a closed-end fund legal structure not because it's a traditional fund, but because the structure is highly tax-efficient and flexible for Pershing Square's specific investment strategy. Ackman intends to operate it more like an investment holding company than a typical retail closed-end fund. Closed-end funds often trade at a discount to NAV; the goal is to compress that discount over time through performance and communication.
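A quick sketch of the discount arithmetic, using the standard definition of premium or discount to NAV. The prices, NAV, and growth rate in the example are hypothetical, not PSUS figures.

```python
def premium_to_nav(price: float, nav_per_share: float) -> float:
    """Premium (positive) or discount (negative) of market price to NAV."""
    return (price - nav_per_share) / nav_per_share


def shareholder_return(nav_growth: float, start_premium: float,
                       end_premium: float) -> float:
    """Price return when NAV grows and the premium or discount changes.

    Price = NAV * (1 + premium), so the price return combines both effects.
    Distributions are ignored to keep the sketch simple.
    """
    return (1.0 + nav_growth) * (1.0 + end_premium) / (1.0 + start_premium) - 1.0


# Hypothetical: shares at $90 against $100 of NAV is a 10% discount.
discount = premium_to_nav(90.0, 100.0)

# Hypothetical: NAV grows 10% while the discount narrows from 20% to 10%.
boosted = shareholder_return(0.10, -0.20, -0.10)
```

In that second example the shareholder earns about 23.75% on 10% NAV growth, which is why compressing the discount matters as much as the portfolio's own performance.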

No incentive fees. PSUS charges no incentive fees, making it the lowest-cost version of Pershing Square's offerings across all their vehicles. That's structurally different from the offshore funds which have historically charged performance fees.

Permanent capital compounding. Unlike traditional alternative asset managers that rely on constant fundraising — and shrink when they return capital — Ackman intends to grow PSUS primarily through long-term compounding of net asset value (NAV). The capital is permanent: no redemptions, no rolling vintage problem.

Investment Philosophy and Market Outlook

Current positioning. Bullish: some of the highest-quality businesses globally are trading at historically low multiples. The firm focuses on concentrated, large-cap, liquid assets — a deliberately narrow mandate.

The Berkshire insurance float model. Ackman draws a direct parallel to Berkshire Hathaway's structure. Vantage Holdings (an insurer) will have its assets managed in a manner similar to how Buffett uses insurance float to compound value — patient, long-duration capital deployed into equities at attractive prices.

SEC transparency advantage. The listed, SEC-registered status of PSUS allows for more transparent investor communication than the restrictions placed on their previous offshore, private funds. Ackman has consistently argued that transparency is itself a competitive advantage in the long-duration compounding game.

Track Record

  • 19% return on equity over the last 22 years
  • 24.9% return over the 8 years since adopting permanent capital
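For a sense of scale, a short sketch of what those rates imply if they are annualized compound returns. That reading is an assumption; the notes above don't say how the figures are calculated.

```python
def growth_multiple(annual_return: float, years: float) -> float:
    """Ending value per $1 invested under constant annual compounding."""
    return (1.0 + annual_return) ** years


# Assumption: both figures are annualized and compound every year.
multiple_22y = growth_multiple(0.19, 22)   # 19% a year for 22 years
multiple_8y = growth_multiple(0.249, 8)    # 24.9% a year for 8 years
print(f"22 years at 19%: {multiple_22y:.1f}x, 8 years at 24.9%: {multiple_8y:.1f}x")
```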

What's next week

Foundry v0.5 — multi-tenant audit log emission, and W19 lands Sunday with the ranch-walkthrough learnings written up properly.

Changelog · live

  1. Modal UX: clicking a watched item now opens a scrollable detail modal (react-markdown, ESC/X/backdrop to close) instead of jumping to a page anchor.

  2. Backfilled Thursday (Apr 30) deep-dive session across all 5 podcasts in the learned rail.

  3. Added Pershing Square PSUS entry. Expanded tags to 35. Full prose notes in body.

  4. Added 4 podcast entries (Stripe, Taiwan, METR, Dinosaurs) with inline notes and anchor-linked detail sections.

  5. Added explicit image overrides for the Karpathy thumb and Latent Space logo.

  6. Initial publish — rails, prose, and home-page surfacing.