Morning Briefing · Tuesday, April 7, 2026

Anthropic Signs 3.5 GW TPU Deal With Google and Broadcom — $30B Run Rate Revealed

Plate I: leaf · spine
Schematic leaf-spine fabric — explicit-path traffic flows across the spine plane, pods at the edges.

Amaze Networks Morning Briefing — Tuesday, April 7, 2026


№ 01 · Top Highlights

Top 3 Highlights

1. Anthropic Signs 3.5 GW TPU Deal With Google and Broadcom — $30B Run Rate Revealed

TL;DR: Anthropic has secured 3.5 gigawatts of Google TPU compute capacity via Broadcom, starting in 2027, and says its revenue run rate has crossed $30 billion, more than triple the $9B run rate at the end of 2025.

Key Points:

  • Deal gives Anthropic access to next-gen Google TPU accelerators built by Broadcom, adding to 1 GW already contracted for 2026
  • Enterprise customers spending >$1M/year more than doubled in under two months: 500 in February to 1,000+ today
  • Infrastructure will be predominantly US-based, tied to Anthropic's $50B American AI compute pledge from November 2025
  • Broadcom is also building new datacenter networking chips for Google in the same agreement
  • This is a signal that custom silicon (TPUs) is increasingly viable against NVIDIA at Anthropic's scale

Deep Dive:

The Anthropic-Google-Broadcom announcement reads, on the surface, like a vendor deal. It's actually a statement about the structure of the AI infrastructure market. Anthropic is locking in 3.5 gigawatts of TPU capacity years in advance because compute availability — not model quality — is now the primary constraint on AI revenue. The doubling of million-dollar enterprise accounts in eight weeks tells you adoption is genuinely accelerating, not just being announced.

The networking angle is often underreported: Broadcom is building new AI datacenter networking chips for Google in this same deal. These are purpose-built interconnects for the TPU Pod infrastructure that moves training gradients and inference tokens between chips. This is the Ethernet-for-AI story continuing at hyperscale — custom silicon for switching and routing inside the datacenter, designed around the actual traffic patterns of transformer-based workloads, not traditional ECMP assumptions.

For infrastructure engineers, the signal is this: the compute wars are increasingly decided by power contracts and custom silicon supply chains, not benchmark scores. That shifts where the interesting engineering problems live — from model architecture to fabric design, power delivery, and efficient inference serving. The demand side is now predictable enough that hyperscalers are signing 3-5 year infrastructure commitments. That's a different kind of industry than the one that existed two years ago.

So What? The AI infrastructure market has crossed a threshold where custom silicon and multi-gigawatt compute deals are standard operating procedure — if you're speccing AI fabric, start asking vendors about their TPU interconnect roadmaps, not just their GPU specs. The Broadcom networking chip work is the part to watch.

Sources: The Register, The Next Web, Silicon Republic


2. AI Datacenters Are Rediscovering What HFT Figured Out 20 Years Ago

TL;DR: A detailed analysis from Data Center Knowledge draws a direct line between high-frequency trading infrastructure techniques — deterministic networking, kernel bypass, precision timing — and what AI inference workloads actually need.

Key Points:

  • Real-time inference requires microsecond-scale responsiveness, deterministic packet delivery, and minimal jitter, the same class of requirements HFT has
  • Techniques like DPDK (kernel bypass), Precision Time Protocol synchronization, and lossless fabric design originated in trading infrastructure
  • AI all-reduce operations during training create synchronized burst traffic that destroys conventional network buffers — same problem HFT solved with co-location and low-latency switching
  • GPU all-reduce synchronization requires every node to finish before the next iteration, making tail latency the critical metric — not average throughput
  • PFC/ECN tuning borrowed from trading infrastructure is now standard in RoCEv2 AI fabrics

Deep Dive:

The framing here — AI datacenters learning from HFT — is useful because it recontextualizes what "network performance" means for AI workloads. In traditional enterprise networking, latency in the single-digit milliseconds is fine. In HFT, nanoseconds matter. AI training sits somewhere in between, but the principles are identical: you cannot afford jitter, you cannot afford unpredictable buffer behavior, and you need every node synchronized to a common time reference.
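To make the tail-latency point concrete, here is a minimal sketch. The latency figures are invented for illustration and `allreduce_step_time` is a hypothetical helper, not part of any library. It models only the synchronization barrier: a synchronous all-reduce step completes when the slowest node does.

```python
from statistics import mean


def allreduce_step_time(node_latencies_us):
    """A synchronous step finishes only when the slowest node has finished."""
    return max(node_latencies_us)


# Hypothetical per-node communication latencies for one step, in microseconds.
steady = [100, 100, 100, 100, 100, 100, 100, 100]
jittery = [60, 60, 60, 60, 60, 60, 60, 380]  # same mean, one straggler

for name, latencies in (("steady", steady), ("jittery", jittery)):
    print(
        f"{name}: mean={mean(latencies):.0f}us "
        f"step={allreduce_step_time(latencies)}us"
    )
```

Both fabrics average 100 µs per node, but the jittery one takes 380 µs per step, which is why the average alone says little about training throughput.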

Kernel bypass via DPDK is the most directly applicable technique. By moving packet processing into user space and taking the kernel's network stack out of the I/O path, you eliminate microseconds of per-packet software overhead. Trading systems have done this for two decades. The same approach is now appearing in AI inference serving, particularly for high-throughput token generation where the bottleneck is moving data across the interconnect, not raw compute.

The deeper lesson is architectural: deterministic networking is a design philosophy, not a feature flag. HFT built entire switching fabrics around it — custom ASICs, cut-through forwarding, priority queuing tuned to microsecond budgets. AI datacenters are arriving at the same place from a different direction. The convergence is happening, and infrastructure engineers who understand both domains are becoming genuinely rare and valuable.

So What? If you're designing AI inference fabric, read the HFT networking literature — seriously. Priority Flow Control, ECN marking thresholds, and precision timing are the same problems with different workload names. The techniques exist; the practitioners who know them are just in different industries.
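As one illustration of what "ECN marking thresholds" means in practice, the sketch below uses the common WRED-style marking curve: no marking below a minimum queue depth, a linear ramp up to a maximum probability, and marking of every packet beyond the upper threshold. The function name and the threshold values are assumptions chosen for the example, not recommended settings for any switch.

```python
def ecn_mark_probability(queue_kb, kmin_kb, kmax_kb, pmax):
    """Probability that a packet is ECN-marked at a given queue depth."""
    if queue_kb <= kmin_kb:
        return 0.0  # queue is shallow: never mark
    if queue_kb > kmax_kb:
        return 1.0  # queue is past the upper threshold: mark everything
    # Linear ramp from 0 at kmin up to pmax at kmax.
    return pmax * (queue_kb - kmin_kb) / (kmax_kb - kmin_kb)


# Hypothetical thresholds: start marking at 100 KB, saturate at 400 KB.
for depth_kb in (50, 250, 400, 500):
    p = ecn_mark_probability(depth_kb, kmin_kb=100, kmax_kb=400, pmax=0.2)
    print(f"queue={depth_kb}KB mark_probability={p:.2f}")
```

Tuning is largely a matter of placing the lower threshold early enough that senders slow down before Priority Flow Control has to pause the link.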

Sources: Data Center Knowledge


3. Itential FlowAI Brings Agentic Orchestration to Network Operations — With Governance Baked In

TL;DR: Itential's FlowAI bridges natural language intent and network automation workflows, routing all agentic actions through its enterprise control plane with full authentication, authorization, and audit trails.

Key Points:

  • FlowAI translates natural language intent into governed workflows executing across network, cloud, ITSM, and security systems
  • All agentic activity runs through Itential's control plane — authentication, authorization, and audit are non-negotiable, not bolt-on
  • Core components: Itential Platform, Automation Gateway (IAG), and an MCP Server for integration with external AI tools
  • Currently in private customer preview — limited enterprise access before GA
  • 451 Research independently validated the architecture as suitable for governed, autonomous infrastructure operations

Deep Dive:

The governance story is the most important part of FlowAI. Every agentic automation play this cycle faces the same objection from network teams: "I can't give an AI agent write access to production." Itential's answer is to make the control plane the mandatory choke point — the AI proposes, the platform authorizes based on role and policy, the action executes, and the audit trail is automatic. This is architecturally the right answer.
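The propose, authorize, execute, audit loop can be sketched in a few lines. This is not Itential's API: `ControlPlane`, its `submit` method, the role names, and the actions are all hypothetical, chosen only to show the shape of a mandatory choke point where every request is logged whether or not it is allowed.

```python
from dataclasses import dataclass, field


@dataclass
class ControlPlane:
    policy: dict  # role -> set of actions that role may perform
    audit: list = field(default_factory=list)

    def submit(self, role, action, target, execute):
        """Authorize a proposed action, record it, and run it only if allowed."""
        allowed = action in self.policy.get(role, set())
        self.audit.append(
            {"role": role, "action": action, "target": target, "allowed": allowed}
        )
        if not allowed:
            return None  # denied requests never reach the device
        return execute(target)


# Hypothetical policy: the agent may read configuration but not push it.
cp = ControlPlane(policy={"noc-agent": {"show_config"}})
read = cp.submit("noc-agent", "show_config", "leaf01", lambda t: f"config of {t}")
push = cp.submit("noc-agent", "push_config", "leaf01", lambda t: f"pushed to {t}")
print(read, push, len(cp.audit))
```

The design choice worth noticing is that the audit entry is written before the authorization decision takes effect, so denied attempts leave the same trail as successful ones.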

The MCP integration is clever. Rather than building a proprietary integration layer, Itential exposed an MCP server — which means any MCP-compatible AI client (Claude, GPT-based agents, etc.) can talk to the Itential Platform without custom integration. This is the same pattern we've been tracking since the MCP Dev Summit two weeks ago: MCP is becoming the enterprise integration bus for agentic systems, including network operations tooling.

The practical limitation right now is the private preview stage. The architecture is sound but the implementation is nascent. Organizations that want to experiment should get on the preview list now — Itential has historically had solid enterprise deployments and a realistic understanding of network operations constraints. This isn't a startup learning the domain from scratch.

So What? If your automation roadmap includes AI-assisted change management, FlowAI's governance model is the template to study. Request a preview slot now — the organizations that figure out governed agentic automation in 2026 will have a real operational advantage by 2027.

Sources: Itential FlowAI, PR Newswire


№ 02 · Automation

Network Automation

Plate II: automation
Source-of-truth pipeline — intent → diff → apply → verify, idempotent on every revolution.

Cisco Intersight Ansible Collection Expands Tenfold — 100+ Modules for Full Lifecycle Management

Red Hat's Q1 network automation recap confirms the Cisco Intersight Ansible collection grew from roughly 10 modules to over 100, covering Day 0 server provisioning through Day 2 operations including firmware upgrades and port configuration. This is the kind of boring-but-critical expansion that makes Ansible a viable full-lifecycle tool rather than a configuration-push utility. The collection now covers enough of the Intersight API surface that you can genuinely manage UCS infrastructure without touching the GUI.

So What? If you're running Cisco UCS/Intersight and haven't updated your Ansible collection in the last quarter, do it now. The Day 2 modules especially (firmware and port config) cover the tasks that used to force people to context-switch out of their automation pipelines.

Event-Driven Infrastructure Is Outpacing GitOps at Fast-Moving Shops

Multiple signals this week confirm the pattern we covered March 30: the fastest-moving infrastructure teams are moving beyond pure GitOps toward event-driven models where git remains the source of truth but automation triggers from operational events, not just CI/CD pushes. The pattern — BGP event fires → Nornir task executes → Batfish validates → NetBox sync — is maturing from blog post to production template.

The nuance worth noting: this isn't GitOps dying. It's GitOps being augmented. The git repository still holds the intended state. The event bus handles the triggers. Batfish handles the pre-flight validation. For teams that haven't built this yet, GitOps with Batfish validation is still the right first step — you can add event-driven triggers on top later.
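A rough sketch of that division of labor follows. It deliberately uses plain callables instead of real Nornir, Batfish, or NetBox calls, so every name here (`Event`, `Pipeline`, the event kinds, the action strings) is hypothetical. The callables stand in for the roles those tools play: build a candidate change, validate it pre-flight, then record the result.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Event:
    kind: str    # e.g. "bgp_session_down"
    device: str


@dataclass
class Pipeline:
    remediate: Callable  # builds a candidate change (the Nornir role)
    validate: Callable   # pre-flight check (the Batfish role)
    sync: Callable       # records the outcome (the NetBox role)
    allowed_kinds: set = field(default_factory=set)  # guardrail: opt-in events
    log: list = field(default_factory=list)

    def handle(self, event):
        if event.kind not in self.allowed_kinds:
            self.log.append(("ignored", event.kind))
            return False
        change = self.remediate(event)
        if not self.validate(change):
            self.log.append(("rejected", event.kind))
            return False
        self.sync(change)
        self.log.append(("applied", event.kind))
        return True


synced = []
pipeline = Pipeline(
    remediate=lambda e: {"device": e.device, "action": "reset_bgp_session"},
    validate=lambda change: change["action"] in {"reset_bgp_session"},
    sync=synced.append,
    allowed_kinds={"bgp_session_down"},
)
pipeline.handle(Event("bgp_session_down", "leaf01"))
pipeline.handle(Event("fan_failure", "leaf02"))
print(pipeline.log)
```

Only event kinds that were explicitly allowed trigger anything, which is one way to express "design the guardrails before the automation" in code.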

So What? Map your current automation triggers. If every change still starts with a git commit, identify the top three operational events that should trigger autonomous remediation — and design the guardrails before the automation.

GitOps Adoption at 64% — AI-Assisted Workflows Emerging

Surveys are showing 64% GitOps adoption with 81% of adopters reporting higher reliability and faster rollbacks. More interesting: tooling is emerging that lets AI systems propose infrastructure changes, summarize diffs, and provision resources from natural language while still enforcing policy and audit trails. This is the FlowAI pattern at a more general level.


№ 03 · AI / ML

AI / ML

Plate III: ai / ml
Embedding space — clusters carry related concepts; the highlighted query vector pulls its nearest neighbors.

Meta's AI Tribal Knowledge System: 50 Agents, 59 Context Files, 40% Fewer Agent Tool Calls

Meta published a detailed engineering post on how they built a system to encode tribal knowledge — design decisions, non-obvious patterns, inter-repo dependencies — that previously existed only in engineers' heads. The approach: 50+ specialized AI agents systematically read every file across a four-repository, 4,100-file codebase and produced 59 concise context files. Coverage went from 5% to 100% of code modules. Preliminary results show 40% fewer AI agent tool calls per task.

Three design decisions stand out: files are kept under 1,000 tokens (not encyclopedic summaries), they're opt-in and loaded only when relevant, and they go through multi-round quality review plus automated maintenance. The system re-validates itself every few weeks, detecting coverage gaps and auto-fixing stale references.
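The first two decisions (small files, loaded only when relevant) are easy to prototype. The sketch below is an assumption-heavy illustration, not Meta's implementation: `ContextFile`, `select_context`, the path-prefix relevance test, and the whitespace token estimate are all invented for the example. Only the 1,000-token budget comes from the post.

```python
from dataclasses import dataclass

MAX_TOKENS = 1000  # budget from the post; the counting method is an assumption


def approx_tokens(text):
    """Crude whitespace estimate; a real tokenizer would count differently."""
    return len(text.split())


@dataclass
class ContextFile:
    module: str  # path prefix the notes apply to
    text: str


def select_context(files, task_paths):
    """Return only files that are within budget and relevant to the task."""
    selected = []
    for f in files:
        relevant = any(path.startswith(f.module) for path in task_paths)
        if relevant and approx_tokens(f.text) <= MAX_TOKENS:
            selected.append(f)
    return selected


files = [
    ContextFile("playbooks/bgp/", "Peers are templated per pod; never edit by hand."),
    ContextFile("playbooks/ntp/", "NTP servers come from the site inventory."),
    ContextFile("playbooks/bgp/", "word " * 1500),  # over budget, so skipped
]
chosen = select_context(files, ["playbooks/bgp/peers.yml"])
print([f.module for f in chosen])
```

For a network automation repository, the same idea would map to one short notes file per role or task directory.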

So What? This is directly applicable to network automation codebases. If you have Nornir tasks or Ansible playbooks that require tribal knowledge to use correctly, a lightweight version of this approach — concise context files per role/task, opt-in loading — would materially improve how AI tools interact with your automation code.

LLM Infrastructure Deals Accelerating — Meta Signs $100B AMD Agreement

Meta has signed a multiyear agreement to purchase up to $100 billion worth of AMD chips (MI540 GPUs and CPUs) to diversify AI infrastructure and reduce NVIDIA dependency. Combined with the Anthropic-Google-Broadcom deal, this week is a reminder that the AI infrastructure market is being defined by multi-year compute contracts at scales that dwarf traditional IT procurement cycles.

Gartner: Explainable AI Will Drive 50% of LLM Observability Investment by 2028

A new Gartner prediction finds explainable AI tools will represent 50% of LLM observability investment by 2028, up from 15% today. This is the accountability layer the enterprise has been asking for — not just "did the model answer correctly" but "why did it answer that way and what data did it use." For infrastructure teams building AI-integrated automation, this is the governance signal: start building observability into your AI-assisted workflows now, not as an afterthought.


№ 04 · Datacenter

Datacenter

Plate IV: datacenter
Datacenter row — per-rack utilization at a glance. Cool colors are slack; warmer fills are pressure.

Vertiv MegaMod HDX: Prefabricated AI Racks Solving the Speed-to-Capacity Problem

Vertiv's MegaMod HDX (launched January 2026) is the clearest example yet of how prefabricated modular infrastructure is solving the AI deployment speed problem. The combo model supports up to 144 racks and 10 MW, with direct-to-chip liquid cooling and rack densities from 50 kW to over 100 kW. The key architectural shift: instead of building custom infrastructure per deployment, operators are buying pre-validated power + cooling modules and assembling them on site.
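A quick arithmetic check on those figures shows how the rack count and the power envelope trade off. It assumes, for simplicity, that the full 10 MW is available as rack power, which is an assumption for the example and not a Vertiv specification.

```python
MODULE_POWER_KW = 10_000  # 10 MW, from the figures above
MAX_RACKS = 144


def racks_supported(density_kw):
    """Racks that fit within both the rack limit and the power envelope."""
    return min(MAX_RACKS, MODULE_POWER_KW // density_kw)


avg_kw_per_rack = MODULE_POWER_KW / MAX_RACKS
print(f"average density with all 144 racks: {avg_kw_per_rack:.1f} kW/rack")

for density_kw in (50, 100):
    print(f"{density_kw} kW racks: up to {racks_supported(density_kw)} racks")
```

Under that assumption, filling all 144 racks averages about 69.4 kW per rack, while running every rack at 100 kW would exhaust the 10 MW at 100 racks.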

Vertiv also announced SmartRun, an overhead prefabricated infrastructure system integrating power distribution busbar, liquid cooling piping, hot-aisle containment, and network infrastructure in a single deployable module. The design philosophy — make repeatability the default — is the same one data centers should be applying to their automation stacks.

So What? If you're involved in AI datacenter procurement, the modular vs. custom infrastructure decision is now a speed-to-deployment calculation, not a cost calculation. Pre-validated modules from Vertiv or similar vendors are 30-50% faster to bring online. Factor that against the compute deadlines your organization is working toward.

Indianapolis Datacenter Protest Turns Violent — Community Opposition Is Real Infrastructure Risk

The Register reports gunshots were fired at the home of an Indianapolis city councilor who supported a planned data center development. This is an extreme example of a real trend: community opposition to datacenter builds is intensifying as residents experience power infrastructure strain, water consumption concerns, and land use conflicts. It's worth flagging for anyone involved in site selection — NIMBY risk is now a legitimate infrastructure planning factor, not just a PR concern.

LY Corp's OpenStack Consolidation: 164 Clusters Into One — A Case Study in Technical Debt

LY Corporation (Yahoo Japan + LINE parent) is collapsing 164 OpenStack clusters and 160,000+ VMs into a single unified cloud called "Flava." The driver: years of heavy customization to OpenStack made upgrades nearly impossible and security patching a multi-month project. The lesson is architectural: heavy forking of open-source infrastructure platforms creates compounding debt. The new Flava architecture stays aligned with upstream OpenStack, minimizes custom patches, and enables continuous updates.

So What? Audit your own customization debt in open-source platforms. Every custom patch you're carrying against SONiC, OpenStack, or Kubernetes is a future migration cost. Itching to customize? Upstream it instead.


№ 05 · Science

Science

Plate V: science
Field schematic — three-body stability under quasi-equal masses, drawn from the day's central result.

Oratomic Launches With Research Suggesting Fault-Tolerant Quantum at 10,000 Qubits Is Achievable

Oratomic, a neutral-atom quantum computing startup founded by researchers from Caltech, Harvard, Google, and Amazon, launched March 31 with a striking claim: fault-tolerant quantum computation capable of running Shor's algorithm (which breaks RSA encryption) may be achievable with roughly 10,000 reconfigurable atomic qubits — far fewer than most prior estimates suggested.

The technical basis is a new approach to quantum error correction using high-rate codes combined with the reconfigurability of neutral-atom arrays. Manuel Endres (Caltech, Oratomic co-founder) has already demonstrated trapping 6,000 atomic qubits, and the architecture allows dynamic rearrangement during computation for more efficient error correction. Separately, Google and Oratomic each published independent analyses suggesting quantum computers could crack ubiquitous encryption keys before 2030.

Why It Matters: This isn't just a cool physics result. If 10,000 qubits is the actual threshold for cryptographically relevant quantum computing — and neutral-atom platforms can scale to that level — the "quantum threat is decades away" planning assumption needs revision. Network security architects who've been treating post-quantum cryptography as a future problem should recalibrate.

So What? If your organization hasn't started a post-quantum cryptography inventory, start now. NIST's PQC standards are finalized. The question is no longer "if" but "when you'll need to migrate."
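A first-pass inventory can be as simple as tagging each asset by algorithm family. The sketch below is a starting point under stated assumptions: the asset names and the `classify` helper are hypothetical, and the family lists are deliberately short. It relies on two facts: Shor's algorithm targets factoring and discrete-log schemes (RSA, Diffie-Hellman, elliptic-curve), and NIST's finalized standards are ML-KEM, ML-DSA, and SLH-DSA.

```python
SHOR_VULNERABLE = ("RSA", "DSA", "DH", "ECDSA", "ECDH", "ED25519", "X25519")
NIST_PQC = ("ML-KEM", "ML-DSA", "SLH-DSA")


def classify(algorithm):
    """Tag an algorithm name as 'pqc', 'migrate', or 'review'."""
    name = algorithm.upper()
    if name.startswith(NIST_PQC):
        return "pqc"      # already on a NIST post-quantum standard
    if name.startswith(SHOR_VULNERABLE):
        return "migrate"  # public-key scheme in Shor's scope
    return "review"       # symmetric or unknown: needs a human look


# Hypothetical inventory of (asset, algorithm) pairs.
inventory = [
    ("vpn-gateway", "RSA-2048"),
    ("api-edge", "ECDSA-P256"),
    ("backup-store", "AES-256-GCM"),
    ("pilot-tls", "ML-KEM-768"),
]
report = {asset: classify(alg) for asset, alg in inventory}
print(report)
```

A real inventory would pull algorithm names from certificates, VPN profiles, and SSH configurations rather than a hand-written list.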

Sources: The Quantum Insider, HPCwire


№ 06 · Security

Security Architecture

Plate VI: security
Zero-trust egress — credentials are injected at the proxy boundary, never reaching the client runtime.

Zero Trust Market Crossing Into Mainstream — 60% of Enterprises Now Using Multiple Segmentation Forms

Forrester is calling 2026 the "Golden Age of Microsegmentation," with 60% of enterprises pursuing zero trust now using more than one form of microsegmentation (up from less than 5% in 2023). The technology has crossed from early adoption into mainstream deployment across healthcare, manufacturing, financial services, and government.

The most notable architectural development: modern microsegmentation platforms now feature AI-driven predictive policy suggestions and self-healing segments that isolate threats automatically. Illumio Insights (February 2026) added agentless east-west visibility by pulling firewall telemetry from Check Point and Fortinet — covering the "visibility before enforcement" gap that blocked many deployments.

So What? If your microsegmentation deployment stalled at the visibility phase, the 2026 platforms (particularly Illumio Insights) have removed the primary excuse. Agentless visibility means you can map east-west traffic without touching endpoints. There's no longer a good reason to wait.


№ 07 · Quick Takes

Quick Takes

  • Cloudflare Organizations Beta — Cloudflare launched granular organizational management for enterprise-scale multi-team deployments. Relevant for large teams running complex Cloudflare configurations. Architecture implications for ZTNA and DDoS policy management.

  • OpenClaw subscription lockout — Anthropic has blocked subscription-tier use of OpenClaw (the open-source Claude agent tool) due to demand overload. The tool has 335K GitHub stars and 135K exposed instances — reflecting the scale of adoption and the challenge of pricing agentic workloads.

  • AIOps adoption jumps 12 points — Enterprises moved from 42% to 54% AI-powered monitoring adoption in a single year (2025→2026), driven by telemetry overload at scale. The bottleneck is now skills: 63% of organizations report a shortage of professionals who can interpret ML outputs in operational contexts.

  • Cisco Catalyst Center Global Manager tips — Cisco published five practical tips for multi-site operations management with Catalyst Center Global Manager. Worth reviewing for shops running distributed on-premises deployments at scale.

  • GPUBreach attack via GPU Rowhammer — Researchers demonstrated privilege escalation via bit-flip attacks on GDDR6 GPU memory. Architecturally interesting: it extends the Rowhammer vulnerability class to GPU memory, which affects AI infrastructure with untrusted workload co-location. File under "GPU isolation is not just a software problem."


№ 08 · Watch Today

Watch Today

  • Oratomic/Caltech quantum encryption timeline papers — The independent Google and Oratomic analyses on RSA vulnerability timelines are worth reading in full. If they hold up under peer review, the post-quantum migration conversation changes significantly.
  • Itential FlowAI private preview — If you're evaluating AI-assisted network automation, get on the preview list before GA pricing is set.
  • Meta's tribal knowledge system paper — The arXiv preprint (2602.13521) is directly applicable to any engineering team with a complex automation codebase.
  • Anthropic's revenue trajectory — The doubling of million-dollar enterprise accounts in eight weeks is worth tracking. It's the clearest data point on enterprise AI adoption velocity in the market.

№ 09 · Automation

Pipeline Stats

  • Domains covered: Networking, Automation, AI/ML, Datacenter, Science, Security
  • RSS digest articles used: 5 high-signal articles (score ≥ 3.0)
  • Web searches: 9
  • Dedup rejections: 0 (all stories are beyond the 72-hour cooldown window — last run was April 3)
  • Quality score: 4/5
  • Lead story: Anthropic 3.5 GW TPU deal / AI infrastructure market structure
  • Fun one: AI datacenter learning from HFT — surprising cross-domain insight