Morning Briefing — Wednesday, May 13, 2026

№ 01·Top Highlights

Top 3 Highlights

1. OpsMill Infrahub Closes $14M Series A — Source of Truth Gets an MCP Interface

TL;DR: OpsMill raised $14 million to accelerate Infrahub, its graph-database-backed infrastructure platform, and is now shipping an MCP server that lets AI agents query validated network state directly — without custom retrieval glue or hallucinated topology.

Key Points:

Iris Capital led the round; Benhamou Global Ventures, Serena Capital, and Partech Partners participated
Infrahub models hundreds of thousands of infrastructure elements as a graph — tracking relationships between devices, interfaces, policies, VRFs, and services, not just flat asset records
The new MCP (Model Context Protocol) server exposes infrastructure data to any MCP-compatible AI agent, meaning Claude, Cursor, or any agentic tool can get deterministic, relationship-aware answers about network topology without a custom API shim
Eurofiber reduced service deployment time from five days to fifteen minutes post-Infrahub adoption
Core thesis: AI agent hallucinations are a data-quality problem before they're a model problem — agents operating on graph-structured validated data produce structurally different outputs than agents operating on generic context

Deep Dive:

The source-of-truth landscape has spent two decades trying to solve a human problem: where does the single record of network truth live? NetBox 4.6 (covered Monday) moved toward approval workflows and branching — good answers for human change management. Infrahub is solving a different problem: what does the data layer look like when the primary consumer is an AI agent?

Graph databases are the right answer for that specific problem. A relational schema can tell you what interfaces exist. A graph can tell you that interface Ethernet1/1 on Leaf-01 carries the BGP session to Spine-03, which is in the path for the VRF used by tenant-A's production application, which has a compliance requirement mapping to policy-set-12. An AI agent asked "is it safe to bounce Ethernet1/1 on Leaf-01?" needs that traversal. A flat IPAM database cannot provide it.
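The traversal described above can be sketched with a small in-memory dependency graph. The node names come from the example in the text; the adjacency-dict layout and the `impact_of` helper are illustrative assumptions, not Infrahub's schema or query API.

```python
from collections import deque

# Hypothetical dependency edges modelled on the example in the text.
# Each key maps to the things that are affected if the key goes down.
DEPENDENTS = {
    "Leaf-01:Ethernet1/1": ["bgp-session:Leaf-01<->Spine-03"],
    "bgp-session:Leaf-01<->Spine-03": ["vrf:tenant-A-prod"],
    "vrf:tenant-A-prod": ["app:tenant-A-production"],
    "app:tenant-A-production": ["policy:policy-set-12"],
}


def impact_of(node, graph=DEPENDENTS):
    """Return everything reachable downstream of `node`, in BFS order."""
    seen, order, queue = {node}, [], deque([node])
    while queue:
        current = queue.popleft()
        for dependent in graph.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order


if __name__ == "__main__":
    for item in impact_of("Leaf-01:Ethernet1/1"):
        print(item)
```

A flat asset table would stop at the first lookup; the breadth-first walk is what surfaces the tenant application and the compliance policy several hops away.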

The Eurofiber result — five days to fifteen minutes — is the clearest signal that the data model problem is not theoretical.

The MCP integration is what closes the loop for 2026. Rather than building one-off integrations for each model or agent framework, Infrahub exposes infrastructure state through the standard interface that AI agents already know how to use. For automation engineers, this means your source of truth can be a first-class participant in an agentic workflow — not just a read target that someone queries with custom scripts.
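For a sense of what "the standard interface" looks like on the wire, MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`. The sketch below only builds such a message; the tool name `query_topology` and its arguments are hypothetical, since the source does not list the tools Infrahub's server exposes.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(
    1, "query_topology", {"device": "Leaf-01", "interface": "Ethernet1/1"}
)

if __name__ == "__main__":
    print(json.dumps(request, indent=2))
```

Because every MCP-compatible client emits this same envelope, the integration work shifts from per-agent glue code to the tool definitions on the server side.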

So What? Before committing to any AI-assisted NetOps evaluation this year, check whether the platform's data layer exposes an MCP interface — if it doesn't, you're either building custom retrieval glue yourself or betting on an LLM's probabilistic memory of your topology.

Sources: SiliconAngle

2. NVIDIA Dynamo — Disaggregated Inference Orchestration at the Production Planning Horizon

TL;DR: Released as open source at GTC in March, NVIDIA Dynamo separates inference prefill and decode into independently scaled GPU pools. The RDMA-based KV cache transfer library connecting those pools is a direct design input for inference cluster fabric — one that most fabric planning conversations aren't accounting for yet.

Key Points:

Disaggregates prefill (compute-bound input processing) and decode (memory-bandwidth-bound token generation) into separate pools that scale independently
NIXL, the KV cache transfer library, moves KV state between prefill and decode nodes via Remote Direct Memory Access (RDMA): zero-copy, no CPU involvement, compatible with InfiniBand and RoCEv2 fabrics
Smart request routing uses KV cache locality and prefix-sharing detection to minimize redundant prefill computation
Mixture-of-Experts-aware scheduling routes tokens across MoE model shards with shard-aware placement
Break-even point: approximately eight or more GPUs across multiple nodes with high-concurrency workloads
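The KV-cache-locality routing idea in the list above can be illustrated with a toy router: send each request to the worker that already holds the longest matching token prefix, and fall back to the least-loaded worker on a tie. This is a simplified sketch, not Dynamo's actual router; the `workers` structure and both function names are assumptions.

```python
def shared_prefix_len(a, b):
    """Number of leading tokens that two sequences have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def pick_worker(request_tokens, workers):
    """Pick a worker name from `workers`.

    `workers` maps name -> {"cached": [token lists], "load": int}.
    Longest cached prefix wins; ties go to the lowest load, then name.
    """
    def score(name):
        worker = workers[name]
        best = max(
            (shared_prefix_len(request_tokens, c) for c in worker["cached"]),
            default=0,
        )
        return (-best, worker["load"], name)

    return min(workers, key=score)


if __name__ == "__main__":
    pool = {
        "decode-a": {"cached": [[1, 2, 3]], "load": 5},
        "decode-b": {"cached": [[1, 2, 3, 4]], "load": 9},
        "decode-c": {"cached": [], "load": 0},
    }
    print(pick_worker([1, 2, 3, 4, 5], pool))
    print(pick_worker([9, 9], pool))
```

A production router would weigh prefix reuse against load rather than ranking them strictly, but the strict ordering keeps the trade-off easy to see.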

Deep Dive:

The fundamental tension in shared-pool inference is that prefill and decode are adversarial when they share hardware. A GPU processing a long-context prefill for a new agentic workflow is simultaneously starving every active decode user. Disaggregating into separate pools means you can tune each resource type independently to the actual workload mix — more prefill capacity for long-context agents, more decode capacity for high-throughput serving. Neither resource compromises for the other.

The networking implication is NIXL. RDMA-based KV cache transfer means the bandwidth and latency of the fabric connecting prefill nodes to decode nodes are first-class performance variables, not secondary concerns. On an InfiniBand or RoCEv2 fabric, KV transfers can use GPU-direct RDMA, the same mechanism as NCCL collective communication during training. But the traffic profile is fundamentally different: smaller transfers, higher frequency, bidirectional between specific node pairs, more latency-sensitive, and not an all-reduce across the full cluster. If you're designing an inference cluster fabric and haven't asked "what are the Dynamo NIXL traffic patterns between my prefill and decode pools?", you haven't finished the design.
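A back-of-envelope calculation shows why the prefill-to-decode link matters. The sketch below uses the standard KV cache sizing rule (keys plus values, per layer, per KV head); the model dimensions, prompt length, and 400 Gb/s link speed are all assumptions chosen for illustration, and the result ignores protocol overhead and contention.

```python
def kv_bytes_per_token(layers, kv_heads, head_dim, bytes_per_elem):
    """KV cache bytes for one token: K and V, for every layer and KV head."""
    return 2 * layers * kv_heads * head_dim * bytes_per_elem


def transfer_seconds(total_bytes, link_gbps):
    """Ideal wire time for `total_bytes` on a link of `link_gbps` gigabits/s."""
    return total_bytes / (link_gbps * 1e9 / 8)


# Assumed model: 80 layers, 8 KV heads (grouped-query attention),
# head dimension 128, 16-bit elements. Not taken from the source.
per_token = kv_bytes_per_token(layers=80, kv_heads=8, head_dim=128, bytes_per_elem=2)
prompt_bytes = per_token * 8192            # assumed 8,192-token prompt
wire_time = transfer_seconds(prompt_bytes, link_gbps=400)

if __name__ == "__main__":
    print(f"{per_token / 1024:.0f} KiB per token")
    print(f"{prompt_bytes / 2**30:.2f} GiB for the prompt")
    print(f"{wire_time * 1000:.1f} ms at line rate")
```

Under these assumptions a single long prompt moves gigabytes of KV state, so tens of milliseconds of wire time land directly on time-to-first-token.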

NVIDIA's strategic position is transparent — open-sourcing the orchestration layer drives adoption of Spectrum-X switches and ConnectX NICs whose RDMA paths Dynamo's architecture favors. That doesn't make the engineering wrong. It does mean understanding the hardware assumptions baked into the default configuration before treating Dynamo as vendor-neutral.

So What? When designing inference cluster fabric, add NIXL KV transfer bandwidth as a distinct traffic class in your requirements — model it separately from model weight loading and serving traffic, since prefill-to-decode flows dominate latency under high-concurrency agentic workloads.

Sources: NVIDIA Dynamo GitHub, UltraDyne AI Analysis, DEV Community

3. Spillway — Disaggregated In-Switch Buffering Cuts Multi-DC LLM Training Iteration Time by 14%

TL;DR: A new arXiv paper demonstrates that cross-datacenter collective communication for LLM training routinely collides with intra-DC traffic at the destination, causing congestion collapse. Spillway intercepts dropped packets into disaggregated switch buffers and drains them once congestion clears — no endpoint changes, no framework modifications, 14% iteration time recovery.

Key Points:

The failure mode: cross-DC all-reduce and all-gather collectives arrive at destination-side switches simultaneously with normal intra-DC traffic; multi-millisecond congestion control loops are too slow to prevent queue overflow and packet loss before collective stall
Spillway identifies idle disaggregated switch buffer capacity and claims it as a temporary holding area for dropped packets, draining them once the congestion wave passes
Transparent to endpoints: no changes to PyTorch, JAX, NCCL, RCCL, or any RDMA stack
Validated via large-scale end-to-end simulation AND a hardware prototype — two-stage validation is stronger than typical networking papers that stop at simulation
Quantitative result: up to 14% reduction in training iteration time; the collision pattern tested is described as "common in real workloads," not a synthetic worst case

Plate II · Spillway: collective collision at the DC boundary

Cross-DC collective traffic (hot path) meets intra-DC flows at the destination leaf. Spillway absorbs dropped packets in disaggregated switch buffers and drains them post-congestion — without touching training frameworks or RDMA stacks.

Deep Dive:

Spillway complements the MRC transport paper (covered May 7) in a way that matters architecturally. MRC addresses within-fabric multipath reliability: how do you route RDMA flows across multiple network planes without head-of-line blocking? Spillway addresses a different boundary condition: what happens at the seam between wide-area inter-DC collective traffic and local intra-DC fabric traffic? These are adjacent failure modes in multi-DC LLM training, and the research community has addressed them within one week of each other. Together they sketch a layered reliability architecture for multi-cluster training: MRC handles multipath reliability within a DC pod, and Spillway handles the DC-to-DC collision point.

The elegant piece is the mechanism. Disaggregated buffer architectures have been discussed primarily as a cost and capacity optimization. Spillway repurposes the idle capacity as an active congestion absorption mechanism. The switch already has the hardware; Spillway changes what it's used for. That's the category of solution that gets deployed — it doesn't require new silicon, just new firmware behavior. The "no endpoint changes" claim deserves scrutiny at production scale since the paper is arXiv and not yet through peer review, but the problem formulation is accurate, the direction is sound, and the hardware prototype validation is stronger evidence than simulation alone.
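The absorb-and-drain idea can be illustrated with a toy tick-based queue model. This is not the paper's algorithm or simulator: the `simulate` function, the arrival pattern, and every capacity value are assumptions, and packet reordering is not modelled.

```python
def simulate(arrivals, queue_cap, drain_per_tick, spill_cap=0):
    """Toy port queue with an optional spill buffer for overflow packets."""
    queue = spill = dropped = delivered = 0
    for arriving in arrivals:
        # 1. The port drains at its fixed rate.
        sent = min(queue, drain_per_tick)
        queue -= sent
        delivered += sent
        # 2. Parked packets re-enter the queue as room appears.
        refill = min(spill, queue_cap - queue)
        queue += refill
        spill -= refill
        # 3. New arrivals fill what is left; overflow is parked or dropped.
        admitted = min(arriving, queue_cap - queue)
        queue += admitted
        overflow = arriving - admitted
        parked = min(overflow, spill_cap - spill)
        spill += parked
        dropped += overflow - parked
    return {"delivered": delivered, "dropped": dropped,
            "in_queue": queue, "in_spill": spill}


# Assumed burst: two ticks of colliding traffic on top of a light baseline.
BURST = [2, 2, 12, 12, 2, 2, 0, 0, 0, 0, 0, 0]

if __name__ == "__main__":
    print("no spill  :", simulate(BURST, queue_cap=8, drain_per_tick=4))
    print("with spill:", simulate(BURST, queue_cap=8, drain_per_tick=4, spill_cap=16))
```

In this toy run the port drops 12 of 32 packets without a spill buffer and none with one; the parked packets simply arrive a few ticks later instead of triggering retransmission.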

So What? If you're designing multi-DC training infrastructure, add collective-to-intra-DC collision headroom to your switch buffer sizing model and ask your NOS vendor whether disaggregated buffer modes are configurable on your deployed ASICs.

Sources: arXiv 2605.11852

№ 02·Networking

Networking & Architecture

Plate III · networking

Schematic leaf-spine fabric — explicit-path traffic flows across the spine plane, pods at the edges.

ipSpace SR-MPLS Hands-On Intro — First Real Lab Content in the Series

Following the May 11 series announcement (different URL), Ivan Pepelnjak published the first hands-on SR-MPLS lab content: a three-router containerized Arista EOS topology with IS-IS Prefix-SID advertisement, SRGB configuration at base 900000 with range 65536, and LFIB label operations — deployable in roughly one minute via GitHub Codespaces. Penultimate-hop popping is demonstrated with actual LFIB entries showing label swap and pop operations. The netlab automation layer brings the barrier to running a full SR-MPLS lab to approximately zero for anyone with a GitHub account. The series separately tracks SR-MPLS and SRv6 paths, reflecting their diverged adoption curves: SR-MPLS is the production SP workhorse; SRv6 uSID is where hyperscaler and AI fabric deployments are concentrating.
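The label arithmetic behind those LFIB entries is simple enough to sketch: a Prefix-SID label is the SRGB base plus the advertised SID index, and with penultimate-hop popping the last transit router removes the label. The sketch below uses the SRGB values from the lab description; the router names, the SID index 3, and the assumption that every router shares the same SRGB are illustrative.

```python
SRGB_BASE = 900000   # from the lab description
SRGB_RANGE = 65536   # from the lab description


def prefix_sid_label(index, base=SRGB_BASE, size=SRGB_RANGE):
    """MPLS label for a Prefix-SID index: SRGB base + index."""
    if not 0 <= index < size:
        raise ValueError(f"SID index {index} is outside the SRGB range")
    return base + index


def label_ops(path, dest_index):
    """Per-hop label operation along `path` (ingress first, egress last).

    Assumes an identical SRGB on every router and PHP at the hop
    before the egress.
    """
    label = prefix_sid_label(dest_index)
    ops = []
    for hop, router in enumerate(path[:-1]):
        next_is_egress = hop == len(path) - 2
        if hop == 0:
            action = "forward unlabeled" if next_is_egress else f"push {label}"
        else:
            action = f"pop {label}" if next_is_egress else f"swap {label} -> {label}"
        ops.append((router, action))
    return ops


if __name__ == "__main__":
    # Hypothetical three-router chain; r3 advertises SID index 3.
    for router, action in label_ops(["r1", "r2", "r3"], dest_index=3):
        print(router, action)
```

With a shared SRGB the swap keeps the same label value end to end, which is why the interesting LFIB entries in a small lab are the push at the ingress and the pop at the penultimate hop.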

So What? Pull the ipspace/SR-workshop GitHub repo and run the three-router topology this week — if you haven't touched SR-MPLS label operations in the last two years, the lab exposes LFIB behavior that documentation alone won't give you.

Sources: ipSpace.net

Arista AI Fabrics Taxonomy — Useful Framework, Thin on Specs

Arista's "The Many Facets of AI Fabrics" post lays out a three-tier scaling model: scale-up for unified intra-rack memory via low-latency non-blocking switches, scale-out for leaf-spine XPU reachability across racks, and scale-across for WAN/SRv6 uSID multi-cluster resource pooling. The SRv6 uSID reference for scale-across is the most substantive thread, connecting directly to SONiC 202505's static SRv6 uSID support (May 7) and the MRC three-layer SRv6 source-routing architecture from the same week. Three separate sources within one week all pointing at SRv6 uSID as the emerging AI cluster interconnect routing mechanism is a trend worth naming explicitly. What the post lacks: congestion control specifics, RoCEv2 configuration parameters, oversubscription ratios, and any quantitative design guidance. Use the taxonomy to communicate design intent, not to build to it.

So What? The scale-up / scale-out / scale-across framework is useful shorthand for AI fabric design conversations. Just don't treat this post as a design reference — the parameters to actually build aren't there.

Sources: Arista Networks Blog

№ 03·Automation

Automation & Programmability

Plate IV · automation

Source-of-truth pipeline — intent → diff → apply → verify, idempotent on every revolution.

Forward Networks Forward AI — Agentic Operations Grounded in Deterministic Math

Forward Networks launched Forward AI in April — an agentic system that plans and executes multi-step network operations workflows (incident triage, path tracing, compliance verification) while anchoring every recommendation to a mathematical digital twin rather than LLM probabilistic memory. The anti-hallucination architecture is the core differentiator: every claim Forward AI makes is directly verifiable against the twin during the conversation. The system won't assert facts the twin doesn't hold. Coverage spans Layer 2 through Layer 7 across on-premises, AWS, Azure, Google Cloud, and Kubernetes.
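The grounding pattern described here, asserting only what the ground-truth model holds, can be sketched in a few lines. This is a generic illustration, not Forward AI's implementation; the fact tuples, `TWIN_FACTS`, and the `ground` helper are all hypothetical.

```python
# Hypothetical ground-truth facts, stored as (subject, relation, object).
TWIN_FACTS = {
    ("Leaf-01", "has_interface", "Ethernet1/1"),
    ("Leaf-01", "bgp_peer", "Spine-03"),
}


def ground(claims, facts=TWIN_FACTS):
    """Split candidate claims into verified ones and ones to withhold."""
    verified = [claim for claim in claims if claim in facts]
    withheld = [claim for claim in claims if claim not in facts]
    return verified, withheld


if __name__ == "__main__":
    candidate_claims = [
        ("Leaf-01", "bgp_peer", "Spine-03"),
        ("Leaf-01", "bgp_peer", "Spine-04"),  # not present in the twin
    ]
    verified, withheld = ground(candidate_claims)
    print("assert:  ", verified)
    print("withhold:", withheld)
```

The point of the pattern is the second list: anything the model proposes that the twin cannot confirm is surfaced as unverified rather than stated as fact.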

Forward Networks built a custom agentic framework rather than adopting LangGraph or AutoGen, citing that network topology context — VRFs, policy hierarchies, Layer 2/3 adjacencies, cross-cloud routes — is structurally incompatible with generic tool-call patterns those frameworks optimize for. This positions Forward AI at one end of the agentic NetOps design spectrum: deterministic grounding via mathematical verification. Infrahub (today's lead) occupies the data model end: graph-structured ground truth via MCP-native source of truth. Both are architecturally correct for different operational footprints.

So What? When evaluating agentic NetOps platforms, require a live demonstration where the agent makes a recommendation it cannot verify against a ground-truth data source. The system's response to that failure mode reveals more than any benchmark.

Sources: Network World

Pre-Change Network Validation at 58% Adoption — The 2026 Tool Landscape

EMA's 2026 State of the Network report finds 58% of teams now use a network modeling tool or digital twin for pre-change validation — a substantial jump from the manual-first baseline of three years ago. Network outages average $336,000 per hour; roughly 80% are preventable through better change management.

The current validation landscape has four distinct approaches used in combination: static verification (Batfish for invariant proofs in CI), enterprise-wide modeling (Forward Networks, covered above), configuration pipeline governance (Itential), and runnable mirror labs (EVE-NG, ContainerLab, Cisco Modeling Labs). A new entrant, NetPilot, claims to be the first AI-native productized runnable mirror lab, generating multi-vendor sandboxes from natural language descriptions in approximately two minutes. That's a vendor self-assessment from their own blog — treat the two-minute claim with appropriate skepticism, but the EMA adoption data is independently sourced.

So What? If your team is in the 42% still validating changes manually before pushing to production, the tooling landscape now has enough maturity that there's no credible cost argument left for staying there. Pick your entry point by scale: Batfish for CI-integrated invariant proofs, ContainerLab for runnable test environments, Forward Networks for enterprise multi-cloud modeling.

Sources: NetPilot Blog, citing EMA Research

№ 04·AI / ML

AI & Machine Learning

Plate V · ai / ml

Embedding space — clusters carry related concepts; the highlighted query vector pulls its nearest neighbors.

Frontier AI Safety Testing Is Introducing Classic IAM Failures

A RUSI (Royal United Services Institute) report finds that third-party frontier AI safety evaluations have introduced the same identity and access management failures that enterprise security has fought for decades: stolen credentials, overprivileged accounts, poor credential lifecycle management, delayed revocation, and inconsistent authentication standards — all attached to systems being evaluated for catastrophic misuse potential.

The structural paradox is real: more thorough safety testing requires more external access, and every access pathway is a new attack surface. State-sponsored actors and insider threats are the primary exploitation vectors mapped in the report. The architectural lesson for infrastructure and security teams is direct: AI model weights and internals are a privileged access tier, not a developer resource. The same privileged access workstations, just-in-time access, and zero-standing-privilege controls that apply to production infrastructure access apply to model access pathways. The AI safety community is reinventing PAM without realizing it exists.
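The just-in-time, zero-standing-privilege controls mentioned above reduce to one rule: no access exists until it is requested, and every grant carries an expiry. The sketch below is a minimal in-memory illustration; the `JitAccess` class, the four-hour policy ceiling, and the principal and resource names are assumptions, not any PAM product's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Grant:
    principal: str
    resource: str
    expires_at: datetime


class JitAccess:
    """No standing privileges: access exists only as a time-boxed grant."""

    def __init__(self, max_ttl=timedelta(hours=4)):
        self.max_ttl = max_ttl
        self._grants = {}

    def request(self, principal, resource, ttl, now):
        if ttl > self.max_ttl:
            raise ValueError("requested TTL exceeds policy maximum")
        grant = Grant(principal, resource, now + ttl)
        self._grants[(principal, resource)] = grant
        return grant

    def is_allowed(self, principal, resource, now):
        grant = self._grants.get((principal, resource))
        return grant is not None and now < grant.expires_at

    def revoke(self, principal, resource):
        self._grants.pop((principal, resource), None)


if __name__ == "__main__":
    pam = JitAccess()
    start = datetime(2026, 5, 13, 9, 0, tzinfo=timezone.utc)
    pam.request("evaluator@example.org", "model-weights", timedelta(hours=2), start)
    print(pam.is_allowed("evaluator@example.org", "model-weights", start + timedelta(hours=1)))
    print(pam.is_allowed("evaluator@example.org", "model-weights", start + timedelta(hours=3)))
```

Delayed revocation, one of the failures the report names, stops being a separate process in this model because expiry is part of the grant itself.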

So What? Before granting any third-party evaluator or auditor access to AI model weights or internal systems, apply the same just-in-time access and zero-standing-privilege controls you'd apply to production infrastructure — this is a solved IAM problem arriving in an unfamiliar domain.

Sources: The Register

№ 05·Datacenter

Datacenter & Infrastructure

Plate VI · datacenter

Datacenter row — per-rack utilization at a glance. Cool colors are slack; warmer fills are pressure.

AI Data Center Projects Now Take Seven-Plus Years from Start to Service

New PJM Interconnection data reveals that AI infrastructure projects entering service in 2025 took an average of more than seven years from inception to operations. The more significant finding: the bottleneck has shifted downstream. Projects previously stalled in the interconnection queue; they now spend roughly three years reaching an interconnection service agreement and another four years waiting after approval — transmission buildouts, substation capacity, and strained equipment supply chains are now the primary obstacles to energizing approved megawatts.

This changes the decision calculus for site selection and capacity planning. Queue position is no longer the primary variable; post-approval execution capacity is. Organizations evaluating new facility locations need to model transmission buildout timelines and substation availability alongside interconnection queue length.

So What? Add "post-approval transmission buildout timeline" to your site selection model for any new AI infrastructure facility — it's now the longer of the two primary constraints.

Sources: Data Center Knowledge

Nscale's $790M Norway Financing — Infrastructure Capital Goes Utility-Scale

Nordic and European lenders financed $790 million into Norwegian AI infrastructure developer Nscale, including an accordion facility for a potential 115MW expansion. The financing structure — backed by ABN AMRO, DNB Bank, Eksfin, Nordea, and SEB — uses mechanisms more typical of utility and industrial infrastructure than technology investments. This signals a meaningful shift in how financial markets are pricing AI datacenter risk: not as speculative tech expansion but as long-duration industrial infrastructure tied to energy access. AI datacenter capital is maturing as an asset class, with implications for procurement timelines, depreciation assumptions, and the competitive dynamics of capacity-building.

So What? Utility-style financing at this scale signals that AI infrastructure is being treated as a multi-decade asset — factor that into facility planning horizons and capital cost modeling for any multi-year infrastructure investment.

Sources: Data Center Knowledge

№ 06·Science

Science & Emerging Tech

Plate VII · science

Field schematic — three-body stability under quasi-equal masses, drawn from the day's central result.

Oxford Demonstrates Quadsqueezing — Fourth-Order Quantum Effect, 100x Faster Than Predicted

TL;DR: University of Oxford researchers demonstrated quadsqueezing — a fourth-order quantum effect — for the first time, using a single trapped ion, and produced it more than 100 times faster than conventional approaches were projected to allow. Published in Nature Physics, May 1, 2026. Peer reviewed.

Key Points:

Fourth-order quantum squeezing generates states more deeply entangled and more sensitive than anything previously accessible
Two precisely controlled forces with tuned frequencies, phases, and strengths cascade non-commuting interactions to amplify higher-order effects on a single trapped ion
Sequential demonstration from the same platform: standard squeezing → trisqueezing → quadsqueezing — no separate hardware setups
Immediately applied to lattice gauge theory simulation, a particle physics problem
The 100x speed advantage makes previously theoretical experiments practical today
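For readers who want the "order" in "fourth-order" made concrete, generalized squeezing is commonly written as an interaction that creates or annihilates n oscillator quanta at a time. The form below is the usual textbook convention, not notation taken from the paper; the coupling strength \Omega_n and phase \phi are symbols assumed here for illustration.

```latex
\hat{H}_n = \frac{\hbar\,\Omega_n}{2}
  \left( \hat{a}^{\,n}\, e^{-i\phi} + \left(\hat{a}^{\dagger}\right)^{n} e^{i\phi} \right),
\qquad
n = 2 \ (\text{squeezing}), \quad
n = 3 \ (\text{trisqueezing}), \quad
n = 4 \ (\text{quadsqueezing})
```

Each step up in n is a weaker native interaction in most platforms, which is why reaching n = 4 at a usable rate is the notable part of the result.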

Why It Matters: Squeezed states are foundational in quantum sensing (gravitational wave detectors, atomic clocks) and quantum computing. Higher-order squeezed states are candidates for bosonic qubit encoding — an alternative to superconducting transmons with intrinsic noise advantages. The lattice gauge theory application is a direct preview of quantum advantage in materials modeling, arriving before fault-tolerant quantum computers. This technique could accelerate sensing applications years ahead of the full-scale quantum computing timeline.

So What? Monitor follow-up papers from the Oxford group on bosonic qubit encoding using higher-order squeezed states — this could expand the viable qubit candidate menu for next-generation processors and has near-term quantum sensing implications.

Sources: Nature Physics (May 1, 2026), ScienceDaily

Time Crystals Make Contact — First Coupling to an External Device

Aalto University researchers created a time crystal in superfluid helium-3 and coupled it to an external mechanical oscillator, the first demonstration of a controllable time crystal. By adjusting the oscillator's frequency and amplitude, the team could tune the time crystal's behavior, analogous to optomechanical control. The helium-3 time crystal persisted for up to 10^8 oscillation cycles (several minutes), compared to microseconds-to-milliseconds for superconducting qubits. Time crystals were proposed in 2012 and first demonstrated around 2021; the missing piece has been whether they can do anything useful. Coupling to an external device is the prerequisite for integration into a real quantum system. Published in Nature Communications; the authors explicitly note potential applications in quantum computer memory and precision sensing.

So What? Track follow-up work from the Aalto group on alternative host materials — if the coupling mechanism generalizes to solid-state platforms, time crystals become quantum memory candidates with significant coherence advantages.

Sources: Nature Communications, ScienceDaily

IonQ Delivers First 256-Qubit System, Reports 755% Q1 Revenue Growth

IonQ reported Q1 2026 revenue of $64.7 million — a 755% year-on-year increase — anchored by the first sale of its sixth-generation 256-qubit chip-based trapped-ion system to the University of Cambridge, with a quantum networking partnership covering computing, networking, sensing, and security IP development. Remaining performance obligations reached $470 million, up 554% year-on-year; full-year 2026 guidance raised to $260–270 million. Customer systems begin commissioning by end of Q2 2027. Independent hardware benchmarks are not yet published.

755% revenue growth from a system requiring near-absolute-zero operation signals the transition from physics demonstration to early commercial deployment. IonQ's published fault-tolerant architecture blueprint shows they're competing on error-correction roadmap alongside raw qubit count. The Q2 2027 commissioning timeline means real-world 256-qubit benchmarks will arrive before most organizations complete post-quantum cryptography migration.
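The percentages above are easier to interpret next to the base they imply. The sketch below back-computes the prior-year figures from the reported numbers; the helper name is hypothetical and the outputs are derived arithmetic, not figures quoted from IonQ.

```python
def prior_value(current, pct_increase):
    """Value one year earlier implied by a year-on-year percentage increase."""
    return current / (1 + pct_increase / 100)


q1_2026_revenue = 64.7            # $M, reported
rpo_now = 470.0                   # $M, reported
guidance_mid = (260 + 270) / 2    # $M, midpoint of the raised guidance

implied_q1_2025_revenue = prior_value(q1_2026_revenue, 755)
implied_prior_rpo = prior_value(rpo_now, 554)
q1_share_of_guidance = q1_2026_revenue / guidance_mid

if __name__ == "__main__":
    print(f"Implied Q1 2025 revenue: ${implied_q1_2025_revenue:.1f}M")
    print(f"Implied prior-year RPO:  ${implied_prior_rpo:.1f}M")
    print(f"Q1 share of guidance midpoint: {q1_share_of_guidance:.0%}")
```

The implied base of roughly $7.6M is why the growth rate is so large: the percentage reflects a small starting point as much as the size of the current quarter.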

So What? Migration to NIST-finalized ML-KEM and ML-DSA should be an active project now for any data with five or more years of confidentiality value. IonQ's timeline makes "start the migration when hardware arrives" a losing strategy.

Sources: IonQ Investor Relations, The Quantum Insider

№ 07·Quick Takes

Quick Takes

Cisco AgenticOps Crosswork — "Dark NOC" in a white paper. Cisco's agentic framework (campus/branch GA February 2026, datacenter targeting June) is now formally described using "NOCless," "white NOC," and "dark NOC" as target states in a published white paper. When the incumbent networking vendor publishes whitepapers about eliminating the NOC as a design goal, the skills-mix conversation for network teams — automation-capable engineers versus ticket processors — becomes structurally urgent, not theoretical.

NVIDIA Nemotron LTM — Open-source 30B telco reasoning model. Released February 2026 through GSMA's Open Telco AI initiative. Fine-tuned on telecom standards and synthetic operational logs for fault isolation, remediation, and change validation. Key architectural feature: structured reasoning traces that capture each tool call and decision, making fine-tuning on your own incident logs produce an auditable, domain-grounded model.

LLM 0.32a2 — OpenAI reasoning models migrate to /v1/responses endpoint. Simon Willison's LLM CLI alpha documents that GPT-5 class reasoning models now use the /v1/responses endpoint instead of /v1/chat/completions, enabling interleaved reasoning during tool calls. Any automation pipeline or agentic workflow hard-coded to the completions endpoint will silently degrade with reasoning-capable models. Audit and update now.
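One way to start the audit is a plain text scan for the hard-coded path. The sketch below walks a directory tree and reports every line that mentions `/v1/chat/completions`; the function name and the list of file suffixes are assumptions, and a hit only flags a place to review, not a confirmed problem.

```python
import re
from pathlib import Path

PATTERN = re.compile(r"/v1/chat/completions")
# Assumed set of file types worth scanning; adjust for your repositories.
SUFFIXES = (".py", ".js", ".ts", ".yaml", ".yml", ".json", ".sh")


def find_hardcoded_endpoints(root, suffixes=SUFFIXES):
    """Return (path, line number, line) for each hard-coded endpoint reference."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    for path, lineno, line in find_hardcoded_endpoints("."):
        print(f"{path}:{lineno}: {line}")
```

Client libraries that build the URL internally will not show up in a scan like this, so pinned SDK versions deserve a separate check.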

Google Inference Gateway — 70% TTFT reduction claimed [unverified]. Google Cloud's Inference Gateway uses ML-driven capacity-aware routing to reportedly cut time-to-first-token by over 70%, integrating with NVIDIA Dynamo and GKE. Claims from Google's own blog; independent benchmarks not yet published. Worth tracking.

Sources: Cisco Crosswork, NVIDIA Blog, Simon Willison, Google Cloud Blog

№ 08·Watch Today

Watch Today

NANOG 97 Call for Presentations closes May 15 — if you have a network automation, AI fabric, or SRv6 deployment story, the submission window closes in two days
SRv6 uSID convergence arc — Arista (scale-across framing), SONiC 202505 (static uSID via SDN controller), and MRC (source-routing layer) all pointed at SRv6 uSID as the AI cluster interconnect routing mechanism within the last week; watch for IETF SPRING working group activity on production deployment patterns
Spillway arXiv peer review — the paper (arXiv 2605.11852) is strong but not yet peer reviewed; a journal submission would significantly strengthen the 14% iteration time claim

Domains researched: automation, ai-ml, networking, datacenter, science, security (no architectural updates this cycle — Patch Tuesday CVE content excluded per editorial policy) | RSS digest: 79 articles scored, top score 11.1 (Arista AI Fabrics) | Items published: 16 primary + 4 quick takes | Quality score: 4.5/5

AI Agents Get a Data Foundation — Infrahub, Dynamo, and Spillway
