Quality: 4/5
Tags: ai-ml, automation, networking, datacenter, science

Amaze Networks Intelligence Briefing — Tuesday, March 31, 2026


Top 3 Highlights

1. Amazon and OpenAI Build the Memory Layer for Agentic AI — on Amazon Bedrock

TL;DR: The Amazon-OpenAI partnership has delivered something more consequential than another model launch: a Stateful Runtime Environment running natively in Amazon Bedrock, designed to give AI agents persistent memory, tool state, and identity across multi-step workflows.

Key Points:

  • Announced February 27, the Stateful Runtime brings working context — memory, tool use, environment state, identity/permission boundaries — into a single AWS-native orchestration layer
  • AWS becomes the exclusive third-party cloud distribution provider for OpenAI Frontier models; Amazon is investing $50 billion total in OpenAI
  • OpenAI is committed to consuming approximately two gigawatts of Trainium capacity through AWS — the infrastructure implications are enormous
  • This is not a model story. It is an infrastructure story: whoever controls the stateful execution layer controls where agentic workloads run and how they scale
  • The runtime is expected to reach GA in the coming months; enterprise design patterns around it are already forming

Deep Dive:

The missing piece in every agentic AI deployment has been state. Individual model calls are stateless; building multi-step agents requires developers to manually stitch together memory, tool results, workflow position, and credential scope across separate API calls. The Bedrock Stateful Runtime makes this native — the platform carries context forward automatically, so an agent that makes a tool call today can pick up where it left off without a developer writing glue code.
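
A minimal sketch of the kind of state such a runtime would carry across calls (the field names here are hypothetical illustrations, not the Bedrock API):

```python
from dataclasses import dataclass, field

# Illustrative only: field names are hypothetical, not the Bedrock API.
@dataclass
class AgentSession:
    """State a stateful runtime carries across model and tool calls."""
    session_id: str
    memory: list[str] = field(default_factory=list)                # working memory
    tool_results: dict[str, object] = field(default_factory=dict)  # cached tool outputs
    workflow_step: int = 0                                         # position in workflow
    scopes: frozenset = frozenset()                                # identity/permission boundary

    def record_tool_call(self, tool: str, result: object) -> None:
        # Persist the result so a later step can resume without re-running the tool
        self.tool_results[tool] = result
        self.workflow_step += 1

# Without a stateful runtime, every field above is glue code the developer
# writes and persists manually between stateless API calls.
session = AgentSession("sess-1", scopes=frozenset({"read:tickets"}))
session.record_tool_call("lookup_ticket", {"id": 42, "status": "open"})
```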

The network and infrastructure implications are non-trivial. OpenAI committing to two gigawatts of Trainium capacity means AWS is building dedicated AI training infrastructure at gigawatt scale. That is a long-term bet that AWS training infrastructure will underpin OpenAI's next model generations — and that the hyperscaler and the AI lab are now deeply coupled on both the model and the compute sides.

For network engineers, the Stateful Runtime changes how agent traffic flows. Stateful sessions are long-lived, not transactional. They have different latency profiles, different session table sizing requirements, and they create persistent connectivity between agent orchestration layers and tool endpoints. This is the kind of architectural shift that shows up in network design three years later when everyone is wondering why their east-west segmentation policies don't account for long-lived AI agent sessions.
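
Little's law makes the session-table difference concrete; the traffic figures below are illustrative assumptions, not measurements:

```python
# Back-of-envelope session-table sizing via Little's law (L = lambda x W).
# All traffic figures are illustrative assumptions.
def concurrent_sessions(arrivals_per_s: float, mean_lifetime_s: float) -> float:
    return arrivals_per_s * mean_lifetime_s

# Transactional API traffic: 1,000 calls/s, each alive ~200 ms
transactional = concurrent_sessions(1_000, 0.2)   # 200 concurrent entries
# Agentic traffic: 50 new sessions/s, each a ~10-minute stateful workflow
agentic = concurrent_sessions(50, 600)            # 30,000 concurrent entries

# Same firewall or load balancer, ~150x the session-table occupancy
print(transactional, agentic)
```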

So What? The Amazon-OpenAI stateful runtime is where agentic AI gets its infrastructure plumbing — and network engineers who understand long-lived session management and east-west traffic flows will have an advantage designing the networks those agents run on.

Source: OpenAI Blog / AWS Blog — openai.com/index/introducing-the-stateful-runtime-environment-for-agents-in-amazon-bedrock/


2. Spacelift Intent: Natural Language Infrastructure Provisioning That Actually Ships to Production

TL;DR: Spacelift launched Spacelift Intent at KubeCon Europe in Amsterdam, making it the first open-source, agentic natural-language interface for cloud infrastructure provisioning — and unlike most "AI for IaC" pitches, it works on top of existing Terraform providers with full policy enforcement and audit trails intact.

Key Points:

  • Spacelift Intent lets developers request infrastructure in plain English ("spin up a dev environment for this branch") while DevOps and platform teams retain full visibility, policy control, and auditability
  • Built on existing OpenTofu/Terraform providers — no new abstractions, no HCL replacement; it generates the IaC and runs it through the same GitOps pipeline
  • Demonstrated live at KubeCon + CloudNativeCon Europe, March 23-26, Amsterdam
  • Explicitly designed to complement GitOps, not replace it: fast, low-ceremony provisioning for development workflows; GitOps remains the production system of record
  • Spacelift Intelligence (the broader AI suite) launched March 18, 2026

Deep Dive:

Every infrastructure-as-code platform has started stapling an AI chatbot onto its UI. Spacelift's approach is different in one important way: it does not try to replace the IaC workflow. It adds a natural language front door that generates valid Terraform, runs it through the existing policy engine, and commits the result back to the GitOps pipeline. The production record is still in Git. The policies still run. The audit trail is intact.

This matters for network automation shops because the adoption blocker for GitOps in network teams is not the philosophy — it is the ceremony. Writing HCL or YAML for a VLAN change or a BGP policy adjustment is overhead that discourages adoption. A natural language layer that generates the config and puts it into a pull request for review could break that logjam. The question is whether Spacelift extends Intent to network-specific providers — and given that Terraform already has providers for most major NOS platforms, the path is there.
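
The pattern can be sketched end to end. This is an illustration of the workflow, not Spacelift's implementation, and every function here is a hypothetical stand-in:

```python
# Sketch of the Intent-style pattern, not Spacelift's implementation:
# plain-English request -> generated IaC -> existing policy engine -> pull request.

def generate_iac(request: str) -> str:
    # Stand-in for the model call that turns plain English into HCL
    return 'resource "example_vlan" "dev" {\n  vlan_id = 142\n}'

def policy_check(hcl: str) -> bool:
    # Stand-in for the unchanged policy engine that still gates every change
    return "vlan_id" in hcl

def open_pull_request(hcl: str) -> str:
    # The generated config still lands in Git for human review; Git stays the record
    return f"PR opened for review:\n{hcl}"

hcl = generate_iac("spin up a dev VLAN for this branch")
assert policy_check(hcl), "policy engine rejected the change"
pr = open_pull_request(hcl)
```

The point of the sketch: the natural-language layer only replaces the typing, not the policy gate or the review step.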

The broader pattern at KubeCon this year was AI-as-platform-accelerator rather than AI-as-replacement. Teams are not replacing their GitOps pipelines with AI; they are using AI to reduce the friction of feeding those pipelines. For network engineers, that is the right framing: AI lowers the barrier to getting configs into Git, validation still happens with Batfish or pyATS, and humans still approve the pull request.

So What? If your network automation adoption is stuck because writing HCL or YAML for routine changes is too much overhead, Intent-style natural language provisioning is the pattern to watch — and it is shipping now, not in a roadmap.

Source: Spacelift Blog — spacelift.io/blog/introducing-spacelift-intelligence


3. NVIDIA Vera Rubin Goes to Orbit — Space-Grade AI Compute for Orbital Data Centers

TL;DR: NVIDIA announced the Vera Rubin Space-1 module on March 16, bringing space-hardened AI compute to orbital data centers with up to 25x the AI compute of an H100 GPU — and partners like Kepler Communications and Planet Labs are already designing missions around it.

Key Points:

  • The Space-1 module uses the Vera Rubin GPU in a radiation-hardened, solar-powered form factor designed for orbital operation
  • 25x the AI inference compute of the H100 GPU, enabling on-orbit processing of geospatial intelligence and autonomous space operations
  • Partners include Aetherflux, Axiom Space, Kepler Communications, Planet Labs, Sophia Space, and Starcloud
  • IGX Thor, Jetson Orin, and RTX PRO 6000 Blackwell Server Edition are available now for ground and near-orbit use; Space-1 module availability is to be announced
  • This is not theoretical: Planet Labs processes satellite imagery; real-time on-orbit AI inference means they process data before the satellite even downlinks it

Deep Dive:

The traditional pipeline for satellite data is: acquire, downlink, process on the ground. That pipeline has a fundamental bottleneck — you can only downlink what you can fit into the comm window, and the raw data volumes from modern Earth observation satellites are staggering. Inference at the edge — on the satellite itself — is the architectural answer to that problem. The Vera Rubin Space-1 module makes that feasible by putting data-center-class AI compute into a form factor that survives orbit.

The networking angle here is real and underappreciated. Orbital data centers change the topology of where compute lives. If inference happens in orbit, the data that flows back to Earth is processed, structured, and actionable — not raw. That reshapes what the ground network needs to handle: lower bandwidth for downlinks, but richer, lower-latency results. Kepler Communications, one of the launch partners, operates a radio-frequency relay satellite constellation, essentially a data transport layer for other satellites. Putting AI compute on Kepler nodes changes them from dumb pipes to intelligent processing relays.
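
A rough illustration of the downlink math, with every figure assumed for the sake of the example:

```python
# Illustrative downlink math; every figure here is an assumption, not a spec.
raw_scene_bytes = 2 * 10**9        # ~2 GB raw multispectral scene
detections = 500                   # objects found by on-orbit inference
bytes_per_detection = 200          # structured record: class, bbox, geolocation

processed_bytes = detections * bytes_per_detection    # 100 KB per scene
reduction = raw_scene_bytes / processed_bytes         # 20,000x less to downlink

downlink_bps = 100 * 10**6         # assumed 100 Mbit/s comm window
raw_seconds = raw_scene_bytes * 8 / downlink_bps          # 160 s per raw scene
processed_seconds = processed_bytes * 8 / downlink_bps    # ~8 ms per processed scene
print(f"{reduction:,.0f}x reduction; {raw_seconds:.0f} s vs {processed_seconds*1000:.0f} ms")
```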

NVIDIA is building the same playbook at every tier — data center with Blackwell, edge with Jetson, automotive with DRIVE, and now space with Space-1. The question is whether an NVIDIA-dominated AI compute stack from ground to orbit concentrates too much supply chain risk in one vendor. For now, the fact that NVIDIA is shipping 25x H100 performance for orbital deployment is impressive regardless of the competitive dynamics.

So What? On-orbit inference will change satellite data delivery architecture — network engineers working on ground station design, satellite communication infrastructure, or geospatial data pipelines should start thinking about how processed-results-downlink differs from raw-data-downlink in terms of throughput, latency, and protocol requirements.

Source: NVIDIA Newsroom — nvidianews.nvidia.com/news/space-computing, Tom's Hardware analysis


Networking

BGP and EVPN: Still the Fabric, Still Maturing

Enterprise SONiC EVPN-VXLAN continues to see deployment writeups. STORDIS published a detailed technical post this week on EVPN Multihoming — specifically the ESI (Ethernet Segment Identifier) and uplink tracking mechanics that replace MCLAG in modern SONiC deployments. The key takeaway: EVPN Multihoming entered Enterprise SONiC at version 4.2 as a standards-based alternative to MCLAG and is production-ready, but the operational model for ESI configuration and uplink failure handling requires understanding the BGP EVPN control plane at a deeper level than MCLAG did.
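
For reference, the LACP-based auto-derivation of a Type-1 ESI (RFC 7432, Section 5) can be sketched in a few lines; the colon-separated output formatting is illustrative:

```python
# Sketch of RFC 7432 Type-1 ESI auto-derivation (LACP-based), the mechanism
# EVPN Multihoming uses where MCLAG relied on a shared domain ID.
def type1_esi(lacp_system_mac: str, port_key: int) -> str:
    """10-byte ESI: type 0x01, 6-byte LACP system MAC, 2-byte port key, 0x00."""
    mac_bytes = bytes(int(b, 16) for b in lacp_system_mac.split(":"))
    esi = bytes([0x01]) + mac_bytes + port_key.to_bytes(2, "big") + bytes([0x00])
    return ":".join(f"{b:02x}" for b in esi)

# Both leaves attached to the same Ethernet Segment derive the same ESI from
# shared LACP parameters, which is what lets BGP EVPN route types 1 and 4
# coordinate designated-forwarder election and fast failover.
print(type1_esi("00:11:22:33:44:55", 0x0001))
# -> 01:00:11:22:33:44:55:00:01:00
```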

For Cody's work context specifically: Dell SONiC is in this family. The EVPN Multihoming + ESI pattern is directly relevant to the Border Leaf architecture work, and the STORDIS documentation on CLOS with Enterprise SONiC is worth bookmarking as a reference for the S-series and Z-series switch deployments.

Actionable takeaway: If you are running MCLAG today for access redundancy and considering SONiC, the EVPN Multihoming path is the right migration target — read the STORDIS writeup on ESI before you spec the migration.


AI and Machine Learning

Gemini 3 Deep Think Hits 84.6% on ARC-AGI-2

Google's updated Deep Think reasoning mode for Gemini 3 set a new benchmark high of 84.6 percent on ARC-AGI-2 — the hardest publicly available benchmark for AI reasoning, verified by the ARC Prize Foundation. It also scored at gold-medal level on the 2025 International Physics Olympiad and Math Olympiad written sections, and reached 48.4 percent on Humanity's Last Exam (without tools), the highest public score to date.

Early API access is now open to select researchers, engineers, and enterprises via the Gemini API. This matters because the gap between what is possible with reasoning-optimized models and what most enterprise teams are actually deploying has never been wider — most production AI deployments are still using models that are three or four generations behind the current frontier in reasoning capability.

Actionable takeaway: If your team is using AI for network troubleshooting, config analysis, or RCA, the reasoning gap between standard chat models and reasoning-mode models is large enough to justify testing Deep Think or comparable models (o3, Claude Sonnet with extended thinking) on your specific use cases — the improvement on multi-step problems is often dramatic.

Qwen 3.5-397B Multimodal MoE: Open Weights, One-Million-Token Context

Alibaba's Qwen 3.5 flagship model — 397 billion total parameters, 17 billion active per forward pass — extends the open-weights frontier in two important directions: native multimodality (text, images, and video through early fusion, with support for 201 languages) and a context window of up to one million tokens.

For on-premises AI deployment, the MoE architecture means per-token compute is manageable with modern GPU clusters. With 17 billion active parameters per forward pass, inference cost is similar to running a mid-size dense model, even though all 397 billion weights still have to be held in (possibly distributed) memory. The practical implication: shops that have ruled out running open-weights models due to compute cost should revisit that calculation with MoE architectures.
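
Using the common approximation of roughly 2 FLOPs per active parameter per token, the compute savings can be estimated directly:

```python
# Rough per-token compute comparison using the common ~2 FLOPs per active
# parameter per token approximation; parameter counts from the story above.
def flops_per_token(active_params: float) -> float:
    return 2 * active_params

dense_397b = flops_per_token(397e9)   # a hypothetical dense model of equal size
moe_qwen = flops_per_token(17e9)      # Qwen 3.5: 17B active of 397B total

print(f"MoE needs ~{dense_397b / moe_qwen:.0f}x less compute per token")
# Caveat: all 397B weights still occupy (possibly distributed) memory;
# MoE cuts per-token compute and latency, not the weight footprint.
```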

Actionable takeaway: If your organization has on-premises GPU capacity and has been waiting for an open-weights model worth deploying, Qwen 3.5-397B and Mistral Large 3-675B are both worth a serious evaluation — the open-weights frontier has caught up significantly with proprietary models for most enterprise workloads.

Mistral Large 3-675B: 92% on HumanEval, 3x the Throughput

Mistral's Large 3 model (675 billion total parameters, 41 billion active) is now showing updated March benchmarks: 92 percent pass@1 on HumanEval for Python, competitive SWE-Bench results, and a 40 percent reduction in end-to-end latency in optimized deployments compared to Mistral Small 3. For network automation specifically, a 92 percent HumanEval score means this is a model worth testing for Python script generation, Jinja2 template construction, and Nornir task development.
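
For context, HumanEval-style pass@1 figures are conventionally computed with the unbiased estimator introduced alongside the benchmark:

```python
from math import comb

# The standard unbiased pass@k estimator used for HumanEval-style scores:
# generate n samples per problem, count c correct, estimate pass@k.
def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:
        return 1.0  # fewer wrong samples than k draws: guaranteed success
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(10, 9, 1))  # 0.9: 9 of 10 samples passing implies pass@1 of 0.9
```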


Datacenter

Oracle Stargate Abilene: 1.2 Gigawatts, 450,000 GPUs

The scale of Oracle's Stargate I campus in Abilene, Texas — with the first buildings operational since September 2025 and six more slated for mid-2026 completion — represents the most concrete physical manifestation of the $700 billion hyperscaler capex wave. 1.2 gigawatts of capacity, 450,000 NVIDIA GB200 GPUs, and six buildings still under construction make it among the largest single AI compute deployments ever planned.

The power density at Abilene is significant: individual racks are now running over 100 kilowatts, requiring liquid cooling as a default. Microsoft's Fairwater campus in Atlanta uses closed-loop liquid cooling that eliminates operational water consumption entirely — a design choice that is becoming the standard for new AI datacenter construction, not just an option.
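
The headline figures imply some useful back-of-envelope numbers (crude division, ignoring build-out phasing):

```python
# Back-of-envelope campus math from the figures above; rounding is illustrative.
campus_watts = 1.2e9       # 1.2 GW campus capacity
rack_watts = 100e3         # ~100 kW per liquid-cooled rack
gpus = 450_000             # planned GB200 count

racks = campus_watts / rack_watts        # ~12,000 racks at full build-out
watts_per_gpu = campus_watts / gpus      # ~2.7 kW per GPU, all overhead included
print(f"{racks:,.0f} racks, {watts_per_gpu:,.0f} W per GPU")
```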

Actionable takeaway: When you are next speccing a datacenter interconnect for an AI tenant, liquid cooling architecture changes the physical plant layout in ways that affect structured cabling runs, cooling infrastructure adjacency, and power distribution — start those conversations with facilities early.


Security

Stateful Agentic Sessions Create New Segmentation Challenges

The Amazon-OpenAI Bedrock Stateful Runtime (see lead story) is worth a separate security note. Long-lived agent sessions have a fundamentally different threat profile than transactional API calls. A stateful agent that retains memory and tool access across a workflow can accumulate permissions and context over time — and if that session is compromised, the blast radius is larger than a compromised individual API key.

The architectural lesson: zero-trust policy for agentic workloads needs to account for session lifecycle, not just per-request authentication. The same microsegmentation patterns that work for microservices — least-privilege, short-lived credentials, lateral movement constraints — apply to AI agent sessions, but the implementation needs to handle the temporal dimension of stateful context.
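
A minimal sketch of what scope-plus-lifetime enforcement looks like for an agent credential, with all names illustrative:

```python
import time

# Sketch of session-lifecycle-aware credentials for a long-lived agent:
# short-lived, scope-limited tokens that must be re-minted as the workflow
# runs, so a compromised session loses access at the next expiry.
class ScopedToken:
    def __init__(self, scopes: frozenset, ttl_s: float, now: float):
        self.scopes = scopes
        self.expires_at = now + ttl_s

    def allows(self, scope: str, now: float) -> bool:
        # Both checks matter: least privilege AND temporal validity
        return scope in self.scopes and now < self.expires_at

t0 = time.time()
token = ScopedToken(frozenset({"read:cmdb"}), ttl_s=300, now=t0)
assert token.allows("read:cmdb", now=t0)            # valid within TTL and scope
assert not token.allows("write:cmdb", now=t0)       # lateral scope denied
assert not token.allows("read:cmdb", now=t0 + 600)  # expired mid-workflow
```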

Actionable takeaway: If your organization is deploying or planning to deploy agentic AI workflows, add session lifecycle management to your zero-trust review — what happens when a long-lived agent session is compromised mid-workflow is not a question most current policies answer.


Science

Quantum Computing's Credibility Problem — and Why That Is Actually Good News

A paper published in late March 2026 and highlighted by ScienceDaily examined signals from earlier high-profile quantum computing claims and found they could be explained by simpler mechanisms — reinforcing a pattern the field is now grappling with seriously. The Quantum Computing Report notes this is happening alongside genuine breakthroughs: the DOE national quantum research centers confirmed a scalability milestone in February, QpiAI's hardware decoder cut error correction latency from 60 microseconds to 1.5 microseconds, and IBM achieved 98 percent logical-qubit fidelity on a 127-qubit processor.

The pattern here is field maturation, not collapse. The benchmarks and KPIs for what constitutes a real quantum breakthrough are getting stricter, and papers are being scrutinized more rigorously. That is how scientific fields grow up. The claims that survive this scrutiny are more meaningful because of it.

Actionable takeaway: When evaluating quantum computing vendor claims or research papers, look for independent replication and hardware-decoder evidence of real-time error correction — those are the two signals that distinguish genuine progress from noise right now.


Quick Takes

  • Spacelift at KubeCon: The broader Spacelift Intelligence suite (of which Intent is one component) also includes AI-powered change analysis, drift detection, and pull request summarization — worth watching if your team uses Spacelift for IaC orchestration.
  • Connectivity-as-code momentum: The Spacelift Intent story is part of a broader trend of AI-assisted IaC that is showing up across the stack — HashiCorp, Pulumi, and now Spacelift all have natural language or AI-assisted provisioning features either shipped or in preview.
  • MoE is the architecture: Three of the biggest open-weights model releases in Q1 2026 — Qwen 3.5, Mistral Large 3, and Meta's Llama 4 Scout — all use Mixture of Experts. The era of dense parameter scaling is over; active parameter efficiency is the new competition axis.
  • Hyperscaler capex reality check: The $700 billion figure for 2026 datacenter capex is beginning to show up in power grid capacity requests — multiple utilities have reported record interconnection queue volumes from hyperscaler customers in Q1 2026.

Watch Today

  • Oracle Stargate Abilene — second-quarter 2026 building completions are the next concrete milestone in the hyperscaler AI build-out; watch for power grid approval updates from Texas ERCOT
  • Amazon-OpenAI Bedrock Stateful Runtime — GA date still TBD; enterprise architecture patterns are forming now; worth reading the design documentation when it publishes
  • KubeCon Europe 2026 afterglow — session recordings from March 23-26 Amsterdam are starting to hit YouTube; the GitOps and platform engineering tracks had strong coverage of AI-assisted infrastructure
  • Qwen 3.5 on-premises deployments — early adopter benchmarks on specific hardware configs (especially multi-GPU inference) are starting to appear; the MoE efficiency claims deserve real-world validation

Pipeline Stats

  • Research method: Direct web search (no RSS digest for 2026-03-31)
  • Domains covered: AI/ML, Automation/IaC, Networking, Datacenter, Science/Quantum
  • Items researched: 12 major stories
  • Items published: 10 (2 deduplication-filtered — Gemini 3.1 Pro benchmark covered 2026-03-27, Broadcom/UEC covered 2026-03-30)
  • Quality score: 4/5
  • Dedup status: Clean — all items outside 72-hour cooldown window