Morning Briefing · Saturday, April 25, 2026

Ansible vs Nornir vs Custom Python — Choosing Your Automation Layer

network-automation
Figure (Plate I, leaf-spine): Schematic leaf-spine fabric. Explicit-path traffic flows across the spine plane, pods at the edges.
Top Highlights
№ 01·Top Highlights

Why This Matters Now

Network automation tooling is at an inflection point. For most of the last decade, the answer to "what should I use?" was "Ansible, probably" — and that was defensible. Ansible's agentless architecture, its massive module library, and its low barrier to entry made it the safe default. But the landscape has shifted underneath that assumption, and the shifts are accelerating.

Two things are happening simultaneously. First, Ivan Pepelnjak documented in December 2025 that a core piece of Ansible's network configuration modules (the src parameter on modules like arista.eos.eos_config and cisco.ios.ios_config) has been broken since Ansible 12, the community release that bundles ansible-core 2.19, and remained broken through 13.1. The bug silently skips Jinja2 template rendering and reports success. Nobody caught it for four releases because, as Pepelnjak bluntly put it, "it's evident nobody is testing even the most common network configuration modules." Red Hat's investment is flowing into Ansible Automation Platform (the commercial product), not the open-source networking layer. Vendor collections for Nokia gRPC haven't been touched since 2021. The signal is clear.

Second, teams that moved to Python-native frameworks — Nornir in particular — are reporting dramatically better outcomes at scale. The architectural advantage isn't just speed (though it's that too). It's that Python-native automation plugs cleanly into the broader ecosystem: Pydantic for validation, Batfish for pre-deployment testing, NetBox and Nautobot as live inventory sources, asyncio for truly concurrent execution. When your automation logic lives in real Python, not Ansible's YAML DSL, you can test it, type-check it, benchmark it, and refactor it like any other software.

The question for 2026 isn't "should I automate?" It's "have I chosen the right layer for what I'm actually building?"

Sources: Ivan Pepelnjak (ipSpace.net), Network to Code



The Fundamentals

Ansible — The Declarative Foundation

Ansible launched in 2012, was acquired by Red Hat in 2015, and became the dominant network automation tool by roughly 2018. Its core bet was: you shouldn't need to write code to automate infrastructure. You write YAML playbooks describing desired state. Ansible figures out the execution. The agentless SSH model meant no client software on network devices, which mattered enormously for network teams who couldn't install agents on Cisco IOS boxes.

The networking module ecosystem grew quickly. By 2020, you had certified content for Arista, Cisco IOS/IOS-XE/NX-OS, Juniper Junos, Nokia SR OS, and many others. The ansible.netcommon collection provided shared utilities. ansible-lint enforced style and correctness. Red Hat built the commercial Ansible Automation Platform (AAP) with a web UI, RBAC, and scheduling on top.

The mental model is declarative and task-oriented: "run these tasks against these hosts in this order." Variables come from inventory (static files, dynamic plugins, or source-of-truth integrations). Templates are Jinja2. Logic is expressed through when:, loop:, block:, and rescue: constructs — which work fine for simple conditionals but become painful for complex branching logic.

Where Ansible still wins:

  • Teams with mixed Python/non-Python backgrounds where YAML is the lowest common denominator
  • Change execution workflows where the declarative model matches the operational model ("push this config to these devices")
  • Ansible Automation Platform shops that need RBAC, scheduling, audit trails, and approval workflows in a commercial package
  • Existing playbook libraries that work and don't need rewriting
  • Event-Driven Ansible (EDA) for reactive automation — a genuine architectural addition that connects event sources (Zabbix alerts, NetBox webhooks, Kafka streams) to remediation playbooks

Where Ansible struggles:

  • Complex branching logic: expressing "if device is IOS-XE and software version is earlier than 17.6 and interface count is greater than 48" in YAML is awkward
  • Performance at scale: Ansible's fork-based parallelism and the JSON serialization overhead between tasks add up. On 500+ device inventories, run times balloon
  • Data manipulation: transforming and validating complex nested data structures in Jinja2 is a maintenance nightmare
  • Testing: Ansible playbooks are not natively unit-testable. Molecule helps, but it's bolted on
  • The broken modules problem: as of April 2026, if your workflow depends on src-based Jinja2 template rendering in *_config modules, you are on broken ground
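For contrast, the branching condition from the first bullet reads as ordinary code in Python. This is a minimal sketch: the Device class, the "ios-xe" platform label, and the function names are illustrative assumptions, not part of any library.

```python
from dataclasses import dataclass


@dataclass
class Device:
    platform: str          # hypothetical platform label, e.g. "ios-xe"
    version: str           # dotted software version such as "17.3.4"
    interface_count: int


def parse_version(version: str) -> tuple[int, ...]:
    """Turn '17.3.4' into (17, 3, 4) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def matches_rule(device: Device) -> bool:
    """True for IOS-XE devices older than 17.6 with more than 48 interfaces."""
    return (
        device.platform == "ios-xe"
        and parse_version(device.version) < (17, 6)
        and device.interface_count > 48
    )
```

Comparing parsed tuples rather than strings matters here: as plain strings, "17.10.1" sorts before "17.6", which would misclassify newer releases.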

Nornir — The Python-Native Framework

Nornir emerged from the Network to Code ecosystem around 2017-2018 as a direct response to Ansible's limitations. The premise: if your team knows Python, stop fighting the DSL. Write automation logic in Python. Get all of Python's ecosystem for free.

Architecturally, Nornir has three core concepts. First, the inventory — structured data about your devices, organized into hosts, groups, and defaults, stored in YAML files or pulled from any Python-accessible source. Second, tasks — Python functions that accept a Task object and return a Result. Third, the runner — controls how tasks are dispatched. The default threaded runner executes tasks across all matching hosts in parallel, with configurable concurrency via num_workers.
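The three concepts can be modeled in a few lines of standard-library Python. This is a toy stand-in that mirrors the shape of the pattern, not Nornir's actual API: Host, Result, show_version, and run are hypothetical names, and the task fabricates its output instead of connecting to a device.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    platform: str


@dataclass
class Result:
    host: str
    output: str
    failed: bool = False


def show_version(host: Host) -> Result:
    """A 'task': one host in, one result out.
    A real task would open a connection; this one fabricates a string."""
    return Result(host=host.name, output=f"{host.platform} reachable")


def run(task, hosts, num_workers=20):
    """A 'runner': dispatch the task to every host on a thread pool."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = list(pool.map(task, hosts))
    return {result.host: result for result in results}


# The 'inventory': structured data about devices.
inventory = [Host("leaf1", "eos"), Host("leaf2", "eos"), Host("spine1", "ios")]
results = run(show_version, inventory, num_workers=2)
```

The point of the separation is that the task knows nothing about concurrency and the runner knows nothing about devices, so either can be swapped or tested on its own.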

The plugin ecosystem fills the gaps: nornir-napalm for multi-vendor device interaction, nornir-netmiko for SSH-based operations, nornir-scrapli for high-performance SSH/NETCONF (the underlying scrapli library also offers asyncio drivers), nornir-utils for filtering and result display. Nornir 3.x (the current generation) is a clean redesign that split the monolithic package into core plus plugins, which improved maintainability substantially.

The performance reality: Networklore's benchmark showed Nornir completing tasks roughly 100x faster than Ansible on equivalent workloads — not because the underlying SSH connections are faster, but because Nornir's parallelism is genuine. No serialization overhead between tasks. No fork-per-play overhead. Pure Python threads hitting devices concurrently. On a 200-device inventory, that difference is the gap between a 40-minute maintenance window and a 25-second script.
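As a quick arithmetic check, the 40-minute versus 25-second comparison is consistent with the "roughly 100x" figure. The only inputs below are the numbers quoted above; the per-device average is derived, not measured.

```python
devices = 200
ansible_seconds = 40 * 60   # the 40-minute maintenance window
nornir_seconds = 25         # the 25-second script

speedup = ansible_seconds / nornir_seconds
avg_per_device = ansible_seconds / devices

print(f"speedup: {speedup:.0f}x")                               # speedup: 96x
print(f"slow-run average: {avg_per_device:.0f} s per device")   # slow-run average: 12 s per device
```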

Where Nornir wins:

  • Python-proficient teams who want automation logic that's testable, type-annotatable, and debuggable like normal software
  • Scale: 200+ devices where Ansible's run times become operationally unacceptable
  • Complex data transformation: parse device output, validate against Pydantic models, diff against NetBox, generate structured reports — all in clean Python
  • Integration with the Python ecosystem: Batfish pre-flight checks, custom validators, REST API calls mid-workflow, asyncio for truly concurrent operation
  • Building reusable automation libraries rather than one-off playbooks

Where Nornir struggles:

  • The learning curve is steeper. You need real Python skills, not just "I can read Python"
  • No built-in execution engine for scheduling, RBAC, or audit trails — you build or buy those separately
  • Smaller community than Ansible. Fewer pre-built modules. More "figure it out yourself" energy
  • No equivalent to Ansible Tower/AAP for non-technical operators who need a UI

Custom Python — The Full Spectrum

"Custom Python" is a spectrum, not a single choice. At one end, you have raw paramiko-based SSH scripting from 2010 that someone maintains out of fear. At the other end, you have a production-grade automation platform built on scrapli-async, Pydantic v2 models, FastAPI for an internal automation API, and a full test suite with pytest. Both are "custom Python." The question is how much infrastructure you're willing to build.

The key libraries:

Netmiko: Kirk Byers' library that makes SSH connections to network devices sane. Handles connection setup, prompt detection, command sending, and output capture across dozens of device types. Mature, reliable, broadly used. Think of it as the SSH Swiss Army knife.

NAPALM (Network Automation and Programmability Abstraction Layer with Multivendor support) — Provides a normalized interface across vendors. get_interfaces(), get_bgp_neighbors(), load_merge_candidate() work the same way whether you're talking to Arista EOS, Cisco IOS, or Juniper Junos. Critical for multi-vendor environments where you want abstraction.
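The value of a normalized interface is that calling code never branches on vendor. The sketch below illustrates the idea with two fake drivers; it is not NAPALM code. The driver classes, the simplified input formats, and down_interfaces are all hypothetical, and only the method name get_interfaces and the is_up key are borrowed from NAPALM's getter.

```python
from typing import Protocol


class Driver(Protocol):
    def get_interfaces(self) -> dict[str, dict]: ...


class FakeEosDriver:
    """Hypothetical driver parsing lines like 'Et1 connected'."""

    def __init__(self, raw: str) -> None:
        self.raw = raw

    def get_interfaces(self) -> dict[str, dict]:
        out = {}
        for line in self.raw.splitlines():
            name, status = line.split()
            out[name] = {"is_up": status == "connected"}
        return out


class FakeIosDriver:
    """Hypothetical driver parsing lines like 'Gi0/1 is up'."""

    def __init__(self, raw: str) -> None:
        self.raw = raw

    def get_interfaces(self) -> dict[str, dict]:
        out = {}
        for line in self.raw.splitlines():
            name, _, status = line.partition(" is ")
            out[name] = {"is_up": status == "up"}
        return out


def down_interfaces(driver: Driver) -> list[str]:
    """Vendor-agnostic logic: works with any object offering get_interfaces()."""
    return sorted(
        name for name, data in driver.get_interfaces().items() if not data["is_up"]
    )
```

All vendor-specific parsing lives inside the drivers, so a report like down_interfaces is written once and reused across platforms.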

Scrapli — The modern, async-native alternative to Netmiko. Built for performance. Supports asyncio, has a clean plugin model, and handles NETCONF natively. If you're building a new Python automation tool in 2026 and care about scale, start with scrapli.

pyATS/Genie: Cisco's test and automation framework. Heavy, Cisco-centric, but unmatched for structured output parsing (Genie's parsers cover hundreds of show command outputs) and stateful testing.

The argument for going fully custom: you own the entire stack. No framework assumptions. No version conflicts with Ansible's Python environment. No waiting for a collection maintainer to fix your bug. The cost: you build everything yourself, including the parts you probably shouldn't be spending time on.

Sources: Network to Code, Networklore, APNIC Blog, Kirk Byers (Netmiko)



The Practical Section

Decision Table

Dimension | Ansible | Nornir | Custom Python
Team Python skill required | Low | Medium-High | High
Performance at scale (500+ devices) | Poor | Excellent | Excellent
Pre-built vendor modules | Extensive (but some broken) | Plugin ecosystem | You build or assemble
Complex conditional logic | Painful | Natural | Natural
Testing/validation | Awkward (Molecule) | Standard pytest | Standard pytest
Commercial support | Yes (AAP) | Community only | Community only
Scheduling + RBAC + UI | Yes (AAP) | You build | You build
Data transformation | Painful (Jinja2 DSL) | Easy (Python) | Easy (Python)
Integration with NetBox/Nautobot | Good (dynamic inventory) | Excellent (Python API) | Excellent (Python API)
Audit trail | Built into AAP | Build it | Build it
Event-driven/reactive patterns | Yes (EDA) | You build | You build

When Each Tool Wins

Use Ansible when: You have a mixed-skill team and the automation needs are primarily configuration push and change execution. The declarative model maps cleanly to "push this template to these devices in this maintenance window." If you're in an Ansible Automation Platform shop, the scheduling, RBAC, and approval workflow are genuinely valuable for operational governance. EDA is a real advantage for reactive use cases: it wires Zabbix or NetBox change events to remediation playbooks without you writing your own event loop.

One caveat for 2026: do not rely on the src-parameter template workflow in *_config modules until Red Hat explicitly fixes and announces the fix. If you're currently using this pattern, validate immediately whether your playbooks are actually applying templates. The bug reports success while silently skipping the template render.
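One way to run that validation is to compare what the template should have produced against what the device is actually running, rather than trusting the task's reported status. The sketch below assumes you already have both texts as strings; how you fetch the running config is left out, and both function names are hypothetical.

```python
def missing_lines(intended: str, running: str) -> list[str]:
    """Return intended config lines that do not appear in the running config."""
    running_lines = {line.strip() for line in running.splitlines()}
    return [
        line.strip()
        for line in intended.splitlines()
        if line.strip() and line.strip() not in running_lines
    ]


def has_unrendered_jinja(text: str) -> bool:
    """Flag leftover Jinja2 delimiters, a hint that rendering was skipped."""
    return any(marker in text for marker in ("{{", "{%", "{#"))


intended = "hostname leaf1\nntp server 10.0.0.1\n"
running = "hostname leaf1\ninterface Ethernet1\n"
print(missing_lines(intended, running))   # ['ntp server 10.0.0.1']
```

The comparison is deliberately flat: it ignores configuration hierarchy, so a line such as "mtu 9214" under the wrong interface would still count as present. Treat it as a smoke test, not a full compliance check.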

Use Nornir when: Your team knows Python and you're operating at any meaningful scale. "Meaningful scale" in practice means: more than 50 devices where sequential execution is painful, or any workflow where you need complex data transformation, validation, or integration with other Python tools. The typical high-value pattern: use Nornir to gather state (via nornir-scrapli or nornir-napalm), transform results into Pydantic models, diff against NetBox/Nautobot as source of truth, generate structured reports or trigger remediations.
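The validate-then-diff step of that pattern might look like the sketch below. It swaps in standard-library dataclasses for Pydantic so the example runs without third-party packages; Pydantic would add type coercion and richer error reporting. InterfaceState, the MTU bounds, and diff_against_intent are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceState:
    name: str
    enabled: bool
    mtu: int

    def __post_init__(self) -> None:
        # Minimal validation; the 68..9216 range is an assumed sanity bound.
        if not self.name:
            raise ValueError("interface name must not be empty")
        if not 68 <= self.mtu <= 9216:
            raise ValueError(f"{self.name}: implausible MTU {self.mtu}")


def diff_against_intent(
    observed: list[InterfaceState], intended: list[InterfaceState]
) -> dict[str, list[str]]:
    """Map interface name to a list of human-readable drift descriptions."""
    seen = {iface.name: iface for iface in observed}
    drift: dict[str, list[str]] = {}
    for want in intended:
        have = seen.get(want.name)
        if have is None:
            drift[want.name] = ["missing on device"]
            continue
        problems = [
            f"{attr}: intended {getattr(want, attr)!r}, observed {getattr(have, attr)!r}"
            for attr in ("enabled", "mtu")
            if getattr(want, attr) != getattr(have, attr)
        ]
        if problems:
            drift[want.name] = problems
    return drift
```

Here "observed" would come from the device-facing task and "intended" from the source of truth; an empty result means the device matches intent for the fields checked.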

The hybrid pattern that the Network to Code community has been advocating — and that the April 14 coverage here confirmed is gaining real traction — is: Nornir for data collection and validation, Ansible for final configuration push. You get Nornir's Python flexibility and performance for the complex parts, Ansible's change execution model for the simple push at the end. This isn't compromise, it's layering.

Use custom Python when: You need something that frameworks don't give you, or you're building a product-grade automation platform that others will consume. Building an internal automation API? FastAPI plus scrapli-async plus Pydantic is a cleaner foundation than bolting Nornir or Ansible into a web service. Building network testing infrastructure? pyATS plus Batfish plus custom pytest fixtures. Building an automation library that other teams import? Pure Python with no framework assumptions.

The signal to go fully custom is usually one of two things: (1) you've outgrown what any framework assumes, or (2) you're building something that will become a product, not a set of scripts.

The Hybrid Architecture That Actually Works

Here is the reference pattern that handles roughly 80% of real-world network automation scenarios:

Layer 1 — Source of Truth: NetBox or Nautobot as the authoritative inventory. Nornir's inventory plugin pulls live data at runtime. No static YAML files that drift from reality.

Layer 2 — Validation: Nornir task runs read-only state collection against the network. Output parsed via NAPALM getters or Genie parsers. Results validated against Pydantic models. Diffs generated against expected state from source of truth.

Layer 3 — Pre-flight testing: Batfish analyzes proposed configuration changes for correctness before anything touches a device. This is the most important layer most teams skip.

Layer 4 — Change execution: Ansible playbooks push validated configurations. The declarative model works well here. Change records auto-generated. The execution layer is the one place Ansible's "task list" model maps cleanly to operational reality.

Layer 5 — Verification: Post-change Nornir run confirms state matches intent. Automated diff against pre-change snapshot. Human gets a structured report, not a wall of terminal output.
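The "automated diff against pre-change snapshot" in Layer 5 can be as simple as a unified diff over two text captures. A minimal sketch using the standard library's difflib follows; the function names are hypothetical and the snapshots are assumed to be plain strings you collected before and after the change.

```python
import difflib


def state_diff(before: str, after: str) -> list[str]:
    """Unified diff between pre-change and post-change snapshots."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="pre-change",
            tofile="post-change",
            lineterm="",
        )
    )


def changed_lines(diff: list[str]) -> list[str]:
    """Keep only added and removed lines, dropping file headers and hunk markers."""
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


before = "interface Ethernet1\n mtu 1500\n"
after = "interface Ethernet1\n mtu 9214\n"
print(changed_lines(state_diff(before, after)))   # ['- mtu 1500', '+ mtu 9214']
```

Feeding changed_lines into a report gives the reviewer exactly what moved, which is the structured output the layer calls for.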

This is not a theoretical architecture. It's the pattern Ivan Pepelnjak, Network to Code, and teams like the one building Nautobot have been converging on. The tools each occupy the layer where they have structural advantages.
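The essential property of the layered pattern is gating: nothing reaches a device unless every earlier layer passed. The sketch below models Layers 2 through 5 as ordered stages with trivial stand-in checks. Every name here is hypothetical, and the stage bodies are placeholders for real Nornir, Batfish, and Ansible calls.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ChangeRequest:
    device: str
    proposed_config: str
    log: list[str] = field(default_factory=list)


Stage = Callable[[ChangeRequest], bool]
pushed: list[str] = []   # stand-in for "configs that reached a device"


def validate(change: ChangeRequest) -> bool:    # Layer 2 stand-in
    return bool(change.proposed_config.strip())


def preflight(change: ChangeRequest) -> bool:   # Layer 3 stand-in for pre-flight analysis
    return "no router bgp" not in change.proposed_config


def execute(change: ChangeRequest) -> bool:     # Layer 4 stand-in for the config push
    pushed.append(change.device)
    return True


def verify(change: ChangeRequest) -> bool:      # Layer 5 stand-in for the post-change check
    return change.device in pushed


STAGES: list[tuple[str, Stage]] = [
    ("validate", validate),
    ("preflight", preflight),
    ("execute", execute),
    ("verify", verify),
]


def run_pipeline(change: ChangeRequest, stages=STAGES) -> bool:
    """Run stages in order and stop at the first failure."""
    for name, stage in stages:
        ok = stage(change)
        change.log.append(f"{name}: {'ok' if ok else 'FAILED'}")
        if not ok:
            return False
    return True
```

Because the log is attached to the change request, a failed pre-flight leaves a record of exactly where the change stopped, and the execute stage never runs.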

Migration Paths

From pure Ansible to hybrid: Identify the workflows where Ansible's limitations are actually costing you time. Usually it's data collection and transformation, not config push. Start by replacing Ansible facts gathering with Nornir. Keep the Ansible execution layer. Add Batfish for pre-flight validation. This is a three-month project for a motivated engineer, not a two-year migration.

From custom scripts to Nornir: If you have a collection of Netmiko scripts, the migration to Nornir is straightforward. Nornir's nornir-netmiko plugin uses Netmiko under the hood. Wrap existing connection logic in Nornir tasks, move inventory to structured YAML or a NetBox plugin, and let the default threaded runner handle dispatch. You get parallelism essentially for free.
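The "wrap existing logic in tasks" step is mostly an adapter exercise. The sketch below shows the shape of it with the standard library only: legacy_backup stands in for an existing per-device script function (it fabricates output rather than opening a Netmiko session), and as_task, the host records, and the documentation-range IP addresses are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def legacy_backup(ip: str, username: str) -> str:
    """Stand-in for an existing per-device script function."""
    if ip.startswith("192.0.2."):
        return f"! config of {ip} fetched as {username}"
    raise ConnectionError(f"{ip} unreachable")


def as_task(func):
    """Adapt func(ip, username) into a task that takes one host record
    and never raises, so one dead device cannot abort the whole run."""
    def task(host: dict) -> dict:
        try:
            output = func(host["ip"], host["username"])
            return {"host": host["name"], "failed": False, "result": output}
        except Exception as exc:
            return {"host": host["name"], "failed": True, "result": str(exc)}
    return task


hosts = [
    {"name": "leaf1", "ip": "192.0.2.11", "username": "automation"},
    {"name": "leaf2", "ip": "198.51.100.12", "username": "automation"},
]

with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(as_task(legacy_backup), hosts))
```

The legacy function is untouched; only the calling convention changes. That is why this kind of migration tends to be incremental rather than a rewrite.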

From Ansible to EDA: If your team is invested in Ansible but wants reactive/event-driven patterns, EDA is the path of least resistance. The Zabbix EDA integration, NetBox webhook triggers, and Kafka source plugins are all production-ready as of early 2026. This extends Ansible without requiring a Python rewrite.
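Conceptually, an event-driven setup is a set of rules that map incoming events to actions. EDA expresses this in YAML rulebooks; the Python sketch below only models the idea. The event fields, rule names, and playbook filenames are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]
    action: str   # hypothetical name of the playbook to run


def dispatch(event: dict, rules: list[Rule]) -> list[str]:
    """Return the actions of every rule whose condition matches the event."""
    return [rule.action for rule in rules if rule.condition(event)]


rules = [
    Rule(
        "interface down",
        lambda e: e.get("source") == "zabbix" and e.get("trigger") == "link_down",
        "remediate_interface.yml",
    ),
    Rule(
        "device updated",
        lambda e: e.get("source") == "netbox" and e.get("model") == "device",
        "resync_device.yml",
    ),
]

event = {"source": "zabbix", "trigger": "link_down", "host": "leaf1"}
print(dispatch(event, rules))   # ['remediate_interface.yml']
```

What EDA adds on top of this shape is the part that is tedious to build yourself: maintained source plugins and the event loop that feeds them.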

Sources: Ivan Pepelnjak (ipSpace.net), Network to Code, NetBox Labs, Nautobot



Where It's Going

AI as the New Abstraction Layer

The tools above are the execution layer. What's changing rapidly is the intent layer above them. Ansible Lightspeed (generally available for Red Hat subscribers as of early 2026) and IBM watsonx integration let engineers describe what they want in plain language and get YAML playbook suggestions in VS Code. This is not magic — the generated YAML still needs review and testing — but it meaningfully lowers the barrier for less experienced engineers.

More interestingly, Nornir's Python-native model turns out to be a better substrate for AI-assisted automation than Ansible's DSL. When an LLM generates automation logic, that logic is usually Python. Plugging LLM-generated Python tasks into a Nornir runner, validating them with Pydantic, and running them through Batfish before execution is a coherent safety architecture. Plugging LLM-generated YAML into Ansible's config module pipeline — especially given the current breakage — is not.
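One cheap first gate in such a safety architecture is a static check on generated code before it is ever imported or run. The sketch below uses the standard library's ast module; the allowlist, the forbidden-call list, and the function name are assumptions chosen for illustration. It is a lint step, not a sandbox, and does not replace review or testing.

```python
import ast

ALLOWED_IMPORTS = {"re", "json", "ipaddress"}   # hypothetical allowlist
FORBIDDEN_CALLS = {"eval", "exec", "compile", "__import__", "open"}


def review_generated_task(source: str) -> list[str]:
    """Return a list of policy violations found in generated Python source."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    problems: list[str] = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [(node.module or "").split(".")[0]]
        else:
            names = []
        problems += [
            f"import of {name!r} not allowed"
            for name in names
            if name not in ALLOWED_IMPORTS
        ]
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in FORBIDDEN_CALLS
        ):
            problems.append(f"call to {node.func.id}() not allowed")
    return problems
```

An empty list means the source passed this particular gate and can move on to human review and tests; anything else is rejected before execution.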

The Itential FlowAI and Cisco Agentic Workflows for Meraki announcements earlier this month point to the same direction: natural language intent maps to a governed workflow execution engine, with the automation framework (Ansible, Nornir, or custom) as the implementation layer underneath. The abstraction is moving up, not down.

Nornir's Async Future

Nornir 3.x's plugin model was the necessary prerequisite for what comes next: truly async execution. Scrapli, the library underneath nornir-scrapli, already supports asyncio. Teams at scale are starting to build Nornir-based automation that runs hundreds of concurrent device connections without the overhead of Python threads. For very large inventories (think service-provider scale, 2,000+ devices), async Nornir is meaningfully different from threaded Nornir, and the ecosystem is maturing to support it.
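The difference between threaded and async execution shows up in how concurrency is bounded. With asyncio, thousands of sessions are scheduled as coroutines and a semaphore caps how many are in flight at once. The sketch below simulates each device session with a short sleep; collect, collect_all, the host names, and the limit of 100 are illustrative assumptions, and no real connections are made.

```python
import asyncio


async def collect(host: str, limit: asyncio.Semaphore) -> tuple[str, str]:
    """Stand-in for one async device session; sleeps instead of connecting."""
    async with limit:
        await asyncio.sleep(0.01)
        return host, f"{host}: ok"


async def collect_all(hosts: list[str], max_sessions: int = 100) -> dict[str, str]:
    """Run all sessions concurrently, with at most max_sessions in flight."""
    limit = asyncio.Semaphore(max_sessions)
    pairs = await asyncio.gather(*(collect(host, limit) for host in hosts))
    return dict(pairs)


hosts = [f"pe{n}" for n in range(2000)]
results = asyncio.run(collect_all(hosts))
```

All 2,000 simulated sessions run on a single thread. The semaphore plays the role that num_workers plays for the threaded runner, without one OS thread per connection.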

The Ansible Fork Question

The Ansible community fragmentation question is real. Ansible AWX (the open-source Tower), EDA (open source), and the community ansible-core are one track. Ansible Automation Platform is the Red Hat commercial product. The network module crisis documented by Pepelnjak suggests the open-source networking track is effectively in maintenance mode — maybe less. Teams building new network automation in 2026 should not assume ansible.netcommon will be a stable foundation without testing every release.

The Tools That Will Matter in 24 Months

The tools to watch:

  • Nautobot's Nornir integration (nautobot-app-nornir): a direct pipeline from source of truth to Nornir execution. Note that dispatcher_mapping is being removed in the next major version; migrate to network_driver now.
  • pyGNMI and gNMIc for model-driven telemetry and configuration via gNMI — the path forward for vendors with mature OpenConfig/YANG support. This bypasses both Ansible and Nornir's SSH assumptions entirely.
  • Scrapli-async as a Netmiko replacement for teams building new automation at scale.
  • Batfish as CI/CD infrastructure, not just a testing tool — continuous pre-change validation in every automation pipeline.
  • LLM-assisted task generation in Nornir workflows — not as a replacement for engineers, but as an accelerator for boilerplate task writing. The pattern of "generate, review, test, commit" is already working for teams that have built the validation infrastructure.

Sources: Red Hat (Ansible Lightspeed), Itential, Cisco, Nautobot, Network to Code



Further Reading

  1. "Has Ansible Team Abandoned Network Automation?", Ivan Pepelnjak, ipSpace.net (December 2025). The canonical reference for the src parameter breakage and what it signals about Red Hat's investment in open-source network automation. Required reading before starting any new Ansible-based project.

  2. "Nornir vs. Ansible: How to Choose?", NetBox Labs Blog. Practical decision framework with specific selection criteria. Includes integration points with NetBox inventory for both tools.

  3. "Ansible vs Nornir: How Python Makes Automation Easier", CBT Nuggets. Good explanation of the architectural differences for engineers who are stronger on networking than Python.

  4. "Network Automation Tools Comparison — Ansible vs Terraform vs Python vs Nornir", Networkershome. Broader landscape view including where Terraform fits (it doesn't replace Ansible/Nornir for device-level automation, but the confusion is common).

  5. "Ansible vs Nornir Speed Challenge", Networklore. The benchmark that quantified the roughly 100x performance differential. Methodology matters for interpreting the number, but the directional finding is robust.

  6. "Automation Tools: Paramiko, Netmiko, NAPALM, Ansible, Nornir or...?", APNIC Blog (2023). Still the clearest explanation of the full Python networking library stack from first principles. Slightly dated but foundational.

  7. "Introduction to Event-Driven Ansible and Nautobot", Network to Code Blog. The practical EDA integration pattern that makes Ansible's reactive story concrete. Good if your team is Ansible-first and wants to extend rather than rewrite.

  8. "How to Use Nornir as a Python Alternative to Ansible", OneUptime Blog (March 2026). Recent walkthrough of migrating from Ansible thinking to Nornir thinking. Good for engineers making the transition.



Topic: Ansible vs Nornir vs custom Python, choosing the right automation layer
Edition: Saturday Deep Dive, April 25, 2026
Searches used: 5 (+ 2 deep-fetch fetches)
Quality score: 4/5. Strong research depth, concrete decision framework, actionable migration paths. Slight deduction: direct benchmark data for Nornir 3.x specifically in 2025-2026 was not available from primary sources; the 100x figure is community-reported and directionally reliable but not from a controlled recent study.
