Ansible vs Nornir vs Custom Python — Choosing Your Automation Layer
Why This Matters Now
Two things are happening simultaneously. First, Ivan Pepelnjak documented in December 2025 that Ansible's core network configuration modules — the src parameter on modules like arista.eos.eos_config and cisco.ios.ios_config — have been broken since Ansible core 12 and remained broken through core 13.1. The bug silently skips Jinja2 template rendering and reports success. Nobody caught it for four releases because, as Pepelnjak bluntly put it, "it's evident nobody is testing even the most common network configuration modules." Red Hat's investment is flowing into Ansible Automation Platform (the commercial product), not the open-source networking layer. Vendor collections for Nokia gRPC haven't been touched since 2021. The signal is clear.
Second, teams that moved to Python-native frameworks — Nornir in particular — are reporting dramatically better outcomes at scale. The architectural advantage isn't just speed (though it's that too). It's that Python-native automation plugs cleanly into the broader ecosystem: Pydantic for validation, Batfish for pre-deployment testing, NetBox and Nautobot as live inventory sources, asyncio for truly concurrent execution. When your automation logic lives in real Python, not Ansible's YAML DSL, you can test it, type-check it, benchmark it, and refactor it like any other software.
The question for 2026 isn't "should I automate?" It's "have I chosen the right layer for what I'm actually building?"
Sources: Ivan Pepelnjak (ipSpace.net), Network to Code
The Fundamentals
Ansible — The Declarative Foundation
Ansible launched in 2012, was acquired by Red Hat in 2015, and became the dominant network automation tool by roughly 2018. Its core bet was: you shouldn't need to write code to automate infrastructure. You write YAML playbooks describing desired state. Ansible figures out the execution. The agentless SSH model meant no client software on network devices, which mattered enormously for network teams who couldn't install agents on Cisco IOS boxes.
The networking module ecosystem grew quickly. By 2020, you had certified content for Arista, Cisco IOS/IOS-XE/NX-OS, Juniper Junos, Nokia SR OS, and many others. The ansible.netcommon collection provided shared utilities. ansible-lint enforced style and correctness. Red Hat built the commercial Ansible Automation Platform (AAP) with a web UI, RBAC, and scheduling on top.
The mental model is declarative and task-oriented: "run these tasks against these hosts in this order." Variables come from inventory (static files, dynamic plugins, or source-of-truth integrations). Templates are Jinja2. Logic is expressed through when:, loop:, block:, and rescue: constructs — which work fine for simple conditionals but become painful for complex branching logic.
Where Ansible still wins:
- Teams with mixed Python/non-Python backgrounds where YAML is the lowest common denominator
- Change execution workflows where the declarative model matches the operational model ("push this config to these devices")
- Ansible Automation Platform shops that need RBAC, scheduling, audit trails, and approval workflows in a commercial package
- Existing playbook libraries that work and don't need rewriting
- Event-Driven Ansible (EDA) for reactive automation — a genuine architectural addition that connects event sources (Zabbix alerts, NetBox webhooks, Kafka streams) to remediation playbooks
Where Ansible struggles:
- Complex branching logic: expressing "if device is IOS-XE and software version is earlier than 17.6 and interface count is greater than 48" in YAML is awkward
- Performance at scale: Ansible's fork-based parallelism and JSON serialization overhead between tasks add up. On 500+ device inventories, run times balloon
- Data manipulation: transforming and validating complex nested data structures in Jinja2 is a maintenance nightmare
- Testing: Ansible playbooks are not natively unit-testable. Molecule helps, but it's bolted on
- The broken modules problem: as of April 2026, if your workflow depends on src-based Jinja2 template rendering in *_config modules, you are on broken ground
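For contrast, the branching condition from the list above takes only a few lines once the logic lives in ordinary Python. This is a minimal sketch, assuming a hypothetical `Device` record; the field names (`platform`, `version`, `interface_count`) are illustrative and not taken from any particular inventory schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical device record; field names are illustrative."""
    name: str
    platform: str          # e.g. "ios-xe"
    version: str           # e.g. "17.3.4"
    interface_count: int


def version_tuple(version: str) -> tuple[int, ...]:
    """Turn '17.3.4' into (17, 3, 4) so versions compare numerically."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def needs_special_handling(device: Device) -> bool:
    """IOS-XE, software earlier than 17.6, and more than 48 interfaces."""
    return (
        device.platform == "ios-xe"
        and version_tuple(device.version) < (17, 6)
        and device.interface_count > 48
    )


old_dense = Device("edge1", "ios-xe", "17.3.4", 52)
new_dense = Device("edge2", "ios-xe", "17.10.1", 52)
print(needs_special_handling(old_dense))  # True
print(needs_special_handling(new_dense))  # False
```

Comparing tuples of integers rather than raw strings is the design choice that matters here: as strings, "17.10.1" would sort before "17.6".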
Nornir — The Python-Native Framework
Nornir emerged from the Network to Code ecosystem around 2017-2018 as a direct response to Ansible's limitations. The premise: if your team knows Python, stop fighting the DSL. Write automation logic in Python. Get all of Python's ecosystem for free.
Architecturally, Nornir has three core concepts. First, the inventory — structured data about your devices, organized into hosts, groups, and defaults, stored in YAML files or pulled from any Python-accessible source. Second, tasks — Python functions that accept a Task object and return a Result. Third, the runner — controls how tasks are dispatched. The default threaded runner executes tasks across all matching hosts in parallel, with configurable concurrency via num_workers.
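The three concepts can be illustrated without Nornir itself. What follows is a standard-library sketch of the same split (inventory, task, runner), not Nornir's actual API: `resolve_host`, `describe_connection`, and `run` are hypothetical names, the addresses are made up, and the task only formats inventory data rather than connecting to a device.

```python
from concurrent.futures import ThreadPoolExecutor

# Inventory: defaults < groups < hosts, later levels override earlier ones.
DEFAULTS = {"username": "automation", "port": 22}
GROUPS = {"eos": {"platform": "eos"}, "ios": {"platform": "ios", "port": 2222}}
HOSTS = {
    "leaf1": {"hostname": "10.0.0.1", "groups": ["eos"]},
    "rtr1": {"hostname": "10.0.0.2", "groups": ["ios"], "username": "admin"},
}


def resolve_host(name: str) -> dict:
    """Merge defaults, group data, and host data into one flat record."""
    host = HOSTS[name]
    merged = dict(DEFAULTS)
    for group in host.get("groups", []):
        merged.update(GROUPS[group])
    merged.update({k: v for k, v in host.items() if k != "groups"})
    merged["name"] = name
    return merged


def describe_connection(host: dict) -> dict:
    """Task: a plain function that takes one host and returns a result."""
    target = f"{host['username']}@{host['hostname']}:{host['port']}"
    return {"host": host["name"], "failed": False, "result": target}


def run(task, names, num_workers: int = 20) -> dict:
    """Runner: dispatch the task to every host on a thread pool."""
    hosts = [resolve_host(name) for name in names]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = list(pool.map(task, hosts))
    return {r["host"]: r for r in results}


results = run(describe_connection, HOSTS)
for name, outcome in sorted(results.items()):
    print(name, outcome["result"])
```

In Nornir proper, these roles are filled by the inventory plugin, a function that accepts a Task and returns a Result, and the runner configured with num_workers.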
The plugin ecosystem fills the gaps: nornir-napalm for multi-vendor device interaction, nornir-netmiko for SSH-based operations, nornir-scrapli for high-performance async SSH/NETCONF, nornir-utils for filtering and result display. Nornir 3.x (the current generation) is a clean redesign that split the monolithic package into core plus plugins, which improved maintainability substantially.
The performance reality: Networklore's benchmark showed Nornir completing tasks roughly 100x faster than Ansible on equivalent workloads — not because the underlying SSH connections are faster, but because Nornir's parallelism is genuine. No serialization overhead between tasks. No fork-per-play overhead. Pure Python threads hitting devices concurrently. On a 200-device inventory, that difference is the gap between a 40-minute maintenance window and a 25-second script.
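The arithmetic behind a gap of that size is worth making explicit. This is a back-of-envelope model, assuming every device takes the same time and ignoring all framework overhead; the 12-second per-device figure and the worker counts are illustrative assumptions, not numbers from the benchmark.

```python
import math


def estimated_runtime(devices: int, seconds_per_device: float, workers: int) -> float:
    """Wall-clock estimate when devices are processed in waves of `workers`."""
    waves = math.ceil(devices / workers)
    return waves * seconds_per_device


one_at_a_time = estimated_runtime(200, 12, workers=1)
hundred_workers = estimated_runtime(200, 12, workers=100)

print(one_at_a_time / 60)               # 40.0 (minutes)
print(hundred_workers)                  # 24 (seconds)
print(one_at_a_time / hundred_workers)  # 100.0
```

The model only shows how concurrency alone can account for two orders of magnitude. Real Ansible runs are not fully serial (the number of forks is configurable), and they carry per-task overhead that this sketch leaves out.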
Where Nornir wins:
- Python-proficient teams who want automation logic that's testable, type-annotatable, and debuggable like normal software
- Scale: 200+ devices where Ansible's run times become operationally unacceptable
- Complex data transformation: parse device output, validate against Pydantic models, diff against NetBox, generate structured reports — all in clean Python
- Integration with the Python ecosystem: Batfish pre-flight checks, custom validators, REST API calls mid-workflow, asyncio for truly concurrent operation
- Building reusable automation libraries rather than one-off playbooks
Where Nornir struggles:
- The learning curve is steeper. You need real Python skills, not just "I can read Python"
- No built-in execution engine for scheduling, RBAC, or audit trails — you build or buy those separately
- Smaller community than Ansible. Fewer pre-built modules. More "figure it out yourself" energy
- No equivalent to Ansible Tower/AAP for non-technical operators who need a UI
Custom Python — The Full Spectrum
"Custom Python" is a spectrum, not a single choice. At one end, you have raw paramiko-based SSH scripting from 2010 that someone maintains out of fear. At the other end, you have a production-grade automation platform built on scrapli-async, Pydantic v2 models, FastAPI for an internal automation API, and a full test suite with pytest. Both are "custom Python." The question is how much infrastructure you're willing to build.
The key libraries:
Netmiko — Kirk Byers' library that makes SSH connections to network devices sane. Handles connection setup, prompt detection, command sending, and output capture across dozens of device types. Mature, reliable, broadly used. Think of it as the SSH swiss army knife.
NAPALM (Network Automation and Programmability Abstraction Layer with Multivendor support) — Provides a normalized interface across vendors. get_interfaces(), get_bgp_neighbors(), load_merge_candidate() work the same way whether you're talking to Arista EOS, Cisco IOS, or Juniper Junos. Critical for multi-vendor environments where you want abstraction.
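The value of that normalization is easier to see in miniature. The sketch below does not use NAPALM. It is a toy illustration of the same idea, with two hypothetical drivers (`FakeEosDriver`, `FakeIosDriver`) that parse invented output formats into one shared shape so the calling code never branches on vendor.

```python
from typing import Protocol


class InterfaceGetter(Protocol):
    """The one interface callers depend on, regardless of vendor."""
    def get_interfaces(self) -> dict[str, dict]: ...


class FakeEosDriver:
    """Hypothetical driver; the raw format is invented for illustration."""
    RAW = "Ethernet1 up\nEthernet2 down"

    def get_interfaces(self) -> dict[str, dict]:
        parsed = {}
        for line in self.RAW.splitlines():
            name, state = line.split()
            parsed[name] = {"is_up": state == "up"}
        return parsed


class FakeIosDriver:
    """Hypothetical driver with a different invented raw format."""
    RAW = "Gi0/1,connected;Gi0/2,notconnect"

    def get_interfaces(self) -> dict[str, dict]:
        parsed = {}
        for entry in self.RAW.split(";"):
            name, state = entry.split(",")
            parsed[name] = {"is_up": state == "connected"}
        return parsed


def down_interfaces(driver: InterfaceGetter) -> list[str]:
    """Vendor-neutral logic: works with any driver that fits the protocol."""
    return sorted(n for n, d in driver.get_interfaces().items() if not d["is_up"])


print(down_interfaces(FakeEosDriver()))  # ['Ethernet2']
print(down_interfaces(FakeIosDriver()))  # ['Gi0/2']
```

All vendor-specific parsing stays inside the drivers, so `down_interfaces` would not need to change if a third platform were added.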
Scrapli — The modern, async-native alternative to Netmiko. Built for performance. Supports asyncio, has a clean plugin model, and handles NETCONF natively. If you're building a new Python automation tool in 2026 and care about scale, start with scrapli.
PyATS/Genie — Cisco's test and automation framework. Heavy, Cisco-centric, but unmatched for structured output parsing (Genie's parsers cover hundreds of show command outputs) and stateful testing.
The argument for going fully custom: you own the entire stack. No framework assumptions. No version conflicts with Ansible's Python environment. No waiting for a collection maintainer to fix your bug. The cost: you build everything yourself, including the parts you probably shouldn't be spending time on.
Sources: Network to Code, Networklore, APNIC Blog, Kirk Byers (Netmiko)
The Practical Section
Decision Table
| Dimension | Ansible | Nornir | Custom Python |
|---|---|---|---|
| Team Python skill required | Low | Medium-High | High |
| Performance at scale (500+ devices) | Poor | Excellent | Excellent |
| Pre-built vendor modules | Extensive (but some broken) | Plugin ecosystem | You build or assemble |
| Complex conditional logic | Painful | Natural | Natural |
| Testing/validation | Awkward (Molecule) | Standard pytest | Standard pytest |
| Commercial support | Yes (AAP) | Community only | Community only |
| Scheduling + RBAC + UI | Yes (AAP) | You build | You build |
| Data transformation | Painful (Jinja2 DSL) | Easy (Python) | Easy (Python) |
| Integration with NetBox/Nautobot | Good (dynamic inventory) | Excellent (Python API) | Excellent (Python API) |
| Audit trail | Built into AAP | Build it | Build it |
| Event-driven/reactive patterns | Yes (EDA) | You build | You build |
When Each Tool Wins
Use Ansible when: You have a mixed-skill team and the automation needs are primarily configuration push and change execution. The declarative model maps cleanly to "push this template to these devices in this maintenance window." If you're in an Ansible Automation Platform shop, the scheduling, RBAC, and approval workflow are genuinely valuable for operational governance. EDA is a real advantage for reactive use cases: wiring Zabbix or NetBox change events to remediation playbooks without writing your own event loop.
One caveat for 2026: do not rely on the src-parameter template workflow in *_config modules until Red Hat explicitly fixes and announces the fix. If you're currently using this pattern, validate immediately whether your playbooks are actually applying templates. The bug reports success while silently skipping the template render.
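One way to run that validation is sketched below. It assumes you already have the configuration text you expected to be applied and can retrieve the running configuration after the playbook run. The helper names are hypothetical, and the check looks for only two symptoms: raw Jinja2 delimiters that reached the device, and expected lines that never arrived. Which symptom appears, if any, depends on how the bug manifests in your environment.

```python
import re

# Jinja2's default delimiters: {{ ... }}, {% ... %}, {# ... #}
JINJA_MARKER = re.compile(r"\{\{|\}\}|\{%|%\}|\{#|#\}")


def unrendered_lines(config_text: str) -> list[str]:
    """Return lines that still contain raw Jinja2 delimiters."""
    return [line for line in config_text.splitlines() if JINJA_MARKER.search(line)]


def missing_lines(expected: str, running: str) -> list[str]:
    """Return expected config lines that are absent from the running config."""
    present = {line.strip() for line in running.splitlines()}
    return [
        line.strip()
        for line in expected.splitlines()
        if line.strip() and line.strip() not in present
    ]


# Hypothetical sample data for illustration only.
expected = "ntp server 192.0.2.10\nlogging host 192.0.2.20"
running = "hostname leaf1\nntp server {{ ntp_server }}\n"

print(unrendered_lines(running))
print(missing_lines(expected, running))
```

An empty result from both functions is the outcome you want. Anything else means the playbook's reported success should not be trusted for that device.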
Use Nornir when: Your team knows Python and you're operating at any meaningful scale. "Meaningful scale" in practice means: more than 50 devices where sequential execution is painful, or any workflow where you need complex data transformation, validation, or integration with other Python tools. The typical high-value pattern: use Nornir to gather state (via nornir-scrapli or nornir-napalm), transform results into Pydantic models, diff against NetBox/Nautobot as source of truth, generate structured reports or trigger remediations.
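A minimal sketch of that gather, validate, diff pattern follows. To stay dependency-free it uses a standard-library dataclass as a stand-in for a Pydantic model, and plain dictionaries in place of real Nornir results and NetBox/Nautobot API responses. All names, the MTU bounds, and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceState:
    """Stand-in for a Pydantic model: validates on construction."""
    name: str
    enabled: bool
    mtu: int

    def __post_init__(self):
        if not isinstance(self.enabled, bool):
            raise ValueError(f"{self.name}: enabled must be a bool")
        if not 68 <= self.mtu <= 9216:  # assumed bounds, adjust per platform
            raise ValueError(f"{self.name}: MTU {self.mtu} out of range")


def to_models(raw: dict[str, dict]) -> dict[str, InterfaceState]:
    """Convert raw per-interface dicts into validated models."""
    return {name: InterfaceState(name=name, **attrs) for name, attrs in raw.items()}


def diff_state(intended: dict, observed: dict) -> dict[str, str]:
    """Compare source-of-truth intent with observed device state."""
    report = {}
    for name in sorted(intended.keys() | observed.keys()):
        if name not in observed:
            report[name] = "missing on device"
        elif name not in intended:
            report[name] = "not in source of truth"
        elif intended[name] != observed[name]:
            report[name] = f"expected {intended[name]}, found {observed[name]}"
    return report


# Hypothetical data: what the source of truth says vs. what the device reports.
intended = to_models({"Ethernet1": {"enabled": True, "mtu": 9214},
                      "Ethernet2": {"enabled": True, "mtu": 1500}})
observed = to_models({"Ethernet1": {"enabled": True, "mtu": 1500},
                      "Ethernet3": {"enabled": False, "mtu": 1500}})

for interface, problem in diff_state(intended, observed).items():
    print(interface, "->", problem)
```

Validating at construction time means malformed device output fails loudly before it reaches the diff, which is the same role Pydantic would play in a real workflow.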
The hybrid pattern that the Network to Code community has been advocating — and that the April 14 coverage here confirmed is gaining real traction — is: Nornir for data collection and validation, Ansible for final configuration push. You get Nornir's Python flexibility and performance for the complex parts, Ansible's change execution model for the simple push at the end. This isn't compromise, it's layering.
Use custom Python when: You need something that frameworks don't give you, or you're building a product-grade automation platform that others will consume. Building an internal automation API? FastAPI plus scrapli-async plus Pydantic is a cleaner foundation than bolting Nornir or Ansible into a web service. Building network testing infrastructure? pyATS plus Batfish plus custom pytest fixtures. Building an automation library that other teams import? Pure Python with no framework assumptions.
The signal to go fully custom is usually one of two things: (1) you've outgrown what any framework assumes, or (2) you're building something that will become a product, not a set of scripts.
The Hybrid Architecture That Actually Works
Here is the reference pattern that handles roughly 80% of real-world network automation scenarios:
Layer 1 — Source of Truth: NetBox or Nautobot as the authoritative inventory. Nornir's inventory plugin pulls live data at runtime. No static YAML files that drift from reality.
Layer 2 — Validation: Nornir task runs read-only state collection against the network. Output parsed via NAPALM getters or Genie parsers. Results validated against Pydantic models. Diffs generated against expected state from source of truth.
Layer 3 — Pre-flight testing: Batfish analyzes proposed configuration changes for correctness before anything touches a device. This is the most important layer most teams skip.
Layer 4 — Change execution: Ansible playbooks push validated configurations. The declarative model works well here. Change records auto-generated. The execution layer is the one place Ansible's "task list" model maps cleanly to operational reality.
Layer 5 — Verification: Post-change Nornir run confirms state matches intent. Automated diff against pre-change snapshot. Human gets a structured report, not a wall of terminal output.
This is not a theoretical architecture. It's the pattern Ivan Pepelnjak, Network to Code, and teams like the one building Nautobot have been converging on. The tools each occupy the layer where they have structural advantages.
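The control flow of the five layers can be sketched as a pipeline in which each stage is able to stop the change. Every callable here is a hypothetical placeholder: in a real deployment they would wrap NetBox or Nautobot, Nornir, Batfish, and an Ansible run respectively, and none of those integrations is shown. The in-memory `device` dictionary and the pre-flight rule are toy stand-ins.

```python
from typing import Callable


def run_change(
    load_intent: Callable[[], dict],       # Layer 1: source of truth
    collect_state: Callable[[], dict],     # Layers 2 and 5: read-only collection
    preflight_ok: Callable[[dict], bool],  # Layer 3: pre-flight analysis
    push: Callable[[dict], None],          # Layer 4: change execution
) -> dict:
    """Run one change through all five layers and return a structured report."""
    intent = load_intent()
    before = collect_state()
    delta = {k: v for k, v in intent.items() if before.get(k) != v}
    if not delta:
        return {"status": "no-op", "changed": [], "drift": []}
    if not preflight_ok(delta):
        return {"status": "rejected", "changed": [], "drift": []}
    push(delta)
    after = collect_state()
    drift = sorted(k for k, v in intent.items() if after.get(k) != v)
    status = "verified" if not drift else "drift"
    return {"status": status, "changed": sorted(delta), "drift": drift}


# Toy stand-ins: a dict plays the device, another plays the source of truth.
device = {"ntp": "192.0.2.1", "syslog": "192.0.2.20"}
intent = {"ntp": "192.0.2.10", "syslog": "192.0.2.20"}

report = run_change(
    load_intent=lambda: dict(intent),
    collect_state=lambda: dict(device),
    preflight_ok=lambda delta: all(delta.values()),  # toy rule: no empty values
    push=device.update,
)
print(report)  # {'status': 'verified', 'changed': ['ntp'], 'drift': []}
```

Passing each layer in as a callable keeps the orchestration testable without any network access, which is how the toy example above runs.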
Migration Paths
From pure Ansible to hybrid: Identify the workflows where Ansible's limitations are actually costing you time. Usually it's data collection and transformation, not config push. Start by replacing Ansible facts gathering with Nornir. Keep the Ansible execution layer. Add Batfish for pre-flight validation. This is a three-month project for a motivated engineer, not a two-year migration.
From custom scripts to Nornir: If you have a collection of Netmiko scripts, the migration to Nornir is straightforward. Nornir's nornir-netmiko plugin uses Netmiko under the hood. Wrap existing connection logic in Nornir tasks, move inventory to structured YAML or a NetBox plugin, add the thread pool runner. You get parallelism essentially for free.
From Ansible to EDA: If your team is invested in Ansible but wants reactive/event-driven patterns, EDA is the path of least resistance. The Zabbix EDA integration, NetBox webhook triggers, and Kafka source plugins are all production-ready as of early 2026. This extends Ansible without requiring a Python rewrite.
Sources: Ivan Pepelnjak (ipSpace.net), Network to Code, NetBox Labs, Nautobot
Where It's Going
AI as the New Abstraction Layer
The tools above are the execution layer. What's changing rapidly is the intent layer above them. Ansible Lightspeed (generally available for Red Hat subscribers as of early 2026) and IBM watsonx integration let engineers describe what they want in plain language and get YAML playbook suggestions in VS Code. This is not magic — the generated YAML still needs review and testing — but it meaningfully lowers the barrier for less experienced engineers.
More interestingly, Nornir's Python-native model turns out to be a better substrate for AI-assisted automation than Ansible's DSL. When an LLM generates automation logic, that logic is usually Python. Plugging LLM-generated Python tasks into a Nornir runner, validating them with Pydantic, and running them through Batfish before execution is a coherent safety architecture. Plugging LLM-generated YAML into Ansible's config module pipeline — especially given the current breakage — is not.
The Itential FlowAI and Cisco Agentic Workflows for Meraki announcements earlier this month point to the same direction: natural language intent maps to a governed workflow execution engine, with the automation framework (Ansible, Nornir, or custom) as the implementation layer underneath. The abstraction is moving up, not down.
Nornir's Async Future
Nornir 3.x's plugin model was the necessary prerequisite for what comes next: truly async execution. The underlying scrapli library, which nornir-scrapli wraps, already supports asyncio. Teams at scale are starting to build Nornir-based automation that runs hundreds of concurrent device connections without the overhead of Python threads. For very large inventories (think 2,000+ device service provider scale), async Nornir is meaningfully different from threaded Nornir, and the ecosystem is maturing to support it.
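What async execution buys is many in-flight sessions on a single thread. A minimal standard-library sketch: `fetch_version` is a hypothetical coroutine that simulates device I/O with `asyncio.sleep` instead of opening a real connection, and the concurrency cap of 100 is an arbitrary illustration.

```python
import asyncio


async def fetch_version(host: str, limit: asyncio.Semaphore) -> tuple[str, str]:
    """Hypothetical per-device coroutine; the sleep stands in for network I/O."""
    async with limit:
        await asyncio.sleep(0.01)
        return host, "simulated-version"


async def collect(hosts: list[str], max_concurrent: int = 100) -> dict[str, str]:
    """Run one coroutine per host, capping how many are in flight at once."""
    limit = asyncio.Semaphore(max_concurrent)
    pairs = await asyncio.gather(*(fetch_version(h, limit) for h in hosts))
    return dict(pairs)


hosts = [f"pe{n}" for n in range(2000)]
versions = asyncio.run(collect(hosts))
print(len(versions))  # 2000
```

The semaphore matters as much as the event loop: without a cap, 2,000 simultaneous sessions can overwhelm jump hosts, AAA servers, or the devices themselves.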
The Ansible Fork Question
The Ansible community fragmentation question is real. Ansible AWX (the open-source Tower), EDA (open source), and the community ansible-core are one track. Ansible Automation Platform is the Red Hat commercial product. The network module crisis documented by Pepelnjak suggests the open-source networking track is effectively in maintenance mode — maybe less. Teams building new network automation in 2026 should not assume ansible.netcommon will be a stable foundation without testing every release.
The Tools That Will Matter in 24 Months
The tools to watch:
- Nautobot's Nornir integration (nautobot-app-nornir): direct pipeline from source of truth to Nornir execution. Note that the dispatcher_mapping is being removed in the next major version; migrate to network_driver now.
- pyGNMI and gNMIc for model-driven telemetry and configuration via gNMI: the path forward for vendors with mature OpenConfig/YANG support. This bypasses both Ansible and Nornir's SSH assumptions entirely.
- Scrapli-async as a Netmiko replacement for teams building new automation at scale.
- Batfish as CI/CD infrastructure, not just a testing tool — continuous pre-change validation in every automation pipeline.
- LLM-assisted task generation in Nornir workflows — not as a replacement for engineers, but as an accelerator for boilerplate task writing. The pattern of "generate, review, test, commit" is already working for teams that have built the validation infrastructure.
Sources: Red Hat (Ansible Lightspeed), Itential, Cisco, Nautobot, Network to Code
Further Reading
- "Has Ansible Team Abandoned Network Automation?", Ivan Pepelnjak, ipSpace.net (December 2025). The canonical reference for the src parameter breakage and what it signals about Red Hat's investment in open-source network automation. Required reading before starting any new Ansible-based project.
- "Nornir vs. Ansible: How to Choose?", NetBox Labs Blog. Practical decision framework with specific selection criteria. Includes integration points with NetBox inventory for both tools.
- "Ansible vs Nornir: How Python Makes Automation Easier", CBT Nuggets. Good explanation of the architectural differences for engineers who are stronger on networking than Python.
- "Network Automation Tools Comparison — Ansible vs Terraform vs Python vs Nornir", Networkershome. Broader landscape view including where Terraform fits (it doesn't replace Ansible/Nornir for device-level automation, but the confusion is common).
- "Ansible vs Nornir Speed Challenge", Networklore. The benchmark that quantified the roughly 100x performance differential. Methodology matters for interpreting the number, but the directional finding is robust.
- "Automation Tools: Paramiko, Netmiko, NAPALM, Ansible, Nornir or...?", APNIC Blog (2023). Still the clearest explanation of the full Python networking library stack from first principles. Slightly dated but foundational.
- "Introduction to Event-Driven Ansible and Nautobot", Network to Code Blog. The practical EDA integration pattern that makes Ansible's reactive story concrete. Good if your team is Ansible-first and wants to extend rather than rewrite.
- "How to Use Nornir as a Python Alternative to Ansible", OneUptime Blog (March 2026). Recent walkthrough of migrating from Ansible thinking to Nornir thinking. Good for engineers making the transition.
Footer
Topic: Ansible vs Nornir vs custom Python, choosing the right automation layer
Edition: Saturday Deep Dive, April 25, 2026
Searches used: 5 (+ 2 deep-fetch fetches)
Quality score: 4/5. Strong research depth, concrete decision framework, actionable migration paths. Slight deduction: direct benchmark data for Nornir 3.x specifically in 2025-2026 was not available from primary sources; the 100x figure is community-reported and directionally reliable but not from a controlled recent study.