You’d never make a change to your network without running it through ample checks first:
- Customer traffic must never bypass the inspection layer.
- EU data must stay in the EU.
- The audit data store must never be reachable from the guest network.
But are you really checking everything?
Even elaborate, continuously maintained Python scripts that rebuild your entire topology from scratch have a blind spot: they only see the network as it exists at the instant the script runs.
If someone adjusts a schema or adds a device minutes later, that change might have unintended consequences no one knows about until you’re all stuck firefighting a midnight outage or dealing with a failed audit.
With Infrahub’s new Graph Traversal capability, you can cover all your bases with reachability checks that assess changes across your entire network in real time, automatically, while a proposed change is still in review.
Below, we explain:
- Why Infrahub Reachability Checks beat out hard-coded scripts
- How to write an automatic Reachability Check
- Applications for multiple network automation use cases
But first, a quick review of reachability checks.
What Is a Reachability Check?
For the purposes of this article, we’re defining reachability checks as:
Rules that assert that one part of your network can reach another only through approved points, never through forbidden ones.
Today, most teams enforce a rule like that in a Python script. It pulls the devices, interfaces, and BGP sessions, rebuilds the topology in memory, walks every source-to-destination path, and applies the rule. In effect, you are rebuilding your network graph in code, by hand.
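To make that concrete, here is a minimal sketch of such a hand-rolled script. Everything in it is hypothetical: the device names, the `LINKS` list standing in for an inventory export, and the helper functions. It checks the "never bypass the inspection layer" style of rule by enumerating every loop-free path and flagging any that skips a required node.

```python
# Hypothetical inventory export: each tuple is a link between two devices.
LINKS = [
    ("guest-sw1", "core1"),
    ("core1", "fw1"),
    ("fw1", "audit-db1"),
]

def build_topology(links):
    """Rebuild an undirected adjacency map from a flat list of links."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def all_simple_paths(graph, source, destination, path=None):
    """Yield every loop-free path from source to destination (depth-first)."""
    path = (path or []) + [source]
    if source == destination:
        yield path
        return
    for neighbor in sorted(graph.get(source, ())):
        if neighbor not in path:
            yield from all_simple_paths(graph, neighbor, destination, path)

def paths_bypassing(graph, source, destination, required):
    """Return the paths that reach destination without crossing `required`."""
    return [
        p for p in all_simple_paths(graph, source, destination)
        if required not in p
    ]

# The rule holds for the snapshot the script was run against.
topology = build_topology(LINKS)
violations = paths_bypassing(topology, "guest-sw1", "audit-db1", required="fw1")
print(violations)

# A link added afterwards is invisible until someone reruns the script.
later = build_topology(LINKS + [("core1", "audit-db1")])
late_violations = paths_bypassing(later, "guest-sw1", "audit-db1", required="fw1")
print(late_violations)
```

The second run only catches the bypass because we rebuilt the topology by hand and ran the check again. Nothing in the script makes that happen on its own.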
But there are a few issues with this approach:
First, some rules don’t make it into code at all. They live in Slack threads, old Confluence diagrams, or the longest-tenured team member’s head.
Second, network paths are always changing, and your scripts have to change, too. Even if you do manage to codify every rule, keeping them enforced on every change – as engineers add devices, reroute traffic, and touch the topology daily – is really, really hard.
And it gets harder the bigger your team gets.
On a small team, the person making a change has mapped out the whole network they’re about to touch. In a large org, an engineer rerouting traffic or swapping an ASN on one team may have no idea they are about to break a path another team is responsible for. They don’t have the full map, let alone all the rules, in their head.
What Makes Infrahub Reachability Checks Different
Infrahub stores your network as a graph: all the devices, sessions, zones, and tenants are queryable objects. Any change to the modeled intent appears in the graph immediately, including a change still under review in a branch.
Our newest capability, Graph Traversal, traces every path between a source and destination and any dependencies between them.
Other tools can trace paths, too, but they surface those paths as a backend API for engineers to script against.
Infrahub puts a rule-authoring UI on top of a live graph, which means any operator (not just a programmer) can define what ‘reachable’ means and have it enforced automatically.
And because Infrahub Reachability Checks run against the proposed change in its branch (that is, against the graph as it would look after merge), engineers see a violation their change would introduce before it reaches production, not after.
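The idea of checking the graph as it would look after merge can be illustrated with a small sketch. This is not Infrahub’s API or implementation. The link sets, device names, and both helper functions are assumptions made up for the example. It overlays a proposed change on the current links and runs the same reachability test against both views.

```python
from collections import deque

# Hypothetical current state: the guest network and the audit store are separate.
MAIN = {("guest-sw1", "guest-gw1"), ("core1", "audit-db1")}
# Hypothetical proposed change: a new link from the guest gateway to the core.
PROPOSED_ADDED = {("guest-gw1", "core1")}

def merged_view(main_links, added=frozenset(), removed=frozenset()):
    """Return the link set as it would look once the change is merged."""
    return (set(main_links) - set(removed)) | set(added)

def reachable(links, source, destination):
    """Breadth-first search over undirected links."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            return True
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

on_main = reachable(MAIN, "guest-sw1", "audit-db1")
after_merge = reachable(
    merged_view(MAIN, added=PROPOSED_ADDED), "guest-sw1", "audit-db1"
)
print(on_main, after_merge)
```

In this toy data, the audit store is unreachable from the guest switch today but would become reachable after the merge, which is exactly the kind of violation you want flagged while the change is still in review.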
Built-In Permissions
Infrahub also lets you control who can define these rules in the first place.
- Automation specialists write scripted checks.
- Operators are granted a rule-authoring permission via Infrahub’s object-level RBAC. They define and refine Reachability Checks: which paths must hold, which transits are forbidden, and how deep to look.
- Network engineers can propose changes to the graph, but they cannot create, edit, or delete Reachability Checks.
Network operators can be sure their rules are being enforced, and engineers can be confident that their proposed changes won’t cause issues down the line.
How to Write an Automatic Reachability Check
If you’d rather watch than read, here’s a full walkthrough.
1. A Network Operator Authors an Automatic Reachability Check
Let’s say they want to assert that ATL1-edge1 can reach JFK1-edge1 via AS 64496, but never via AS 8220.
In Infrahub, they’ll head to Reachability Checks > Add a Reachability Rule, then input:
- The source (ATL1-edge1)
- The destination (JFK1-edge1)
- Max depth (say, 3)
- Max paths (say, 50)
Then, they add their two constraints:
- Must transit: AS 64496
- Must never transit: AS 8220
As soon as they publish this Rule, Infrahub will automatically review every proposed change against it. No need for an engineer to memorize the rule or even know it exists.
And if they modify any part of the Reachability Check (Rules or Constraints), that update will take effect immediately. For example, here’s what happens when they tighten the maximum path depth from 3 to 1:
2. An Engineer Opens a Proposed Change
Reachability Checks run, tracing every path from source to destination and testing each of those paths against every Rule and Constraint in the Check.
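For intuition, here is a rough sketch of what evaluating the example rule could look like. It is not how Infrahub implements the check. The `ReachabilityRule` class, the topology, the ASNs assigned to the edge devices, and the `transit-a` and `transit-b` nodes are all hypothetical. The sketch also makes three assumptions the article does not state: max depth counts hops, every path found must satisfy both constraints, and finding no path at all counts as a failure.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class ReachabilityRule:
    source: str
    destination: str
    max_depth: int      # assumed: maximum number of hops per path
    max_paths: int      # stop enumerating after this many paths
    must_transit: int   # ASN every path has to cross
    never_transit: int  # ASN no path may cross

# Hypothetical data: ASN per device, and neighbors per device.
ASN = {"ATL1-edge1": 64500, "JFK1-edge1": 64501,
       "transit-a": 64496, "transit-b": 8220}
LINKS = {
    "ATL1-edge1": ["transit-a"],
    "transit-a": ["ATL1-edge1", "JFK1-edge1"],
    "JFK1-edge1": ["transit-a"],
}

def find_paths(links, rule):
    """Depth-limited DFS returning up to rule.max_paths loop-free paths."""
    found, stack = [], [[rule.source]]
    while stack and len(found) < rule.max_paths:
        path = stack.pop()
        if path[-1] == rule.destination:
            found.append(path)
            continue
        if len(path) - 1 >= rule.max_depth:  # hop budget used up
            continue
        for nxt in links.get(path[-1], []):
            if nxt not in path:
                stack.append(path + [nxt])
    return found

def evaluate(links, asn, rule):
    """Return a list of violations; an empty list means the check passes."""
    paths = find_paths(links, rule)
    if not paths:
        return [f"no path from {rule.source} to {rule.destination} "
                f"within {rule.max_depth} hops"]
    violations = []
    for path in paths:
        label = " -> ".join(path)
        transit_asns = {asn[node] for node in path[1:-1]}
        if rule.never_transit in transit_asns:
            violations.append(f"{label} transits AS {rule.never_transit}")
        if rule.must_transit not in transit_asns:
            violations.append(f"{label} does not transit AS {rule.must_transit}")
    return violations

rule = ReachabilityRule("ATL1-edge1", "JFK1-edge1", 3, 50, 64496, 8220)
baseline = evaluate(LINKS, ASN, rule)

# Hypothetical proposed change: a second path through AS 8220.
proposed_links = {
    **LINKS,
    "ATL1-edge1": ["transit-a", "transit-b"],
    "transit-b": ["ATL1-edge1", "JFK1-edge1"],
    "JFK1-edge1": ["transit-a", "transit-b"],
}
proposed = evaluate(proposed_links, ASN, rule)

# Tightening the hop budget to 1 leaves no path in this toy topology.
tightened = evaluate(LINKS, ASN, replace(rule, max_depth=1))
print(baseline, proposed, tightened, sep="\n")
```

In this sketch the baseline passes, the proposed change fails because the new path crosses AS 8220 and skips AS 64496, and the tightened rule fails because the only path needs two hops.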
3. The Change Passes or Fails
In the Checks tab of their proposed change, engineers see green for passed tests, red for failed tests. Here’s what it looks like when a proposed change passes:
In the case of a failed proposed change:
- The violation is flagged on the proposed change, so it can’t slip through unnoticed.
- Note: whether this hard-blocks a merge or surfaces as a required failing check depends on your merge policy.
- The engineer can open the Reachability Rule in the UI to see exactly which Constraint their change violated (rules are visible to everyone, even though only operators can edit them).
Here’s an example:
A Whole New Suite of Checks
As you can imagine, Reachability Checks that assess every proposed change against every rule and your live graph are useful for solving a lot of common network engineering problems.
Here are just a few, some taken directly from engineers we talked to about network automation at AutoCon 5 this spring:
- Impact assessment. Before a change merges, you can see which source-to-destination paths it might break.
- Dependency analysis. Map everything a node touches so when something goes down, you know exactly what’s affected.
- Security compliance. Prove every flow crosses the required inspection zone and never takes an unapproved bypass. Define PCI scope, zero-trust boundaries, or whatever else your compliance framework demands as rules, and they get checked on every change.
- Capacity management. Check a change against the capacity and utilization you’ve modeled on each link, so a proposed path won’t be routed over a link that’s already modeled at its limit.
- Path redundancy. Enforce that critical pairs always maintain two disjoint paths. If losing one link would leave you with no failover, you find out before the maintenance window.
- Hop-count bounds. Assert that a destination must be reachable within a specific hop budget along approved low-latency transit.
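As one illustration, the path redundancy case can be sketched in a few lines. The topology and function names below are hypothetical, and this is not Infrahub’s implementation. The sketch relies on a standard graph result: two link-disjoint paths exist between a connected pair exactly when no single link removal disconnects them. So it removes each link in turn and retests reachability. Node-disjoint redundancy would need a different test.

```python
from collections import deque

def reachable(links, source, destination):
    """Breadth-first search over undirected links."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            return True
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def single_points_of_failure(links, source, destination):
    """Return the links whose loss alone would cut source off from destination."""
    return [
        link for link in sorted(links)
        if not reachable(links - {link}, source, destination)
    ]

# Hypothetical topology: two independent paths between the edge routers.
RING = frozenset({
    ("ATL1-edge1", "core1"), ("core1", "JFK1-edge1"),
    ("ATL1-edge1", "core2"), ("core2", "JFK1-edge1"),
})
redundant = single_points_of_failure(RING, "ATL1-edge1", "JFK1-edge1")

# Hypothetical proposed change: decommission the core2 to JFK1-edge1 link.
after_change = RING - {("core2", "JFK1-edge1")}
fragile = single_points_of_failure(after_change, "ATL1-edge1", "JFK1-edge1")
print(redundant, fragile, sep="\n")
```

With both paths in place the list is empty. After the proposed change, both links on the remaining path show up as single points of failure.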
Have more to add? Let us know in the OpsMill community on Discord.
Where to Go Next
Explore our reachability schema repository, or:
- Watch a full deep dive of reachability checks on YouTube, courtesy of yours truly
- Read the Graph Traversal announcement to go deeper into how it works
- Read the Infrahub 1.10 release notes to see everything else we shipped in our latest release