
Blog

SR Linux CLI: Wildcards and Ranges

The SR Linux Command Line Interface (CLI) stands out as one of the most advanced and user-friendly CLI systems I've encountered. It breaks away from the conventional "industry standard" and introduces several innovative concepts that greatly enhance the ease of configuring and managing the network operating system. Among these innovations are "CLI wildcards and ranges," which, once mastered, can significantly improve your overall experience and efficiency.

The idea of using wildcards and ranges is not novel, as some CLI systems already include support for them. Nevertheless, SR Linux takes the concept of ranges and wildcards one step further, and in this post, we will explore how to harness their power effectively.

Exposing Kubernetes Services to SR Linux-based IP Fabric with Anycast Gateway and MetalLB

In the era of applications, it is easy to forget about the underlying infrastructure that interconnects them. However, the network is still the foundation of any application as it provides the connectivity and services that applications rely on.

Kubernetes, the most popular container orchestration system, is no exception to this rule. It depends on the underlying infrastructure for several reasons:

  1. DC fabric: Almost every k8s cluster leverages a DC fabric underneath to interconnect worker nodes.
  2. Communication Between Services: Kubernetes applications are often composed of multiple microservices that need to communicate with each other. A well-designed network infrastructure ensures reliable and efficient communication between these services, contributing to overall application performance.
  3. Load Balancing: Kubernetes distributes incoming traffic across multiple instances of an application for improved availability and responsiveness. A robust network setup provides load balancing capabilities, preventing overload on specific instances and maintaining a smooth user experience.
  4. Scalability and Resilience: Kubernetes is renowned for scaling applications up or down based on demand. A resilient network infrastructure supports this scalability by efficiently routing traffic and maintaining service availability even during high traffic periods.

Getting familiar with all these features is vital for any network engineer working with a fabric supporting a k8s cluster. Wouldn't it be great to have a way to get into all of this without the need of a physical lab?

In this blog post we will dive into a lab topology that serves as a virtual environment to test the integration of a Kubernetes cluster with an IP fabric. The emulated fabric topology consists of an SR Linux-based Clos fabric with the Kubernetes cluster nodes connected to it. The k8s cluster features a MetalLB load balancer that unlocks the capability of announcing deployed services to the IP fabric.

Throughout the lab, we will explore the way k8s services are announced to the IP fabric, and how L3 EVPN service with Anycast Gateway can be leveraged to create a simple and efficient overlay network for external users of the k8s services.

As for the tooling used to bring up the lab, we will use Minikube to deploy a personal virtual k8s cluster, while Containerlab will handle the IP fabric emulation and the connection between both environments.

Finding misconfigurations in your fabric using pyGNMI

Git Repo

Today, I'm sharing another piece of my experience from the NANOG88 conference, where I had the privilege of presenting a tutorial featuring pyGNMI, a powerful tool for diagnosing network issues. During my talk, I used pyGNMI to visualize EVPN Layer 2 and Layer 3 domains, sorting them by switch or network instance. I also added a special feature that detects discrepancies in the settings between different switches using the same EVPN domain – a great way to catch typos in your BGP/VXLAN settings.
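The discrepancy check described above can be sketched in a few lines of plain Python. This is not the script from the talk, only an illustration of the idea: the function name `find_discrepancies`, the shape of the input dictionary, and the sample values are all assumptions, and the input stands in for data already retrieved with pyGNMI and parsed.

```python
from collections import defaultdict


def find_discrepancies(per_switch):
    """Compare EVPN settings across switches, grouped by EVPN domain.

    per_switch: {switch: {evi: {setting: value}}}  (hypothetical shape)
    Returns {evi: {setting: {value: [switches]}}} for settings whose
    value is not identical on every switch that reports it.
    """
    seen = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for switch, domains in per_switch.items():
        for evi, settings in domains.items():
            for key, value in settings.items():
                seen[evi][key][value].append(switch)

    report = {}
    for evi, settings in seen.items():
        for key, values in settings.items():
            if len(values) > 1:  # more than one distinct value: mismatch
                report.setdefault(evi, {})[key] = {
                    value: sorted(switches) for value, switches in values.items()
                }
    return report


# Made-up sample data: leaf2 has a typo in its import route target.
fabric = {
    "leaf1": {100: {"vni": 100, "import-rt": "target:65000:100"}},
    "leaf2": {100: {"vni": 100, "import-rt": "target:65000:101"}},
    "borderleaf1": {100: {"vni": 100, "import-rt": "target:65000:100"}},
}
mismatches = find_discrepancies(fabric)
print(mismatches)
```

Grouping switches by value makes the odd one out easy to spot. Note that this sketch only flags differing values; a setting that is missing entirely on one switch would need an extra check.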

Note

The script demonstrates how to use pyGNMI to retrieve BGP EVPN information from a list of routers. It then formats the data for easy viewing.
For real-world use cases, you would likely wrap pyGNMI with Nornir and leverage Nornir's inventory and task management capabilities, as shown here.

For this demonstration, I leveraged containerlab and Nokia SR Linux to build a VXLAN-EVPN Fabric, replicating a typical configuration I often use in my Kubernetes labs. I incorporated eBGP for underlay communication, and the topology I utilized comprised two spines, two leaf switches, and a border leaf.

In this blog post we are going to dive into the details of the script, discovering how it works and what it is capable of. If you want to try it out yourself, you can find the source code in the pygnmi-srl-nanog88 repo

Intent-based fabric management with Ansible

Tutorial: Intent-based management with Ansible

Ansible is today the lingua franca for many network engineers to automate the configuration of network devices. Due to its simplicity and low entry barrier, it is a popular choice for network automation that features modular and reusable automation tasks available to network teams.

Broadly speaking, there are two common approaches to network automation with Ansible:

  1. Smaller, per-device configuration management using Ansible modules
  2. Broader, more generic per-service/role (or even per-fabric) configuration management using higher-level Ansible abstractions like roles and custom modules.

The first approach is the most common and straightforward one, as it is easy to get started with and requires little to no development skills. Just take the off-the-shelf module provided by the Ansible community or a vendor and start moving configuration tasks from the CLI snippets saved in a notebook to a playbook.
While sounding simple, this approach can become a maintenance nightmare as the number of devices and configuration tasks grows. The playbook will become a long list of tasks that are hard to maintain and reuse.

This is when the second approach comes into play. It requires a deeper understanding of Ansible concepts, but it is more scalable and maintainable in the long run. The idea is to abstract the configuration tasks into reusable Ansible roles and use variables to pass the configuration parameters to the roles. This way, the playbook becomes a list of roles that are applied to the devices in the inventory.

When roles are designed so that services are provisioned on all the devices in the inventory, the playbook becomes an intent-based service provisioning tool. To provide a practical example of managing the configuration of an SR Linux fabric with the intent-based approach and the official Ansible collection for SR Linux, we created a comprehensive tutorial that covers, from A to Z, the steps required to start managing a fabric in that way - Intent-based management with Ansible tutorial.
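To make the idea of an intent more concrete, here is a minimal Python sketch of the underlying pattern: one fabric-wide service description is expanded into per-device parameters. It is not taken from the tutorial or from the nokia.srlinux collection; the function `expand_intent`, the inventory layout, and every key name are hypothetical.

```python
def expand_intent(intent, inventory):
    """Render one fabric-wide service intent into per-device parameters.

    intent and inventory use a made-up schema for illustration only.
    """
    configs = {}
    for device, facts in inventory.items():
        # Only devices whose role is targeted by the intent get the service.
        if facts["role"] not in intent["roles"]:
            continue
        configs[device] = {
            "network-instance": intent["name"],
            "vlan": intent["vlan"],
            "interfaces": list(facts["access_ports"]),
        }
    return configs


# Hypothetical inventory and a single service intent.
inventory = {
    "leaf1": {"role": "leaf", "access_ports": ["ethernet-1/1"]},
    "leaf2": {"role": "leaf", "access_ports": ["ethernet-1/1", "ethernet-1/2"]},
    "spine1": {"role": "spine", "access_ports": []},
}
intent = {"name": "tenant-red", "vlan": 10, "roles": ["leaf"]}

configs = expand_intent(intent, inventory)
for device, params in configs.items():
    print(device, params)
```

The operator edits only the intent; which devices are touched and what each one receives is derived from the inventory, which is the same division of labor that roles and variables provide in a playbook.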

We are eager to hear your thoughts on that approach and the tutorial itself. Please drop a comment below or open an issue in the GitHub repository if you have any questions or remarks.

Single Tier Datacenters - Evolving Away From Multi-chassis LAG

Multi-Chassis LAG (MC-LAG) was a welcome technology that helped enterprises move away from xSTP based L2 networks. It solved many of the issues inherent to xSTP networks, like underutilized links, long convergence times, and layer 2 loops. It became a common design pattern in many datacenters at the access and aggregate layers.

And as with any other technology, MC-LAG started to show its deficiencies as the datacenter networks continued to evolve. The need for a more scalable, interoperable and simpler solution led to the development of EVPN Multihoming (EVPN-MH) design. In this blog post we will discuss how a small-scale, single-rack datacenter deployment can benefit from EVPN-Multihoming-based design.
We finish the post by introducing a path to scale up from a single-rack deployment to a multi-rack deployment.

NANOG88: gNMI and ChatGPT To Troubleshoot EVPN Datacenter Fabrics

NANOG88

During the NANOG88 conference, I had the privilege of presenting a tutorial, and I am pleased to say that it is now available to a wider audience on YouTube.

NANOG88 provided a remarkable platform for knowledge exchange and collaboration among esteemed professionals in the field of networking and Internet operations. I was fortunate to have the opportunity to contribute to this extraordinary event by delivering this tutorial.

Are you interested in learning about EVPN-VXLAN technologies for Datacenters and creating a virtual network lab using containerlab? Then check this tutorial where we will guide you through the process.

We will start by installing the requirements for Python scripting with libraries like pyGNMI, a powerful tool for operating and troubleshooting network elements over gRPC, and we will do so with the help of ChatGPT. We will also show how to configure many network elements at once using Go Templates and gNMIc. By the end of the tutorial, you will have tips and tricks to perform various network automation tasks in your datacenter network environment and troubleshoot any issues that arise.

This tutorial is suitable for both experienced network engineers and beginners who want to enhance their knowledge of network design and operation with tools like gNMI, Python and ChatGPT. The information is available at Repo.

Don't miss out on this opportunity to improve your network engineering abilities and take your skills to the next level.

Participants: Mauricio (Mau) Rojas

gNxI Browser - A documentation UI for Openconfig gRPC services

In the past year, there has been a lot of buzz around gRPC and Openconfig services. Network engineers started to hear more g-acronyms: gNMI, gNOI, gRIBI. The bravest ones started to play with them, and those who like to live on the edge even started to use them in production. But the majority of network engineers are still not familiar with these technologies. The lack of tools to explore and understand these new technologies is one of the reasons for this.

You probably know that in srl-labs we strive for quality tools, and we are not afraid to build them ourselves. The famous gnmic, gnoic, gribic by Karim Radhouani are stellar examples of our effort to make gRPC and Openconfig services more accessible to network engineers.

Today we are happy to announce another initiative by our team - gnxi.srlinux.dev - a documentation UI for Openconfig gRPC services. It is a simple web application that allows you to explore Openconfig gRPC services and their protobuf definitions.

We hope that it will help network engineers to get familiar with gRPC and Openconfig services and we wanted to tell you how we built it.

Official Ansible collection for SR Linux

Ever since we released a tutorial that showed how to use Ansible's URI module with SR Linux, we couldn't shake off the feeling that we would need to do more with Ansible. And we did.
We are happy to announce that we have released an official Ansible collection for SR Linux - nokia.srlinux - that contains four modules and leverages the JSON-RPC interface.

In this blog post, we would like to share some details about our design decisions and why we think this collection is a great addition to the Ansible ecosystem.

Event-Driven Automation With Nokia’s SR Linux Event Handler Framework

Packet Pushers

In this Tech Bytes podcast we talk about Event Handler, a new automation feature in Nokia’s SR Linux network OS that lets you automatically run scripts to fix problems when an event occurs.

We discuss:

  • SR Linux’s modular design
  • The new Event Handler Framework
  • How Event Handler works
  • Use cases for network engineers and operations teams

Participants: Roman Dodin

SR Linux logging with ELK

Join the discussion: LinkedIn post · Twitter thread

In the not-so-distant past, manually extracting, parsing, and reading log files produced by network elements was standard practice for a sysadmin. With arcane piping of old-but-good grep, awk, and sed tools, one could swiftly identify a problem in a relatively large system. This was a viable approach for quite some time, but it could not keep up with massive scale.

Today's network infrastructures often count thousands of elements, each emitting log messages. Getting through a log collection of this size with CLI tools designed decades ago might not be the best tactic. Likewise, correlating logs between network elements and application logs might be impossible without software solutions built with such use cases in mind.

The unprecedented growth in the application world boosted the development of multi-purpose centralized/cloud data collectors that make observability and discovery over huge data sets a reality. Elasticsearch / Logstash / Kibana (or ELK for short) is one of the best-known open-source stacks tailored for the collection and processing of various documents, logs included.

To enable the processing of captured logs and deliver performant and robust search analytics, log collectors rely on structured data. Unfortunately, the networking world is infamous for iterating slowly. For example, an outdated and informational Syslog interface still dominates the networking space when it comes to managing and transferring logs. Syslog RFC 3164 was not designed to allow extensible structured payloads, which adds a fair share of problems with integrating such systems with modern log collectors.
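The structure problem is easy to see when parsing a message by hand. The Python sketch below splits a line in the conventional RFC 3164 layout (priority, timestamp, hostname, tag, free-form text) into fields; the regular expression, the `parse_syslog` helper, and the sample log line are made up for illustration and do not reflect the exact format SR Linux emits.

```python
import re

# <PRI>Mmm dd hh:mm:ss HOSTNAME TAG[PID]: free-form message
SYSLOG_3164 = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[^:\[\s]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def parse_syslog(line):
    """Return the header fields of an RFC 3164-style line, or None."""
    match = SYSLOG_3164.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # The priority value encodes facility * 8 + severity.
    pri = int(fields.pop("pri"))
    fields["facility"], fields["severity"] = divmod(pri, 8)
    return fields


# Hypothetical sample line, not captured from a real device.
sample = "<30>Oct  9 22:33:20 leaf1 sr_bgp_mgr[1234]: Peer 10.0.0.1 moved to established"
record = parse_syslog(sample)
print(record)
```

Only the header is recoverable in a generic way. Everything a log collector would want to index, such as the peer address or the new state, is still buried in the `message` string and needs a separate pattern for each message type.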

This post explains how an SR Linux-powered DC fabric can be integrated with a modern logging infrastructure based on the Elasticsearch / Logstash / Kibana stack to collect, transform, handle, and view logs.