In the past, network designs were typically built around large Ethernet Layer 2 domains spanning the whole data centre, whilst external connectivity was provided by Layer 3 routers, firewalls, load balancers and similar devices attached to the Layer 2 domains. These designs, however, were deployed at a time when there was far less traffic on the network and the number of connected devices was much smaller than today. The immense growth in both bandwidth demand and the need for resiliency now pushes traditional network designs close to their limits, which is where Ethernet VPN Virtual Extensible LAN (EVPN-VXLAN) architectures come into play.
Having talked about the drivers and business relations of modern data centres in our recent blog post focussing on hardware and software in modern data centres, in this article we will have a look at ways to modernize existing data centre infrastructure without the need to transform everything from scratch! It is the start of a multi-part series diving into a technical description of possible use cases and the underlying technology. Further entries will show approaches which may need more radical change to existing environments.
Where we come from: The traditional Layer 2 architecture
Most traditional Layer 2 networks are built according to a multi-tier architecture comprising Border, Core, Distribution and Access tiers, and rely on legacy protocols such as Spanning Tree to work around Ethernet limitations and create loop-free environments.
Ethernet was invented as a single large tree or bus without any redundancy in mind. Over time, as the network became increasingly important, more and more workarounds and optimisations to the traditional Ethernet architecture were required. In addition, Layer 2 domains kept growing, so today these protocols operate under a load and at a scale they were never meant to handle.
Consequently, there are a couple of typical challenges, some of which you probably have already recognised:
- Half of the capacity sits idle because protocols such as Spanning Tree keep redundant links and devices out of active use.
- All Layer 2 end systems and their MAC addresses are reachable everywhere, so every switch needs to know every MAC address.
- Lots of background traffic pollutes the network because maintenance protocols like ARP rely on flooding.
- If VLANs are used for Layer 2 segmentation, they are often spanned across the complete network because nobody knows where they might be needed.
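The first point can be illustrated with a small sketch. It assumes a hypothetical topology of two distribution switches and four dual-homed access switches, and it approximates Spanning Tree with a plain breadth-first tree from a root chosen by hand. Real Spanning Tree elects the root bridge and uses configurable path costs, so this is a simplification under stated assumptions and not a protocol implementation. All names are illustrative.

```python
from collections import deque


def spanning_tree_links(links, root):
    """Return the links kept by a breadth-first tree rooted at `root`.

    This only approximates Spanning Tree: every link has the same cost
    and the root is chosen by the caller instead of being elected.
    """
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)

    visited = {root}
    kept = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for peer in sorted(neighbours[node]):
            if peer not in visited:
                visited.add(peer)
                kept.add(frozenset((node, peer)))
                queue.append(peer)
    return kept


# Hypothetical topology: two distribution switches, four dual-homed access switches.
distribution = ["dist1", "dist2"]
access = [f"access{i}" for i in range(1, 5)]
uplinks = [(d, a) for d in distribution for a in access]
links = [("dist1", "dist2")] + uplinks

kept = spanning_tree_links(links, root="dist1")
blocked_uplinks = [link for link in uplinks if frozenset(link) not in kept]

print(f"{len(blocked_uplinks)} of {len(uplinks)} uplinks are blocked")
```

In this assumed topology each access switch keeps only one of its two uplinks in the tree, which is the situation the "half of the capacity" figure describes.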
Rethinking the data centre architecture towards an EVPN-VXLAN overlay
An approach to mitigate these problems is rethinking the data centre architecture and dividing it into two main areas:
- An Underlay, whose main purpose is to transport data from A to B in a redundant and scalable way without any responsibility for connectivity towards end devices.
- An Overlay, which takes care of mimicking the well-known Layer 2 Ethernet behaviour and at the same time introduces enhancements and optimisations for traffic flows.
By separating the forwarding (Underlay) from the actual service and learning (Overlay) we can scale and enhance each layer independently from the other. Changes and optimisations in the Underlay will not directly impact the way that the Overlay is working and vice versa.
The physical setup for the underlay is changed into what resembles a three-stage folded Clos fabric. The network is built on Spine and Leaf devices, which are fixed commodity switches with routing functionality.
For further information on the inner workings of a Clos fabric visit: https://en.wikipedia.org/wiki/Clos_network
The Underlay uses routing protocols (most commonly BGP) in order to create a loop-free, all-active and high-speed network fabric so that all leaf devices can communicate with each other as fast as possible.
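To make the all-active property more tangible, here is a minimal sketch of a spine and leaf fabric. It assumes every leaf is connected to every spine and that the routing protocol shares traffic across all equal-cost paths. The fabric size (4 spines, 8 leaves) and all function names are arbitrary choices for illustration, not part of any specific design.

```python
from itertools import combinations, product


def build_fabric(num_spines, num_leaves):
    """Connect every leaf to every spine (three-stage folded Clos)."""
    spines = [f"spine{i}" for i in range(1, num_spines + 1)]
    leaves = [f"leaf{i}" for i in range(1, num_leaves + 1)]
    links = set(product(leaves, spines))
    return spines, leaves, links


def equal_cost_paths(src_leaf, dst_leaf, spines, links):
    """List all two-hop leaf-spine-leaf paths between two leaves."""
    return [
        (src_leaf, spine, dst_leaf)
        for spine in spines
        if (src_leaf, spine) in links and (dst_leaf, spine) in links
    ]


spines, leaves, links = build_fabric(num_spines=4, num_leaves=8)
paths = equal_cost_paths("leaf1", "leaf8", spines, links)

# Collect every link that carries traffic for at least one leaf pair.
used_links = set()
for src, dst in combinations(leaves, 2):
    for _, spine, _ in equal_cost_paths(src, dst, spines, links):
        used_links.add((src, spine))
        used_links.add((dst, spine))

print(len(paths), "equal-cost paths between leaf1 and leaf8")
print(len(used_links), "of", len(links), "links are in use")
```

Under these assumptions any two leaves are always exactly two hops apart, the number of parallel paths equals the number of spines, and no link is left idle.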
Within the Overlay, EVPN-VXLAN is used to enable Layer 2 frames to travel across the high-speed routed underlay. The VXLAN encapsulation transports the Layer 2 Ethernet frames across the routed fabric, while EVPN takes care of learning and distributing knowledge about the location and connectivity of end devices.
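The encapsulation step can be sketched in a few lines. The example below builds only the 8-byte VXLAN header as laid out in RFC 7348 (a flags byte with the "VNI valid" bit, a 24-bit VXLAN Network Identifier and reserved fields) and prepends it to an Ethernet frame. The outer Ethernet, IP and UDP headers that a leaf would add are deliberately omitted, and the function names, the VNI value and the dummy frame are assumptions made for illustration.

```python
import struct

VXLAN_UDP_PORT = 4789          # IANA-assigned destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08    # "I" flag: the VNI field carries a valid value
VXLAN_HEADER_LEN = 8


def vxlan_encapsulate(vni, ethernet_frame):
    """Prepend a VXLAN header to a raw Ethernet frame."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit into 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits)
    # Word 2: VNI (24 bits) + reserved (8 bits)
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + ethernet_frame


def vxlan_decapsulate(payload):
    """Return (vni, inner Ethernet frame) from a VXLAN payload."""
    if len(payload) < VXLAN_HEADER_LEN:
        raise ValueError("payload shorter than a VXLAN header")
    word1, word2 = struct.unpack("!II", payload[:VXLAN_HEADER_LEN])
    if not (word1 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return word2 >> 8, payload[VXLAN_HEADER_LEN:]


# Dummy inner frame: destination MAC, source MAC, EtherType, payload.
inner = bytes.fromhex("00005e005301" "00005e005302" "0800") + b"payload"
packet = vxlan_encapsulate(10100, inner)
vni, frame = vxlan_decapsulate(packet)

print(packet[:VXLAN_HEADER_LEN].hex())
print(vni, frame == inner)
```

The inner frame leaves the fabric byte for byte as it entered, which is why end devices do not notice the routed network in between.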
End devices connected to the leaf nodes of the network will now be able to transparently use the new data centre fabric the same way that they were using a traditional Layer 2 Ethernet topology. This is also true for all existing network appliances (firewalls, load balancers) and routers.
For further information on EVPN and VXLAN visit the following webpages of our partners Juniper Networks and Arista.
The benefits of the EVPN-VXLAN Underlay/Overlay approach
This mode of EVPN-VXLAN is called “Bridged Overlay” as we are providing the well-known Layer 2 bridging services using an overlay running on an all-active, routed underlay network.
In general, the new Underlay/Overlay approach provides the following benefits:
- All active underlay network topology; all interfaces, links and devices are used actively during normal operation (Clos Fabric).
- Selective service rollout; Layer 2 reachability is given where it is needed (EVPN-VXLAN).
- Flooding of traffic is greatly reduced as the network is actively learning and distributing knowledge about end systems (EVPN-VXLAN).
- Further reduction of background traffic by e.g. ARP optimization inside the overlay (EVPN-VXLAN).
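The overlay-related benefits in this list can be pictured with a toy model. The class below is a hypothetical stand-in for the end-system knowledge that EVPN distributes between leaves. It is not a BGP EVPN implementation, and every name, address and VNI in it is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class OverlayTable:
    """Toy stand-in for the end-system knowledge that EVPN distributes."""

    mac_to_leaf: dict = field(default_factory=dict)   # (vni, mac) -> leaf
    ip_to_mac: dict = field(default_factory=dict)     # (vni, ip) -> mac
    vni_members: dict = field(default_factory=dict)   # vni -> set of leaves

    def join(self, vni, leaf):
        """Selective rollout: only leaves that need a segment join it."""
        self.vni_members.setdefault(vni, set()).add(leaf)

    def advertise(self, vni, mac, ip, leaf):
        """Record an end system learned on `leaf` for all other leaves."""
        self.join(vni, leaf)
        self.mac_to_leaf[(vni, mac)] = leaf
        self.ip_to_mac[(vni, ip)] = mac

    def forward(self, vni, dst_mac, local_leaf):
        """Return the remote leaves that must receive a frame."""
        known = self.mac_to_leaf.get((vni, dst_mac))
        if known is not None:
            return [] if known == local_leaf else [known]
        # Unknown destination: flood, but only within the segment.
        return sorted(self.vni_members.get(vni, set()) - {local_leaf})

    def answer_arp(self, vni, target_ip):
        """Answer an ARP request locally if the binding is known."""
        return self.ip_to_mac.get((vni, target_ip))


table = OverlayTable()
table.join(10100, "leaf1")
table.join(10100, "leaf2")
table.join(10200, "leaf4")  # leaf4 only serves a different segment
table.advertise(10100, "00:00:5e:00:53:01", "192.0.2.10", "leaf3")

known_targets = table.forward(10100, "00:00:5e:00:53:01", "leaf1")
flood_targets = table.forward(10100, "00:00:5e:00:53:99", "leaf1")
arp_reply = table.answer_arp(10100, "192.0.2.10")

print(known_targets, flood_targets, arp_reply)
```

A frame for a known MAC address goes to exactly one leaf, a frame for an unknown address is flooded only to the leaves that joined the segment, and an ARP request for a known IP address can be answered without leaving the local leaf.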
This setup can serve as a drop-in replacement for existing networks: nothing changes in behaviour for any end device connected to the fabric, while the option remains to roll out additional enhancements in the future.
Meeting challenges with pre-tested and standardised EVPN-VXLAN solutions
Rolling out a new architecture like this, however, introduces technologies and protocols into the data centre space that were originally not used here. Thus, the setup itself might seem complex and hard to understand as it involves multiple layers of technologies.
This is where Xantaro’s “Next-Gen Data Center Solutions” come into play: A set of best practice designs and configurations to provide the best possible network fabric for the specific use cases! The different solutions are compiled into blueprints and continuously improved with the help of an automatic testing framework.
This means that once our engineering team has worked out a specific network design for the EVPN-VXLAN overlay fabric, comprehensive testing begins to verify it in real-world scenarios. A replica of a typical spine and leaf network is created in Xantaro’s XT³Lab, the configuration engineered for the given fabric type is rolled out and finally, the XFAST framework runs pre-defined test cases to verify and qualify the setup.
For the bridged overlay setup described in this article, the test suite currently consists of about 450 test cases performing 2500 individual test steps. Each test run takes about four hours to complete and produces a verbose test report of about 500 pages. By comparison: If an engineer performed all of these tests by hand, it would take around five working days to finish and document everything!
As the automated testing does not need to take any breaks, we can run multiple tests without interruption, which allows up to six complete test runs per day and up to 30 test runs in five days. This in turn ensures that the result of our testing is deterministic and helps us to reduce the likelihood of errors or sporadic failures.
New test cases and test steps are continuously added to the test suite. These additions are based on Xantaro internal engineering decisions or customer input, or they are triggered by real-world problems encountered in production networks. By adding these tests we reduce the likelihood that these problems reoccur.
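The figures above follow from simple arithmetic, sketched here for transparency. The calculation assumes that runs are started back to back around the clock with no setup time in between. The constant names are illustrative and the values are the approximate ones quoted in this article.

```python
HOURS_PER_DAY = 24
HOURS_PER_RUN = 4          # "about four hours" per automated run
MANUAL_DAYS_PER_RUN = 5    # "around five working days" by hand
TEST_CASES = 450
TEST_STEPS = 2500

runs_per_day = HOURS_PER_DAY // HOURS_PER_RUN
runs_in_five_days = runs_per_day * MANUAL_DAYS_PER_RUN
steps_per_case = TEST_STEPS / TEST_CASES

print(runs_per_day, "runs per day")
print(runs_in_five_days, "runs in the time one manual run would take")
print(round(steps_per_case, 1), "test steps per test case on average")
```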
Whenever new hard- or software needs to be qualified, we rerun the tests and ensure that everything is still working according to the expected outcomes, before any changes are rolled out to any production network.
For further information on the benefits of test automation also read: https://www.xantaro.net/en/tech-blogs/advantages-of-software-update-automation
Ergo: old becomes new!
Xantaro’s Next Gen Data Centre Solutions provide an easy and straightforward way to solve the challenges of your existing legacy Layer 2 network and can serve as a drop-in replacement.
- The blueprints are engineered and qualified to give you access to seemingly complex technologies without the need for extensive in-house engineering.
- Automated testing and pre-qualification uncover problems before they impact the production network.