When qualifying new HW platforms or software releases in our XT³Lab facilities we typically go through a very long list of individual test steps for the device under test (DUT). Just recently we performed qualifications for the PTX10001-36MR which included dozens of individual tests and test scenarios. One of these is called “BGP Control Plane / Data Plane Convergence” (CP/DP Convergence).
In this article we focus on this specific scenario and the details of CP/DP testing. We also present results for other devices in Germany’s largest Juniper technology and test lab, which we put through the same CP/DP convergence tests to make the results comparable and to put the PTX results into perspective.
Processing of BGP full tables on the control plane/data plane of modern routers
When running the Border Gateway Protocol (BGP) in the global internet, for example in a service provider network, routers must handle so-called BGP full tables. A BGP full table covers all existing internet routes and destinations. According to https://bgp.potaroo.net/ as well as the BGP full tables received in the XT³Lab from partners and carriers, an IPv4 internet full table can today contain roughly 920,000 entries and an IPv6 full table up to 155,000 entries.
A router keeps all of these entries in memory on the control plane. In order to forward packets without any additional interaction of the control plane, only the best path for each destination is selected and installed in the data plane. This process is very similar for each modern hardware-based routing platform and happens constantly as the internet is always in motion and routes are announced, withdrawn and/or updated all the time.
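The split between the two planes can be sketched in a few lines of Python. This is a deliberately simplified model and not how any router implements it: the names `Path`, `select_best`, `build_fib` and `withdraw_peer` are hypothetical, the peers and prefixes are made up, and best-path selection is reduced to comparing local preference only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    peer: str        # neighbor the route was learned from (hypothetical name)
    local_pref: int  # higher value wins in this simplified model


def select_best(paths):
    """Pick the path with the highest local preference (None if no path is left)."""
    return max(paths, key=lambda p: p.local_pref, default=None)


def build_fib(rib):
    """Data plane view: exactly one next hop per prefix, taken from the best path."""
    fib = {}
    for prefix, paths in rib.items():
        best = select_best(paths)
        if best is not None:
            fib[prefix] = best.peer
    return fib


def withdraw_peer(rib, peer):
    """Return a copy of the RIB without any path learned from `peer`."""
    return {
        prefix: [p for p in paths if p.peer != peer]
        for prefix, paths in rib.items()
    }


# Control plane view: every received copy of a route is kept.
rib = {
    "198.51.100.0/24": [Path("PeerA", 200), Path("PeerB", 100)],
    "203.0.113.0/24": [Path("PeerA", 200), Path("PeerB", 100)],
}

print(build_fib(rib))                          # only the best path per prefix
print(build_fib(withdraw_peer(rib, "PeerA")))  # after PeerA withdraws its routes
```

In this model every withdrawn path forces a new best-path decision and a data plane update for the affected prefix, which is the work a real router has to do for each routing change.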
CP/DP Convergence – why does this matter?
If a router receives BGP full tables from multiple sources, it needs to store all copies in the control plane, make a best path decision and update/install only the best path into the data plane. This is not a huge problem during regular operation as long as only a handful of prefixes have to be changed at any given time.
But what if a router has been receiving BGP full tables from two upstream peers and one of the peers suddenly fails and withdraws all 920k IPv4 and 155k IPv6 routes? In this case, how long would it take to process all these updates and program the changes into the data plane? This is exactly what the CP/DP Convergence tests try to measure!
CP/DP Convergence Testing Methodology
Before diving into the details, let’s clarify a couple of things:
- The tests described below are artificial and do not represent a “real” full table in terms of size and distribution of prefix lengths.
- It is a unidimensional test only and does not consider effects of e.g., BGP policy or other events happening on the router at the same time.
- There are features and config options to ensure faster convergence for specific failure scenarios or setups. This test specifically focusses on the time it takes for both the control and data plane to converge without any optimizations or help from other protocols. We are also not cutting or disabling any physical interfaces, which would lead to immediate packet loss.
- The measured times seen later can be considered the “best” possible outcome for updating the data plane in this test scenario.
1. Lab Setup
To perform the CP/DP convergence tests covered in this article we will utilise Keysight IXIA devices. The device under test (DUT) will be placed in the middle and will be connected to two simulated BGP carriers on the right side. Each of the carriers will announce the same BGP full table (900k IPv4 /24 routes) towards the DUT. The DUT will be configured (via BGP policy using local-pref) to prefer the routes of IXIA Carrier1 over the routes of IXIA Carrier2. A simulated customer router called IXIA Cust on the left of the DUT will send a single IP prefix (IPv4 /24).
2. Test Baseline
Now traffic will be sent from the customer /24 IP prefix towards each of the carriers’ 900k /24 IP prefixes (using one IP address of each prefix). This results in 900k unique IP traffic flows. We will ensure that the data rates (especially the packets-per-second rates) are high enough that each prefix receives traffic at least once per second (preferably more often).
As the routes of Carrier1 will be preferred, all traffic will be sent towards Carrier1 during normal operation. This will be the baseline of our test in which we are not expecting any packet loss.
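The requirement that each of the 900k flows is hit at least once per second translates directly into a minimum packet rate. The following sketch works through that arithmetic. The frame size is an assumption, since the article does not state which packet size was used: 64-byte Ethernet frames plus 20 bytes of preamble and inter-frame gap on the wire. The function names are hypothetical.

```python
def min_packet_rate(flows, packets_per_flow_per_second=1):
    """Minimum packets per second so that every flow is hit the requested number of times."""
    return flows * packets_per_flow_per_second


def wire_rate_bps(pps, frame_bytes=64, overhead_bytes=20):
    """Bits per second on the wire for a given packet rate.

    frame_bytes and overhead_bytes are assumptions (minimum-size Ethernet
    frame plus preamble and inter-frame gap), not values from the test setup.
    """
    return pps * (frame_bytes + overhead_bytes) * 8


pps = min_packet_rate(900_000)
print(pps)                 # 900000 packets per second
print(wire_rate_bps(pps))  # 604800000 bit/s, roughly 0.6 Gbit/s
```

Sending each flow more often than once per second raises the rate linearly and at the same time improves the time resolution with which the switchover of an individual prefix can be observed.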
3. BGP Withdraw
Using the predefined IXIA test capability, we will now withdraw all 900k IPv4 routes from Carrier1. This is done using the appropriate BGP messages, which means the DUT will need to
- process the withdraws on its control plane
- make new best path decisions towards Carrier2
- update the data plane to forward traffic towards Carrier2
While the device is converging, some traffic will still reach Carrier1 because the corresponding data plane entries have not yet been updated, while other traffic will already reach Carrier2 because its data plane entries have already been updated. The DUT is considered fully converged once all traffic is being received by Carrier2 and no traffic is reaching Carrier1 anymore.
Overall, we do not expect any end-to-end packet loss as we are not cutting any interfaces and should not lose any in-flight packets.
The time from the moment of the control plane event on Carrier1 (withdrawal of all routes) to the moment all traffic is being received at Carrier2 is what we report as the CP/DP Convergence time for a DUT in direction 1.
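The convergence criterion described above can be expressed as a small function over receive-rate samples. This is only a sketch of the idea: the traffic generator reports this value itself, and the sample format `(timestamp, rx_old_pps, rx_new_pps)`, the function name and the sample values are made up for illustration.

```python
def convergence_time(samples, event_time, offered_pps):
    """Seconds from the control plane event until traffic has fully moved.

    samples: (timestamp_s, rx_old_pps, rx_new_pps) tuples sorted by timestamp,
             a hypothetical format chosen for this sketch.
    The DUT counts as converged from the first sample after which the old
    carrier receives nothing and the new carrier receives the full offered rate.
    """
    converged_at = None
    for t, rx_old, rx_new in samples:
        if t < event_time:
            continue
        if rx_old == 0 and rx_new >= offered_pps:
            if converged_at is None:
                converged_at = t
        else:
            converged_at = None  # traffic still (or again) split between carriers
    return None if converged_at is None else converged_at - event_time


# Illustrative samples, one per second; not measured data.
samples = [
    (0, 900_000, 0),
    (1, 900_000, 0),        # withdraw is sent at t = 1
    (2, 600_000, 300_000),
    (3, 200_000, 700_000),
    (4, 0, 900_000),
    (5, 0, 900_000),
]
print(convergence_time(samples, event_time=1, offered_pps=900_000))  # 3
```

The resolution of such a calculation is limited by the sampling interval, which is one more reason to keep the per-prefix packet rate high.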
4. Recovery
We will now perform a recovery test. This time the simulated Carrier1 starts re-announcing the 900k IPv4 prefixes towards the DUT. As the DUT is configured to prefer the routes of Carrier1, it will immediately start to shift traffic back to Carrier1.
The time from the moment of the control plane event on Carrier1 (re-announcement of all routes) to the moment all traffic is being received on Carrier1 is what we report as the CP/DP Convergence time for a DUT in direction 2.
This second test typically takes slightly longer than the first, as the BGP announcements must be parsed and processed before the new path options are usable, whereas in the first direction a pre-learned alternative was already available and could be used immediately.
How did we test and measure?
We decided that each DUT will undergo at least three full runs of tests. Each test run included one event of withdrawing all routes on Carrier1 (see direction 1 above) and one event of re-announcing the routes from Carrier1 (see direction 2 above).
We calculated the mean time for each direction across all three runs: the mean of withdraw 1, withdraw 2 and withdraw 3, and the mean of re-announce 1, re-announce 2 and re-announce 3. To make it a bit easier, we then also calculated the mean time across both test directions (withdraw and re-announcement) to have a single mean convergence time for each device.
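As a minimal sketch, the averaging could look like this in Python. The function name `summarize` is hypothetical and the six timings are illustrative placeholders, not results from the lab.

```python
from statistics import mean


def summarize(withdraw_times, reannounce_times):
    """Mean per direction plus a single mean across both directions (seconds)."""
    withdraw_mean = mean(withdraw_times)
    reannounce_mean = mean(reannounce_times)
    return {
        "withdraw_mean_s": withdraw_mean,
        "reannounce_mean_s": reannounce_mean,
        "overall_mean_s": mean([withdraw_mean, reannounce_mean]),
    }


# Three runs per direction; placeholder values, not measured data.
result = summarize([10.0, 12.0, 11.0], [13.0, 15.0, 14.0])
print(result)
```

Because both directions contribute the same number of runs, the mean of the two direction means is identical to the mean over all six individual measurements.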
As we are sending 900k prefixes, we can also calculate the pfx/s rate per DUT by dividing the number of prefixes by the convergence time. This gives an overall number of how many routes can be processed and updated per second, which allows estimating the convergence times for smaller and larger routing tables.
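The pfx/s calculation and the extrapolation to other table sizes can be sketched as follows. The convergence time of 12.5 seconds is a placeholder and not a result from this article, the function names are hypothetical, and the estimate assumes that convergence time scales linearly with the number of prefixes, which is a simplification.

```python
def prefixes_per_second(prefix_count, convergence_seconds):
    """Rate of prefixes processed and programmed per second."""
    if convergence_seconds <= 0:
        raise ValueError("convergence time must be positive")
    return prefix_count / convergence_seconds


def estimate_convergence_seconds(prefix_count, rate_pfx_per_s):
    """Estimated convergence time for another table size, assuming linear scaling."""
    return prefix_count / rate_pfx_per_s


# Placeholder: 900k prefixes converging in 12.5 s (not a measured value).
rate = prefixes_per_second(900_000, 12.5)
print(rate)  # 72000.0 pfx/s

# Estimate for a smaller table of 700k prefixes at the same rate.
print(round(estimate_convergence_seconds(700_000, rate), 2))  # 9.72
```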
CP/DP Convergence test results of various Juniper devices
Running Germany’s biggest Juniper Networks Technology & Test Lab provides the benefit of having almost the whole Juniper portfolio of devices available that are able to hold BGP full tables. Accordingly, our DUTs are spread across a wide range of devices and families*:
- SRX Series (SRX345, SRX380, SRX1500, SRX4100, SRX5600 (IOC3))
- QFX Series (QFX10002-36Q, QFX10002-60C)
- PTX Series (PTX10001-36MR, PTX10003-80C)
- MX Series (MX80, MX104, MX204, MX10003, MX480 (MPC5/MPC7), MX10008 (LC2101))
- ACX Series (ACX7100-48L)
* Note: We are well aware that some of the devices tested are not officially meant to be run in a BGP full table role, e.g. the branch SRX devices. We included them in this testing mainly because they were available and because some networks tend to use devices in roles that are not officially supported.
The test results of the systems are summarized in the following graphs for direct comparison. Overall, we consider the following facts worth highlighting:
- The fastest convergence times are seen in the x86-routing-engine-based MX series running Juniper Trio series ASICs.
- Closely following are the QFX/PTX devices running the Juniper Express series ASICs.
- The slowest convergence times are seen on software-based platforms (like SRX345/SRX380) and MX series devices based on PPC routing engines (MX80/MX104).
- The mid-range x86-based SRX1500 sits somewhere between the slowest and the fastest devices.
- An interesting note is that the x86-based SRX4100 is in the same region as the Juniper Trio- and Express-based devices mentioned above.
Reality-Check: What do these times mean?
Since we have been performing CP/DP tests in the XT³Lab for a couple of years and had already qualified some of the platforms on earlier software versions, we can state that they have improved dramatically over time and will probably continue to improve with future JUNOS releases.
As an example, we have a couple of older measurements from a Juniper MX480 using MPC5, collected over the years, from which you can see the increase in pfx/s across JUNOS versions.
One thing to note here is that all measurements marked in blue were taken with a smaller full table size of 700k IPv4 routes. The green measurement is the one taken as part of this article. As we calculate the means and break them down into pfx/s, the measurements should still be comparable. So, what we can see here is that optimization in software (running on the same hardware) can greatly help to speed up convergence.
Is this the only relevant number for convergence? Obviously not. As mentioned above, this is just one of many tests we perform during platform qualification, and it only tests unidimensional scaling. Furthermore, it does not take into account any additional optimizations (e.g., BGP PIC Edge) for specific scenarios.
But as all devices had to perform the same tasks, it still gives us an indication about the general behaviour of a specific platform or series and the performance to expect from these devices – some details that can be very helpful in selecting the right system per role in your network.
CP/DP Convergence Testing – what’s next?
As new devices are being released constantly, we will update or follow up on this article once they become available in the XT³Lab. We are planning to perform the same tests on the MX304 and some members of the ACX7000 family! So, stay tuned!
To find out more about testing the PTX10001-36MR go to Juniper Networks PTX10001 tested as a Peering / Edge Router.