Aug 18, 2014

Growing pains of the Internet global routing table

INAP

Recently, some businesses experienced outages as a result of older routers hitting the default 512k routing table limit. Here at Internap, we have long been aware of “the TCAM problem” and have taken steps to prepare for it, but many companies are now getting caught off guard. As the global routing table continues to grow, there will likely be an increase in routing instability over the next few months or years, and smaller enterprises could learn some very painful lessons.

If a company is humming along with a BGP routing table of 500,000 routes from its Internet provider and a Tier1 provider suddenly adds 15,000 routes to the table, the company is pushed over the 512,000-route limit and everything goes sideways. I expect to see a lot of that happening as we hit the 512,000 threshold; today we are at about 500,000 routes in the global table, which grows by about 1,000 a week on average. (The larger Tier1 providers such as Verizon, AT&T, Level3, etc. largely know about and have planned for this issue. I would be surprised if they experience any impact.)
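To make the arithmetic above concrete, here is a minimal Python sketch. It assumes the round figures quoted in this post (about 500,000 routes today, about 1,000 new routes a week, a 512,000-route limit) and perfectly steady growth, which the real table does not have. The function names are hypothetical and exist only for illustration.

```python
import math


def weeks_until_limit(current_routes, limit, growth_per_week):
    """Weeks of steady growth before the table reaches the limit."""
    headroom = limit - current_routes
    if headroom <= 0:
        return 0  # already at or past the limit
    return math.ceil(headroom / growth_per_week)


def pushed_over(current_routes, added_routes, limit):
    """True if a sudden batch of new routes exceeds the limit."""
    return current_routes + added_routes > limit


# Figures quoted in this post (approximate, not measured here).
print(weeks_until_limit(500_000, 512_000, 1_000))  # 12
print(pushed_over(500_000, 15_000, 512_000))       # True
```

The second call is the scenario that catches people off guard: steady growth gives a few months of warning, while a single large announcement gives none.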

The Cisco 6500 and 7600 router platforms are among the most widely deployed pieces of network hardware on the Internet. (At Internap, we are actively replacing them with more-scalable platforms and expect to finish in the next few quarters.) Older hardware platforms have the same limited-memory problem, but for the most part, those platforms were EOL’ed years ago by hardware vendors such as Cisco, Juniper and Brocade, so anyone still using them for full BGP tables is living dangerously against their vendor’s recommendation. However, the 6500/7600 are not EOL’ed and continue to be a core part of Cisco’s revenue stream, so this is a very real problem for a lot of companies.

Internap lands all of our upstream NSPs on newer-generation Cisco ASR1000 and Cisco ASR9000 platforms, which are built to scale to the much larger routing tables of the future, so we are not too worried. One of the reasons that we purchased these next-gen routers to land our upstream NSPs is our Managed Internet Route Optimizer™ (MIRO) technology. MIRO requires a full routing table from each NSP on the router, which uses up TCAM very quickly. Most enterprise/SMB companies out there are not landing multiple providers on a single router like we are doing; most of our markets have 10-12 providers spread across 3-4 cores, so we were forced to confront TCAM limitations some time ago.

On the 6500/7600 platforms, the previous-generation supervisor module (the “SUP2,” which was EOL’ed a few years ago) can only hold 512k routes total, so as that tipping point is reached, lots of companies are going to need emergency hardware upgrades, or they will have to take less than a full BGP table from their provider. Taking less than a full table from the upstream provider limits how granularly a company can control its routing and how much insight it has into what’s going on with the full Internet table, which is definitely a step backwards. Most companies will choose to upgrade the hardware instead, in my opinion.

The current-generation 6500/7600 supervisor modules (the “SUP720” module on the 6500 and the “RSP720” module on the 7600) that are widely deployed on millions of production chassis can hold 1,024,000 routes total. The default settings for memory allocation on those modules are 512k IPv4 routes and 256k IPv6 routes (since an IPv6 route takes up twice as much memory as an IPv4 route). While the supervisor modules can hold more than 512k IPv4 routes, a lot of companies are going to learn The Hard Way that they have not manually re-allocated the memory to accommodate the ever-growing routing table. You have to make a config change and reload the router entirely, which is painful to roll out across a global footprint, and you might not even know you need to do it.

At Internap, we have retuned our remaining 6500s for 800k IPv4 and 100k IPv6 routes, which should last us over the next 2-3 years while we phase out our Cisco 6500s and 7600s. We did this specifically to address routing table growth. Currently, we are auditing all of our MCPEs (Managed Customer Premise Equipment) since those are much smaller hardware platforms with less memory available, to make sure there are no issues.
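The allocation math from the last two paragraphs can be sketched in a few lines of Python. This is a simplified model, not Cisco's actual accounting: it assumes a budget of 1,024k IPv4-sized entries, counts each IPv6 route as two entries as described above, and ignores anything else that may share the table. The constant and function names are hypothetical.

```python
TCAM_CAPACITY_K = 1024  # budget in thousands of IPv4-sized entries (from this post)
IPV6_ENTRY_COST = 2     # an IPv6 route takes twice the space of an IPv4 route


def tcam_usage_k(ipv4_k, ipv6_k):
    """Entries consumed (in thousands) by a given IPv4/IPv6 allocation."""
    return ipv4_k + IPV6_ENTRY_COST * ipv6_k


def allocation_fits(ipv4_k, ipv6_k, capacity_k=TCAM_CAPACITY_K):
    """True if the allocation stays within the table's capacity."""
    return tcam_usage_k(ipv4_k, ipv6_k) <= capacity_k


# Default split: 512k IPv4 + 256k IPv6 uses the whole table.
print(tcam_usage_k(512, 256), allocation_fits(512, 256))  # 1024 True

# Retuned split: 800k IPv4 + 100k IPv6 leaves a little headroom.
print(tcam_usage_k(800, 100), allocation_fits(800, 100))  # 1000 True
```

The trade-off is visible in the numbers: every extra 2k of IPv4 room costs 1k of IPv6 room, so the retuned split buys IPv4 headroom by giving up most of the IPv6 allocation.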

Over the next few years, millions of chassis will hit their physical limits. But you can’t just upgrade the supervisor module to the latest and greatest to get a few more years of runway — the entire chassis has to be replaced. The next-gen supervisor module for Cisco’s 6500/7600 platform that started shipping last year (the “SUP2T”) has the exact same limit of 1,024,000 routing table entries, which means if you are using the 6500/7600 platform, you have to replace the whole chassis with a next-gen model like the Cisco ASR9000, Juniper MX, Brocade MLX, etc.

The only other option is to take a partial BGP table or a default-only route. This graph of BGP table growth should be very scary for someone running hardware with a 1M route limitation.

Compounding this problem, the American Registry for Internet Numbers (ARIN), the regional authority that hands out IP addresses for North America, continues to run out of IPv4 space. (Update: The IPv4 pool was officially depleted as of September 24th, 2015.)

Right now, the “BGP boundary” for a route in the global routing table is /24, meaning that the global routing table only carries routes for blocks of /24 or larger (prefix lengths of 24 bits or shorter), specifically to keep the size of the routing table down to accommodate hardware limitations. The purpose of this limit has been to control de-aggregation of the routing table, because the vast majority of hardware deployed today can’t really support the routing table blowing up any larger than it already is. However, ARIN, trying to squeeze as much lifespan out of its remaining IPv4 allocations as possible, has started giving out smaller and smaller blocks and asking providers to route smaller allocations. Just last week, ARIN conducted a test where they tried to route /27s in the global routing table to see which providers might or might not be able to route blocks smaller than the /24 boundary. That is further indication that ARIN and the network operator community want to continue to de-aggregate the remaining address pool and stave off IPv4 exhaustion for as long as they can, but this will be incredibly problematic for everyone because it balloons the routing table and brings hardware limitations to the forefront.

By squeezing as much life as possible out of the remaining IPv4 pool, network operators can delay the migration to IPv6. Routing IPv6 packets is well-supported within most hardware these days, but we find customers struggling with all of the ancillary things that have to happen: retraining their NOC, rebuilding their management, monitoring, and troubleshooting tools to speak both IPv4 and IPv6, developing IPv6 operational experience and so forth. Routing IPv6 packets is the easy part; all the other stuff that goes along with supporting IPv6 can scare off less-experienced customers. At that point, “just let the routing table get a little bigger” seems like an easy fix to avoid making wholesale migrations to IPv6 which might require new hardware, new tools and some operational struggles. Network operators will always put off large-scale technology leaps in favor of having more time to fight today’s fires, but that will not last forever.

So, back to ARIN. Splitting a /24 gets you two /25s, or four /26s, or eight /27s… Imagine if a plurality of companies out there took all their /24s and started de-aggregating down to the /27 level, causing an 8x increase in their portions of the routing table. That would be a nightmare for most everyone, and a financial windfall for hardware manufacturers.
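For anyone who wants to see the de-aggregation math spelled out, here is a small sketch using Python's standard `ipaddress` module. The 192.0.2.0/24 block is a reserved documentation prefix chosen for illustration, and the `deaggregate` helper is a hypothetical name, not something a router or registry provides.

```python
import ipaddress


def deaggregate(block, new_prefix):
    """Split one announced block into every subnet of the given prefix length."""
    return list(ipaddress.ip_network(block).subnets(new_prefix=new_prefix))


# One /24 route becomes 2, 4 or 8 routes as the prefix gets longer.
for new_prefix in (25, 26, 27):
    routes = deaggregate("192.0.2.0/24", new_prefix)
    print(f"/{new_prefix}: {len(routes)} routes")

# The eight /27s that would replace a single /24 in the table.
for route in deaggregate("192.0.2.0/24", 27):
    print(route)
```

Each extra bit of prefix length doubles the number of routes, which is where the 8x figure for /27s comes from.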

One of the most basic lessons from The Art of War is not to fight a war on two fronts simultaneously, but that is exactly what’s happening. On one hand, companies don’t want the headache of fully migrating to IPv6, so they’re encouraging ARIN to de-aggregate the routing table and squeeze as much IPv4 out of the remaining allocations as possible, which is inflating the routing table. On the other hand, massive wholesale hardware upgrades will be upon us in the near future, and companies must be ready to fight that battle when the time comes.
