What Is Leaf-Spine Architecture and How to Design It
May 23, 2017
For many years, data centers have been built on a three-tier architecture. But with data center consolidation, virtualization, and hyper-converged systems on the rise, a newer networking architecture, leaf-spine, has gradually become mainstream in data center network deployments. So how much do you know about leaf-spine architecture, and how do you build one? This article explains what leaf-spine architecture is and how to design it.
What Is Traditional Three-Tier Architecture?
Traditional three-tier architecture consists of three layers: core, aggregation/distribution, and access. The switching devices in each layer are interconnected by redundant pathways, which can create loops in the network.
In the past, the majority of data center traffic flowed between clients outside the data center and the servers inside it, which we consider “north to south” traffic. The three-tier architecture model is designed for this north-south traffic: a packet enters at the core, is routed to an aggregation layer switch, and is then forwarded to the access switch where the end devices are connected. As data centers have transformed, far more data now travels from server to server, or from server to storage systems, within the data center, which we consider “east to west” traffic. In a three-tier network this traffic may have to climb to the aggregation or core layer and come back down, so the number of hops increases, which raises the chance of packet loss and adds latency. If massive east-west traffic runs through this conventional architecture, devices connected to the same switch may contend for bandwidth, resulting in poor response times for end users. Thus, the three-tier architecture is not well suited to the modern virtualized data center, where compute and storage servers may be located anywhere within the facility.
What Is Leaf-Spine Architecture?
With three-tier designs gradually losing momentum in the modern data center, leaf-spine architecture is taking their place. As shown below, the leaf-spine design consists of only two layers: the leaf layer and the spine layer, which reduces the number of hops and makes latency lower and more predictable. This is the so-called “leaf-spine” architecture, where there are only two tiers of switches between the servers and the core network.
The spine layer is made up of switches that perform routing and act as the backbone of the network. The leaf layer consists of access switches that connect to endpoints such as servers and storage devices. In a leaf-spine architecture, every leaf switch is connected to every spine switch. With this design, any server can communicate with any other server, and traffic between any two leaf switches crosses no more than one spine switch.
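The full-mesh rule described above can be sketched in a few lines of Python. This is only an illustration, not a configuration tool: the switch names (leaf1, spine1, and so on) and the helper functions build_leaf_spine and paths_between are hypothetical, and the fabric size of four leaves and two spines is an assumed example.

```python
from itertools import product


def build_leaf_spine(num_leaves, num_spines):
    """Return the set of (leaf, spine) links in a full-mesh leaf-spine fabric.

    Switch names are hypothetical labels used only for this sketch.
    """
    leaves = [f"leaf{i}" for i in range(1, num_leaves + 1)]
    spines = [f"spine{j}" for j in range(1, num_spines + 1)]
    # Every leaf connects to every spine; there are no leaf-leaf or
    # spine-spine links.
    return set(product(leaves, spines))


def paths_between(links, src_leaf, dst_leaf):
    """List every leaf -> spine -> leaf path between two leaf switches."""
    spines_of_src = {spine for (leaf, spine) in links if leaf == src_leaf}
    spines_of_dst = {spine for (leaf, spine) in links if leaf == dst_leaf}
    shared = spines_of_src & spines_of_dst
    return sorted((src_leaf, spine, dst_leaf) for spine in shared)


if __name__ == "__main__":
    links = build_leaf_spine(num_leaves=4, num_spines=2)
    print(len(links))  # 8 links: 4 leaves x 2 spines
    for path in paths_between(links, "leaf1", "leaf3"):
        print(" -> ".join(path))
```

Each path returned has exactly one spine in the middle, and the number of paths between two leaves equals the number of spines, which is what gives the fabric its multiple equal-cost routes.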
Advantages of Leaf-Spine Architecture
Leaf-spine architecture has become a popular data center design, especially as data centers have grown in scale and traditional designs have required more switching tiers. The advantages of the leaf-spine model are lower latency, fewer bottlenecks, greater bandwidth, and better scalability.
Firstly, leaf-spine uses all interconnection links. In hyper-scale data centers, there might be hundreds or thousands of servers connected to a network. In this case, the leaf switch acts as a bridge between the servers and the core network. Each leaf connects to all spines, with no direct links among the spines themselves or among the leafs, which creates a large non-blocking fabric. In a three-tier network, by contrast, a server may need to traverse a hierarchical path through two aggregation switches and one core switch to communicate with another server, which adds latency and creates traffic bottlenecks.
Another advantage is the ease of adding hardware and capacity. Leaf-spine architectures can be built at either layer 2 or layer 3. In both cases, leaf switches can be added to increase port capacity, and spine switches can be added as needed for uplinks, which expands the interlayer bandwidth and reduces oversubscription.
How to Design Leaf-Spine Architecture?
Before designing a leaf-spine architecture, you need to work out several key factors: the oversubscription ratio, the leaf and spine scale, the uplinks from leaf to spine, and whether to build at layer 2 or layer 3.
Oversubscription Ratios: Oversubscription is the ratio of contention when all devices send traffic at the same time. It can be measured in the north/south direction (traffic entering or leaving the data center) as well as east/west (traffic between devices in the data center). Modern network designs have oversubscription ratios of 3:1 or less, measured as the ratio of downstream capacity (to servers/storage) to upstream bandwidth (to spine switches).
The figure below illustrates how to measure the oversubscription ratio between the leaf and spine layers. The leaf switch has 48× 10G ports, giving a total of 480Gb/s of downstream port capacity. If the 4× 40G uplink ports of each leaf switch are connected to the 40G spine switches, the leaf has a total of 160Gb/s of uplink capacity. The ratio is therefore 480:160, which reduces to 3:1.
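The same arithmetic can be written as a small Python helper, which is handy when comparing several leaf models. This is a minimal sketch: the function names oversubscription and format_ratio are made up for this example, and the port counts are the ones from the paragraph above.

```python
from fractions import Fraction


def oversubscription(down_ports, down_gbps, up_ports, up_gbps):
    """Return downstream capacity divided by upstream capacity for one leaf."""
    downstream = down_ports * down_gbps  # capacity toward servers/storage
    upstream = up_ports * up_gbps        # capacity toward the spine layer
    return Fraction(downstream, upstream)


def format_ratio(ratio):
    """Render a Fraction in the usual 'N:M' notation."""
    return f"{ratio.numerator}:{ratio.denominator}"


if __name__ == "__main__":
    # 48x 10G server ports and 4x 40G uplinks, as in the example above.
    ratio = oversubscription(down_ports=48, down_gbps=10, up_ports=4, up_gbps=40)
    print(format_ratio(ratio))  # 3:1
```

With the same 48× 10G downlinks, moving the four uplinks from 40G to 100G would give 480:400, which the helper reduces to 6:5.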
Leaf and Spine Scale: Because the endpoints in the network connect only to the leaf switches, the number of leaf switches depends on the number of interfaces required to connect all the endpoints, including multihomed endpoints. Because each leaf switch connects to all spines, the port density of the spine switch determines the maximum number of leaf switches in the topology. The number of spine switches is governed by a combination of the throughput required between the leaf switches, the number of redundant/ECMP (equal-cost multi-path) paths between the leafs, and the port density of the spine switches.
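As a rough sizing aid, these relationships can be expressed in code. The sketch below assumes the simplest case, where every leaf uses the same number of links to each spine and every spine port faces a leaf. The function fabric_scale and its parameter names are hypothetical, and the 32-port spine is an assumed figure chosen only for illustration; the 48 downlinks and 4 uplinks per leaf match the earlier oversubscription example.

```python
def fabric_scale(spine_ports, leaf_down_ports, leaf_uplinks, links_per_spine=1):
    """Estimate the size limits of a two-tier leaf-spine fabric.

    Assumes every leaf connects to every spine with `links_per_spine`
    links and that all spine ports are used for leaf connections.
    """
    if leaf_uplinks % links_per_spine != 0:
        raise ValueError("leaf_uplinks must be a multiple of links_per_spine")
    # Each leaf spreads its uplinks evenly across the spines.
    spines = leaf_uplinks // links_per_spine
    # Spine port density caps how many leaves can attach.
    max_leaves = spine_ports // links_per_spine
    return {
        "spines": spines,
        "max_leaves": max_leaves,
        "max_endpoint_ports": max_leaves * leaf_down_ports,
    }


if __name__ == "__main__":
    # Hypothetical 32-port spines; leaves with 48 downlinks and 4 uplinks.
    print(fabric_scale(spine_ports=32, leaf_down_ports=48, leaf_uplinks=4))
    # {'spines': 4, 'max_leaves': 32, 'max_endpoint_ports': 1536}
```

Doubling links_per_spine to 2 halves both the spine count and the maximum leaf count in this model, which shows how spine port density and the number of ECMP paths trade off against each other.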
40G/100G Uplinks from Leaf to Spine: For a leaf-spine network, the uplinks from leaf to spine are typically 40G or 100G and can migrate over time from a starting point of 40G (Nx 40G) to 100G (Nx 100G). Ideally, the uplinks operate at a faster speed than the downlinks, so that a micro-burst from a single host sending at line rate does not cause blocking.
Layer 2 or Layer 3: Two-tier leaf-spine networks can be built at either layer 2 (VLAN everywhere) or layer 3 (subnets). Layer 2 designs provide the most flexibility, allowing VLANs to span everywhere and MAC addresses to migrate anywhere. Layer 3 designs provide the fastest convergence times and the largest scale, with ECMP fan-out supporting up to 32 or more active spine switches.
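To give a feel for how ECMP spreads traffic across spines in a layer 3 design, here is a simplified Python sketch of per-flow hashing. It is an illustration only: real switches compute the hash in hardware with vendor-specific functions and fields, and the pick_spine function, the 5-tuple layout, and the use of SHA-256 are all assumptions made for this example.

```python
import hashlib


def pick_spine(flow, spines):
    """Choose a spine for a flow by hashing its 5-tuple.

    `flow` is (src_ip, dst_ip, protocol, src_port, dst_port). Hashing
    keeps every packet of one flow on the same path, which avoids
    reordering, while different flows spread across the spines.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(spines)
    return spines[index]


if __name__ == "__main__":
    spines = [f"spine{i}" for i in range(1, 5)]  # hypothetical spine names
    flows = [
        ("10.0.1.10", "10.0.2.20", "tcp", 40000 + n, 443) for n in range(8)
    ]
    for flow in flows:
        print(flow[3], "->", pick_spine(flow, spines))
```

Because the choice depends only on the flow's fields, the same flow always maps to the same spine, while a large number of flows tends to spread roughly evenly across all active spines.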