Figure 3
VLANs covering the 7-switch topology of Fig. 2. Topology notation used in the paper: 2-D HyperX(k) [9], where k is the number of switches in each dimension of a 2-D mesh; and CiscoDC, Cisco's recommended data center network [12], a three-layer tree with two core switches, with parameters (m, a), where m is the number of aggregation modules and a the number of access-switch pairs associated with each aggregation module.

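As a rough illustration of the parameterization above, the following is a sketch only: the caption does not give these formulas, and the per-module switch counts for CiscoDC are assumptions.

```python
# Hedged sketch of switch counts implied by the topology parameters above.
# Assumptions not stated in the caption: each CiscoDC aggregation module
# contains 2 aggregation switches, and each access "pair" is 2 switches.

def hyperx_2d_switches(k: int) -> int:
    """2-D HyperX(k): k switches in each of the two dimensions of a mesh."""
    return k * k

def ciscodc_switches(m: int, a: int) -> int:
    """CiscoDC(m, a): 2 core switches, m aggregation modules,
    and a access-switch pairs per module (see assumptions above)."""
    core = 2
    aggregation = 2 * m
    access = 2 * m * a
    return core + aggregation + access

if __name__ == "__main__":
    print(hyperx_2d_switches(4))   # 16 switches in a 4x4 HyperX
    print(ciscodc_switches(2, 4))  # 2 + 4 + 16 = 22 switches under these assumptions
```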

Source publication
Conference Paper
Full-text available
Operators of data centers want a scalable network fabric that supports high bisection bandwidth and host mobility, but which costs very little to purchase and administer. Ethernet almost solves the problem - it is cheap and supports high link bandwidths - but traditional Ethernet does not scale, because its spanning-tree topology forces traffic onto...

Context in source publication

Context 1
... a parallel algorithm, described in [25], based on graph-coloring heuristics, which yields speedup linear in the number of edge switches. Fig. 2 shows a relatively simple wiring topology with seven switches. One can think of this as a 1-level tree (with switch #1 as the root), augmented by adding three cross-connect links to each non-root switch. Fig. 3 shows how the heuristic greedy algorithm (Alg. 2) chooses seven VLANs to cover this topology. VLAN #1 is the original tree (and is used as the default spanning ...
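The greedy VLAN construction mentioned here (Alg. 2 in the paper) packs per-pair paths into loop-free VLANs. The sketch below is not the paper's algorithm, only a simplified illustration of the core check it relies on: a path joins an existing VLAN only if the union of edges stays acyclic, and otherwise opens a new VLAN. The function and variable names are illustrative.

```python
# Minimal illustration (not the paper's Alg. 2): greedily pack simple paths
# into VLANs so that every VLAN's edge set stays loop-free.

class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False          # adding this edge would close a cycle
        self.parent[ra] = rb
        return True

def pack_paths_into_vlans(paths):
    """paths: list of node sequences, e.g. [1, 4, 2] (assumed simple paths).
    Returns a list of VLANs, each a set of undirected edges forming a loop-free subgraph."""
    vlans = []                                   # list of (edge_set, UnionFind)
    for path in paths:
        edges = [tuple(sorted(e)) for e in zip(path, path[1:])]
        for edge_set, uf in vlans:
            # Tentatively add the path's new edges; commit only if no cycle appears.
            trial = UnionFind()
            trial.parent = dict(uf.parent)
            new_edges = [e for e in edges if e not in edge_set]
            if all(trial.union(u, v) for u, v in new_edges):
                edge_set.update(new_edges)
                uf.parent = trial.parent
                break
        else:
            # No existing VLAN can absorb the path without a loop: open a new one.
            uf = UnionFind()
            edge_set = set()
            for u, v in edges:
                if (u, v) not in edge_set:
                    uf.union(u, v)
                    edge_set.add((u, v))
            vlans.append((edge_set, uf))
    return [edge_set for edge_set, _ in vlans]
```

Using a union-find per VLAN keeps the acyclicity check cheap (near-constant amortized time per path edge), which matters when the number of per-pair paths is large.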

Similar publications

Article
Full-text available
The architectures of several data centers have been proposed as alternatives to the conventional three-layer one. Most of them employ commodity equipment for cost reduction. Thus, robustness to failures becomes even more important, because commodity equipment is more failure-prone. Each architecture has a different network topology design with a spec...
Article
Full-text available
Operators of data centers want a scalable network fabric that supports high bisection bandwidth and host mobility, but which costs very little to purchase and administer. Ethernet almost solves the problem – it is cheap and supports high link bandwidths – but traditional Ethernet does not scale, because its spanning-tree topology forces traffic ont...

Citations

... Similar to PortLand, MOOSE [6] also employs MAC addresses and hierarchical address rewriting to overcome the scalability limitations of Ethernet. Unlike PortLand, MOOSE is not limited to a specific topology, but it does not address traffic engineering [27,28]. ...
Article
Full-text available
Data centers are now the basis for many Internet and cloud computing services. Trends toward multi-core processors, end-host virtualization, and commodities of scale are pointing to future single-site data centers with millions of virtual end-hosts. The Ethernet/IP style layer 2 and layer 3 network protocols face a mixture of inherent limitations in supporting such large topologies: lack of scalability, difficulty of management, inflexibility in communication, and limited support for virtual machine migration. Although several large layer 2 network technologies have been proposed in recent years, they still have several weaknesses that impede their practical application, such as inflexibility, broadcast storms, poor scalability, and lack of interoperability with existing devices. Software defined networking (SDN) is an emerging and promising solution to the above problems due to its outstanding characteristics of control plane and data plane separation and centralized, flexible network management. However, the limited efficiency of the centralized SDN Controller and the large number of routing rules needed in switches are the two main scalability challenges faced when adopting existing SDN solutions in large data centers. Therefore, this paper proposes a novel SDN-based large layer 2 network fabric for data centers, SFabric, which addresses the two challenges by greatly reducing the interactions between the Controller and switches (computing and constructing the paths among switches in advance) and by decreasing the number of routing rules needed to tag and route packets at the switch level. A prototype is developed, and experimental results demonstrate the efficiency and scalability of the proposed method.
... For the latter, where the network capacity is under pressure (e.g., due to dynamic network load spikes), Dart falls back on DCQCN's throttling, which may be unavoidable. Load balancing [2,13,18,24,30,39,42,52] can alleviate localized in-network congestion but not receiver congestion, and usually reorders packets, which RDMA does not support. ...
... In contrast to the above schemes, all of which reorder packets (which RDMA does not support), IOFD is designed to deflect without packet reordering. SPAIN [39] and CONGA [2] also avoid packet reordering. However, SPAIN pre-computes multiple paths that are mapped to different VLANs, and such precomputation may be slow to react to short flows in a datacenter. ...
Preprint
Though Remote Direct Memory Access (RDMA) promises to reduce datacenter network latencies significantly compared to TCP (e.g., 10x), end-to-end congestion control in the presence of incasts is a challenge. Targeting the full generality of the congestion problem, previous schemes rely on slow, iterative convergence to the appropriate sending rates (e.g., TIMELY takes 50 RTTs). We leverage the result in several papers that most congestion in datacenter networks occurs at the receiver. Accordingly, we propose a divide-and-specialize approach, called Dart, which isolates the common case of receiver congestion and further subdivides the remaining in-network congestion into the simpler spatially-localized and the harder spatially-dispersed cases. For receiver congestion, Dart proposes direct apportioning of sending rates (DASR), in which a receiver for n senders directs each sender to cut its rate by a factor of n, converging in only one RTT. For the spatially-localized case, Dart employs deflection by adding novel switch hardware for in-order flow deflection (IOFD), because RDMA disallows packet reordering, providing a fast (under one RTT), lightweight response. For the uncommon spatially-dispersed case, Dart falls back to DCQCN. Small-scale testbed measurements and at-scale simulations show that Dart achieves, respectively, 60% (2.5x) and 79% (4.8x) lower 99th-percentile latency, and similar and 58% higher throughput, than RoCE and than TIMELY and DCQCN.
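The direct apportioning rule described in this abstract (a receiver with n active senders tells each to cut its rate by a factor of n) amounts to a one-step calculation. The sketch below is only an illustration of that rule, not Dart's implementation; the function name and rate units are assumptions.

```python
# Illustrative sketch of direct apportioning of sending rates (DASR):
# a receiver with n concurrent senders asks each one to divide its
# current rate by n, so the aggregate fits the receiver within one RTT.

def dasr_apportion(sender_rates_gbps):
    """Return the per-sender target rates after one DASR step."""
    n = len(sender_rates_gbps)
    if n <= 1:
        return dict(sender_rates_gbps)        # no incast, nothing to cut
    return {sender: rate / n for sender, rate in sender_rates_gbps.items()}

# Example: a 3-sender incast, each currently sending at line rate 100 Gbps.
print(dasr_apportion({"s1": 100.0, "s2": 100.0, "s3": 100.0}))
# {'s1': 33.33..., 's2': 33.33..., 's3': 33.33...}
```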
... The centralization of many services in datacenters, and the steep increase in traffic loads, have pushed research and industrial communities towards the development of innovative solutions for datacenter networking [1], [2], [3], [4]. ...
... For instance, a spanning tree is used at Layer 2; therefore, only one path is available at a time for a given pair of sender and receiver nodes. There are some proposals to support multipath at Layer 2, such as the one in [16], which proposes exploiting the redundant paths in the network using an algorithm that computes a set of available paths and combines them into a set of trees. ...
Article
Full-text available
This paper proposes a new load balancing algorithm for data center networks that exploits the characteristics of Software Defined Networks. Mininet was used to emulate and evaluate the proposed design, and Miniedit was used as a GUI tool for the same purpose. To obtain a realistic data center network environment, a Fat-Tree topology was used with the following parameters: 4 pods, 16 edge switches, 16 aggregation switches, 4 core switches, and 16 hosts. Different scenarios and traffic distributions were applied in order to cover as many cases of real traffic as possible. The POX controller was chosen as the SDN controller. The suggested design outperformed the traditional scheme in terms of throughput and loss rate for all evaluated scenarios. The first scenario assumes the joining of new hosts, while the second scenario increases the demand of already established connections. The proposed algorithm showed loss-free performance in the first scenario, whereas the traditional scheme showed a 15% to 31% loss rate for the same scenario. In the second scenario, the proposed algorithm recorded up to an 81% improvement in loss rate compared to the traditional scheme. Moreover, the proposed algorithm showed superiority over the traditional scheme in terms of throughput: it maintained the throughput without any reduction in the first scenario, in contrast to the traditional scheme, which suffered a considerable degradation (an average throughput reduction of 5 Mbps when new hosts joined). In the second scenario, both schemes suffered a throughput reduction; however, the proposed scheme always showed superiority over the traditional scheme, recording up to a 16.6% improvement in average throughput.
... Moreover, the need for traffic optimization is now common to several other types of networks besides traditional ISPs networks. This requirement is now common to intra data center [7,15,21] and inter data center networks [10,9]. ...
... SPAIN is a multi-path data center routing proposal [15] which also uses, for each pair of edge nodes, a set of paths computed independently of a traffic matrix. The algorithm takes as input a network graph, a pair of edge nodes (origin and destination), and the number k of paths to compute from the origin to the destination. ...
... That behaviour is due to the terminating condition of the SPAIN algorithm. In each step, a shortest path is computed and the cost of the path's edges is increased by a large constant [15], in order to prevent their use in the following iterations, until k different paths have been found or the new path has already been computed in a previous iteration. With a deterministic shortest-path algorithm (such as Dijkstra's), a path can be computed twice before all distinct shortest paths have been found, in which case the SPAIN algorithm stops prematurely and returns fewer than k paths. ...
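The path-computation loop described above (repeated shortest-path runs that penalize already-used edges, stopping after k distinct paths or a repeated path) can be sketched as follows. This is a reconstruction from the description in this citation context, not the authors' code; the penalty constant and function name are illustrative, and networkx's Dijkstra routine stands in for the deterministic shortest-path algorithm.

```python
# Sketch of the k-path loop described above: repeatedly run a deterministic
# shortest-path algorithm, then add a large constant to the cost of the edges
# of each returned path so later iterations avoid them. Stops after k distinct
# paths, or as soon as a path repeats (the early termination discussed above).
import networkx as nx

def spain_like_paths(graph: nx.Graph, src, dst, k: int, penalty: float = 1e6):
    # Assumes src and dst are connected; otherwise Dijkstra raises NetworkXNoPath.
    g = graph.copy()
    for u, v in g.edges():
        g[u][v].setdefault("weight", 1.0)
    paths = []
    while len(paths) < k:
        path = nx.dijkstra_path(g, src, dst, weight="weight")
        if path in paths:
            return paths            # repeated path: stop early with fewer than k paths
        paths.append(path)
        for u, v in zip(path, path[1:]):
            g[u][v]["weight"] += penalty
    return paths
```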
Article
To meet traffic engineering goals, current backbone networks use expensive and sophisticated equipment that runs distributed algorithms to implement dynamic multi-path routing (e.g., MPLS tunnels and dynamic trunk rerouting). We think that the same goals can be fulfilled using a simpler approach, where the core of the backbone only implements many a priori computed paths, and most adaptation to traffic engineering goals takes place at the edge of the network. In the vein of Software Defined Networking, edge adaptation should be driven by a logically centralized controller that leverages the available paths to adapt traffic load balancing to the current demands and network status. In this article we present two algorithms to help build this vision. The first one selects sets of paths able to support future load balancing needs and adaptation to network faults. As the total number of required paths is very large, and their continuous availability requires many FIB entries in core routers, we also present a second algorithm that aggregates these paths into a reduced number of trees. This second algorithm achieves better results than previously proposed algorithms for path aggregation. To conclude, we show that off-the-shelf equipment supporting simple protocols may be used to implement routing with these trees, which shows that simplicity in the core can be achieved using only trivially available protocols and their most common, unsophisticated implementations.
... Mudigonda et al. [28] introduce NetLord, a multi-tenant network architecture that encapsulates tenants' Layer-2 packets to provide full address-space virtualization. The forwarding mechanism uses Smart Path Assignment In Networks (SPAIN) [29] to distribute traffic among multiple paths. Servers run an online algorithm to test connectivity to the destination and randomly select a path on which to send each flow. ...
Article
Full-text available
Multipath forwarding has recently been proposed to improve utilization in data centers, leveraging their redundant network design. However, most multipath proposals require significant modifications to the tenants' network stack and are therefore only feasible in private clouds. In this paper, we propose the Two-Phase Multipath (TPM) forwarding scheme for public clouds. The proposal improves tenants' network throughput while keeping the tenants' network stack unmodified. Our scheme is composed of a smart offline configuration phase that discovers optimal disjoint paths, and a fast online path-selection phase that improves flow throughput at run time. A logically centralized manager uses a genetic algorithm to generate and install sets of paths, summarized into trees, during multipath configuration, and a local controller performs the multipath selection based on network usage. We analyze TPM for different workloads and topologies under several scenarios of usage-database locations and update policies. The results show that our proposal yields up to 77% throughput gains over previously proposed approaches.
... The tenant data between the egress and ingress switches are conveyed over the Layer 2 network using VLANs. To choose which VLAN to use, NetLord applies the SPAIN [62] selection algorithm. However, to support the SPAIN multipath technique and store per-tenant configuration information, NetLord uses Configuration Repositories, which are databases. ...
... In order to benefit from a high-bandwidth, resilient multipath fabric using Ethernet switches, NetLord relies on SPAIN [62]. As with the other solutions using a centralized controller, it might be necessary to have redundant configuration repositories, not only for availability but also to improve performance. ...
Article
Full-text available
The Infrastructure-as-a-Service (IaaS) model is one of the fastest growing opportunities for cloud-based service providers. It provides an environment that reduces operating and capital expenses while increasing agility and reliability of critical information systems. In this multitenancy environment, cloud-based service providers are challenged with providing a secure isolation service combining different vertical segments, such as financial or public services, while nevertheless meeting industry standards and legal compliance requirements within their data centers. In order to achieve this, new solutions are being designed and proposed to provide traffic isolation for a large number of tenants and their resulting traffic volumes. This paper highlights key challenges that cloud-based service providers might encounter while providing multi-tenant environments. It also succinctly describes some key solutions for providing simultaneous tenant and network isolation, as well as highlights their respective advantages and disadvantages. We begin with Generic Routing Encapsulation (GRE), introduced in 1994 in "RFC 1701", and conclude with today's latest solutions. We detail fifteen of the newest architectures and then compare their complexities, the overhead they induce, their VM migration abilities, their resilience, their scalability, and their multi-data-center capabilities. This paper is intended for, but not limited to, cloud-based service providers who want to deploy the most appropriate isolation solution for their needs, taking into consideration their existing network infrastructure. This survey provides details and comparisons of various proposals while also highlighting possible guidelines for future research on issues pertaining to the design of new network isolation architectures.
... Al-Fares et al. [4] and Niranjan Mysore et al. [5] enable routing without a link-state routing protocol by restricting the network topology to a fat-tree. Mudigonda et al. [6] use VLANs to provide multiple paths. However, these approaches cannot utilize autonomous and dynamic routing that quickly reflects link addition/deletion or a link failure. ...
Article
We have developed an automatic network configuration technology for flexible and robust network construction. In this paper, we propose a two-or-more-level hierarchical link-state routing protocol, the Hierarchical QoS Link Information Protocol (HQLIP). The hierarchical routing easily scales up the network by combining and stacking configured networks. HQLIP is designed not to recompute shortest-path trees from topology information, in order to achieve high-speed convergence of the forwarding information base (FIB), especially when renumbering occurs in the network. In addition, we propose a fixed-midfix renumbering (FMR) method. FMR enables even faster convergence when HQLIP is synchronized with Hierarchical/Automatic Number Allocation (HANA). Experiments demonstrate that HQLIP incorporating FMR achieves convergence within one second in a network with 22 switches and 800 server terminals, and is superior to Open Shortest Path First (OSPF) in terms of convergence time. This shows that a combination of HQLIP and HANA performs stable renumbering in link-state routing protocol networks.
... We measure the Path Quality by evaluating the shortest paths of each topology. The shortest path length is suitable for evaluating the quality of paths in the network, since it is the basis of novel routing mechanisms that can be used in DCs, such as TRILL [26], IEEE 802.1aq [27], and SPAIN [28]. Hence, we define the following metric: Average Shortest Path Length. ...
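The metric named in this context, Average Shortest Path Length, can be computed directly; the sketch below is only an illustration of the metric (hop-count distances averaged over reachable ordered pairs), not the cited paper's code.

```python
# Minimal sketch of the Average Shortest Path Length metric: the mean of the
# hop-count shortest-path distances over all reachable ordered node pairs.
import networkx as nx

def average_shortest_path_length(g: nx.Graph) -> float:
    total, pairs = 0, 0
    for src, dists in nx.all_pairs_shortest_path_length(g):
        for dst, d in dists.items():
            if src != dst:
                total += d
                pairs += 1
    return total / pairs

# Example on a tiny topology: a 4-switch ring.
print(average_shortest_path_length(nx.cycle_graph(4)))  # 1.333...
```

For connected graphs, networkx's built-in nx.average_shortest_path_length gives the same result.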
Article
Full-text available
The architectures of several data centers have been proposed as alternatives to the conventional three-layer one. Most of them employ commodity equipment for cost reduction. Thus, robustness to failures becomes even more important, because commodity equipment is more failure-prone. Each architecture has a different network topology design with a specific level of redundancy. In this work, we aim at analyzing the benefits of different data center topologies taking the reliability and survivability requirements into account. We consider the topologies of three alternative data center architectures: Fat-tree, BCube, and DCell. Also, we compare these topologies with a conventional three-layer data center topology. Our analysis is independent of specific equipment, traffic patterns, or network protocols, for the sake of generality. We derive closed-form formulas for the Mean Time To Failure of each topology. The results allow us to indicate the best topology for each failure scenario. In particular, we conclude that BCube is more robust to link failures than the other topologies, whereas DCell has the most robust topology when considering switch failures. Additionally, we show that all considered alternative topologies outperform a three-layer topology for both types of failures. We also determine to which extent the robustness of BCube and DCell is influenced by the number of network interfaces per server.
... Once the control plane has the topology information, it can implement a wide variety of traffic engineering policies to choose a routing path for each flow or even each packet. The policies range from distributed ones to centralized ones, as discussed extensively in the literature [6, 7, 8, 9]. We design new algorithms for Sourcey to perform topology discovery and monitoring. ...
... The major technical problem is topology discovery and monitoring. Once the control plane has an updated view of the topology, it is possible to choose routes for each flow or packet with different traffic engineering policies, as discussed in [6, 7, 8, 9]. In the remainder of this paper, we focus on how to implement topology discovery and topology monitoring using server-based mechanisms. ...
Conference Paper
We present Sourcey, a new data center network architecture with extremely simple switches. Sourcey switches have no CPUs, no software, no forwarding tables, no state, and require no switch configuration. Sourcey pushes all control plane functions to servers. A Sourcey switch supports only source-based routing. Each packet contains a path through the network. At each hop, a Sourcey switch pops the top label on the path stack and uses the label value as the switch output port number. The major technical challenge for Sourcey is to discover and monitor the network with server-only mechanisms. We design novel algorithms that use only end-to-end measurements to efficiently discover network topology and detect failures. Sourcey explores an extreme point in the design space. It advances the concept of software-defined networking by pushing almost all network functionality to servers and making switches much simpler than before, even simpler than OpenFlow switches. It is a thought experiment to show that it is possible to build a simple data center network and seeks to raise discussion in the community on whether or not current approaches to building data center networks warrant the complexity.
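The per-hop behaviour described in this abstract (pop the top label of the packet's path stack and use it as the output port) is small enough to sketch. The data structures below are assumptions for illustration, not Sourcey's actual packet format.

```python
# Illustrative sketch of source-based forwarding as described above: the sender
# pushes the whole path (a stack of output-port labels) onto the packet; each
# switch pops the top label and forwards the packet on that port.

from dataclasses import dataclass, field

@dataclass
class Packet:
    payload: bytes
    path_stack: list = field(default_factory=list)   # top of stack = last element

def switch_forward(packet: Packet) -> int:
    """One hop of forwarding: pop the top label and return it as the output port."""
    return packet.path_stack.pop()

# Example: a 3-hop path where the sender pre-computed ports [2, 5, 1]
# (the first hop's port sits on top of the stack).
pkt = Packet(payload=b"hello", path_stack=[1, 5, 2])
print(switch_forward(pkt))   # 2 -> first switch forwards on port 2
print(switch_forward(pkt))   # 5
print(switch_forward(pkt))   # 1
```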