Cisco IT is transforming its data centers with solutions that help to realize the company’s Data Center 3.0 vision, which employs a unified network fabric to connect servers and storage devices in a way that is resilient, scalable, and easy to manage. The transformation occurs in three stages; this deployment report focuses on the first stage.
Until now, Cisco has used a traditional Cisco Ethernet switching infrastructure and Fibre Channel switches at the distribution layer. Cisco IT is consolidating data center I/O from multiple 1 Gbps Ethernet connections and 4 Gbps Fibre Channel connections to a pair of 10 Gbps Ethernet connections through a lossless, high-performance, low-latency switching fabric.
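To make the consolidation concrete, the following is a minimal sketch of the per-server arithmetic; the adapter counts are illustrative assumptions, not figures from the Cisco IT deployment.

```python
# Hypothetical per-server I/O profile before and after unified I/O.
# Adapter counts are illustrative assumptions, not Cisco IT's actual numbers.

legacy_profile = {
    "1 Gbps Ethernet NIC": {"count": 4, "gbps": 1},       # assumed four GigE data links
    "4 Gbps Fibre Channel HBA": {"count": 2, "gbps": 4},  # assumed dual FC HBAs
}
unified_profile = {
    "10 Gbps Ethernet (unified I/O)": {"count": 2, "gbps": 10},  # one redundant pair
}

def summarize(profile):
    """Return (connection count, aggregate bandwidth in Gbps) for a server profile."""
    cables = sum(p["count"] for p in profile.values())
    bandwidth = sum(p["count"] * p["gbps"] for p in profile.values())
    return cables, bandwidth

for name, profile in (("Before", legacy_profile), ("After", unified_profile)):
    cables, bandwidth = summarize(profile)
    print(f"{name}: {cables} connections per server, {bandwidth} Gbps aggregate")
```

With these assumed counts, each server drops from six connections to two while its aggregate bandwidth increases, which is the essence of the consolidation described above.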
“We wanted to consolidate to 10 Gbps and also increase port density throughout the network’s core and distribution layers,” says Mauricio Arregoces, engineering manager at Cisco.
Business drivers for this change include:
To meet these business requirements, Cisco IT needed a new data center I/O architecture that would:
“Adding 10 Gbps connections is not new,” says Tom Settle, technical staff member, IT Network and Data Center Services Engineering. “What’s new is scaling port density at the distribution layer.”
When Cisco IT began planning for I/O consolidation in 2004, the only option was InfiniBand, a proprietary Layer 2 technology for connecting hosts with storage, SAN switches, and networking devices. InfiniBand, however, requires proprietary copper cables and highly scalable gateways, both of which increase capital and operational expense. In addition, Cisco IT prefers to improve operational efficiency by standardizing on Ethernet and IP standards-based technologies whenever they are available. These factors made Fibre Channel over Ethernet (FCoE) the preferred choice.
Before adopting FCoE, however, Cisco IT needed to ensure that faster I/O speeds would not result in dropped storage traffic. Dropped packets are not a major problem for network data traffic because the receiving node can either ask the sending node to retransmit or just ignore the missing data. But storage systems have less tolerance for dropped packets; therefore, Cisco needed an FCoE solution that would eliminate dropped packets, a “lossless” fabric.
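The difference in loss tolerance can be illustrated with a toy simulation: ordinary data traffic recovers from a dropped frame by retransmitting, while a storage exchange that assumes every frame arrives fails outright. This sketch is purely conceptual; it does not model the actual FCoE flow-control mechanism.

```python
import random

random.seed(7)     # deterministic run for the illustration
LOSS_RATE = 0.01   # assume 1 percent of frames are dropped under congestion

def data_traffic(frames):
    """Network data traffic: the receiver asks for any lost frame again."""
    sent = 0
    for _ in frames:
        sent += 1
        while random.random() < LOSS_RATE:  # frame dropped, retransmit it
            sent += 1
    return True, sent  # all frames eventually delivered

def storage_traffic(frames):
    """Storage traffic: a single dropped frame aborts the whole exchange."""
    for sent, _ in enumerate(frames, start=1):
        if random.random() < LOSS_RATE:
            return False, sent  # I/O fails; the exchange must be retried end to end
    return True, len(frames)

frames = range(1000)
print("data traffic (completed, frames sent):", data_traffic(frames))
print("storage traffic (completed, frames sent):", storage_traffic(frames))
```

Even a small drop rate almost guarantees that a long storage exchange fails, which is why the fabric itself must prevent loss rather than rely on retransmission.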
Cisco IT deployed Cisco Nexus 7000 and 5000 Series Switches as the platform for unified I/O. The Nexus 7000 Series meets Cisco’s needs for the distribution layer because of its scalability: up to 512 10 Gbps Ethernet ports and up to 15 Tbps of backplane capacity. A standards-based switch, the Cisco Nexus 7000 is built to support future 40 Gbps and 100 Gbps Ethernet. The Cisco Nexus 5000 Series meets Cisco’s needs for a top-of-rack access switch because of its unified I/O support, high port density, and low-latency (less than 3 microseconds) lossless fabric, which will improve application performance.
In September 2008, Cisco IT deployed the Nexus 7000 and 5000 Switches along with Cisco Catalyst 6500 Series Switches in a controlled production environment at its data center in Mountain View, California. The test environment includes the following components:
Cisco IT is deploying the Cisco Nexus 7000 and 5000 Switches in four steps:
The Nexus 7000 and Nexus 5000 Series Switches use the NX-OS operating system, which closely resembles the Cisco IOS Software. Cisco IT staff can configure and implement NX-OS using their existing skills. “Staff who have solid experience with the Cisco IOS Software or the MDS SAN-OS software learn how to configure the NX-OS in one to two hours,” says Ng.
Cisco IT plans to integrate management of the Cisco Nexus 7000 and 5000 Switches into existing system management environments using Cisco Data Center Network Manager.
In October 2008, Cisco IT certified the Nexus-based pods as ready for production. Five production business applications are operating in the Nexus pod environment, including News@Cisco, a financial system, and a database used by the Office of the Chairman and CEO.
Cisco IT anticipates the following results from the initial deployment.
I/O consolidation is the first step on the journey to a unified fabric, according to Norman. “This is a low-risk step because initially we are only using FCoE as an I/O access technology in a single rack, not across the whole data center fabric,” he says. “Our goal during this step is to expose the teams to FCoE so that we can fully understand the technology and its effects on the Cisco IT organization.”
The 18-slot Cisco Nexus 7000 Switch can scale up to 512 10 Gbps Ethernet ports. “Increased port density enables us to attach more access switches and gives us the bandwidth to scale aggregation at the switch layer,” says Settle.
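A rough fan-out calculation shows why that density matters. The number of uplinks per access switch below is an assumption for illustration, not the deployment’s actual design.

```python
# Hypothetical fan-out arithmetic for the distribution layer.
# The uplink count per access switch is an assumed value for illustration.

distribution_10g_ports = 512       # 18-slot Cisco Nexus 7000, per the text
uplinks_per_access_switch = 4      # assume 4 x 10 Gbps uplinks per top-of-rack switch

access_switches = distribution_10g_ports // uplinks_per_access_switch
aggregate_tbps = distribution_10g_ports * 10 / 1000

print(f"Access switches one distribution switch can aggregate: {access_switches}")
print(f"Aggregate 10 Gbps uplink capacity: {aggregate_tbps:.2f} Tbps")
```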
Cisco IT is calculating power savings at both the system level and data center level:
Cisco IT expects to significantly lower cabling costs by reducing the number of connections to each server, as well as the connections from racks to the distribution switches. “We can just lay fiber once and then never again have to worry about separate cabling for data and storage traffic,” says Sidney Morgan, manager, Cisco on Cisco IT.
When Cisco IT begins using the dual FCoE converged network adapters with Nexus 5000 deployments on a large scale, provisioning time for new servers will decrease substantially. “Today, provisioning a data center server requires opening multiple service requests, including racking the server, installing two HBAs [host bus adapters], and connecting two sets of cables,” says Ramachandra-Rao. “Adopting FCoE will improve our SLA [service-level agreement] because we’ll eliminate service requests for the second HBA and cable.”
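As a simple sketch of the provisioning change described above, the checklists below are hypothetical examples of the service requests involved, not Cisco IT’s actual workflow.

```python
# Hypothetical provisioning checklists; task names are illustrative assumptions only.

before_fcoe = [
    "rack the server",
    "install first HBA",
    "install second HBA",
    "connect data (Ethernet) cables",
    "connect storage (Fibre Channel) cables",
]

# With dual-port converged network adapters, the separate storage adapter and
# cabling requests collapse into the converged 10 Gbps connections.
after_fcoe = [
    "rack the server",
    "install dual-port converged network adapter",
    "connect one pair of converged 10 Gbps cables",
]

saved = len(before_fcoe) - len(after_fcoe)
print(f"Service requests per server: {len(before_fcoe)} -> {len(after_fcoe)} ({saved} fewer)")
```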
The Cisco Nexus 7000 Switch increases bandwidth capacity for servers and clients that communicate across access and aggregation layers. “We can expect faster response times because fewer IP packets will be dropped due to congestion,” says Ramachandra-Rao.
Previously, Cisco IT’s network and storage operations groups operated separately. Unified I/O and FCoE are helping them converge, increasing the efficiency of the Cisco IT organization. Norman likens the change to when Cisco converged its voice and data networks. “To adopt VoIP, we cross-trained our TDM voice engineers and networking engineers to collaborate,” he says. “Now, as we adopt FCoE, previously separate storage and networking skill sets will also converge.”
Cisco IT is already restructuring its server, storage, and networking teams according to the PDIO model: planning, design, implementation, and operations. The design team works on the end-to-end solution, including storage, server, and orchestration components. “We’ve begun establishing Role-Based Access Control procedures for our storage and networking teams, to avoid conflicts,” says Ng.
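As a rough, hypothetical sketch of what role-based separation between the storage and networking teams could look like on a converged switch, the example below models roles and the configuration areas they may touch. The role names and permissions are illustrative assumptions, not Cisco IT’s actual RBAC policy.

```python
# Hypothetical model of role-based access control on a converged data center switch.
# Role names and permitted configuration areas are illustrative assumptions only.

ROLE_PERMISSIONS = {
    "network-admin": {"vlan", "interface", "routing", "qos"},
    "storage-admin": {"vsan", "zoning", "fcoe"},
    "operator":      set(),   # read-only: may view, but change nothing
}

def can_configure(role: str, feature: str) -> bool:
    """Return True if the given role is allowed to change the given feature area."""
    return feature in ROLE_PERMISSIONS.get(role, set())

# The networking team can touch VLANs but not zoning; the storage team the reverse.
assert can_configure("network-admin", "vlan")
assert not can_configure("network-admin", "zoning")
assert can_configure("storage-admin", "zoning")
assert not can_configure("storage-admin", "vlan")
print("RBAC checks passed: each team is limited to its own configuration areas")
```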
Now that the Nexus-based pod has been validated for the production environments, Cisco IT will begin using the Nexus 5000 with dual FCoE converged network adapters on the servers to reduce costs for hardware, cabling, power, and cooling.
The next data centers to deploy Nexus pods will be the engineering and development data centers in San Jose, California; the new production data center in Richardson, Texas; and existing data centers in Research Triangle Park, North Carolina, and Boxborough, Massachusetts. Cisco IT will coordinate the pod deployments with Cisco’s Fleet upgrade program for refreshing network equipment at regularly scheduled intervals.