Table of Contents
Solution 1—Bulk Transfer of Database Information
Solution 2—Sysplex Complex for Large Number of Transactions
Solution 3—Replacement of Remote and Local FEPs
Solution 4—Conversion from Native SNA 3270 to TN3270 Sessions
Solution 5—Mainframe-to-Mainframe Bulk Data Transfer
White Paper
Customer Performance Testing of Cisco's Channel Interface Processor
Executive Overview
Performance testing is a critical part of the evaluation of any product. The value of any particular test is, however, dependent upon the quality of both the test plan and the implementation of that plan. Performance testing is most valuable when the plan is detailed and comprehensive and is implemented rigorously. Ensuring quality in both the plan and its implementation requires a thorough understanding of the product and the environment in which the product operates. To develop and implement a testing plan best designed to provide meaningful performance results, Cisco involved the users of its products and engaged them in a week-long testing effort. The results of that effort are set forth in this report.

At the request of Cisco's InterWorks Business Unit (IBU), a customer user acceptance team (UAT) was formed to evaluate the raw performance capabilities of Cisco's Channel Interface Processor (CIP). This team was tasked with identifying "real-world" applications for the CIP within a production environment and then devising relevant testing scenarios that would push the performance envelope of the CIP as it performed those functions. Three of Cisco's largest users of both routers and CIPs (MCI, EDS, and Citicorp) provided team members, all of whom were experienced with routers, mainframes, and performance testing. The team included the following people:
• Evan Escobedo, MCI
• Joe Berkeley, MCI
• Michel Lowe, Citicorp
• Steve Lynch, Citicorp
• Joe White, EDS
• Gerald King, EDS
The combined experience of the group provided expertise in WAN/LAN configuration, TCP/IP, SNA, Multiple Virtual Storage (MVS), TN3270, Cisco SNA (CSNA), and MultiPath Channel (MPC). The team members have installed and supported various mainframe access methodologies, including SNA gateways, TN3270 servers, CIPs, Interlink, IBM TCP, and front-end processors (FEPs) within their data centers. The team has a combined total of approximately 50 years of experience in data center technology and approximately 20 years in router technology.

The team agreed that this initial phase of testing would attempt to identify the raw performance capabilities of the CIP for the following functions:
• IP Datagram
• CSNA
• TN3270 Server
• Cisco MPC
The team members also agreed to compile subjective qualities related to the CIP, and to use their previous experience to gauge how the CIP would fall within the following categories:
• Installation
• Configuration
• Scalability
• Manageability
Summary of Test Results
Team members made the following comments regarding the test results:
• "Our role was first and foremost to ensure that the tests represented the types of business problems common in our respective data centers. We didn't want to find out how many angels could dance on the head of a pin. We wanted to take a hard-nosed look at real solutions to real problems."—Michel Lowe, Citicorp (michel.lowe@citicorp.com)
• "Raw throughput between the CIP and the mainframe is nice; but our real concern is how well a mainframe solution interacts with the rest of our network. Since our network is 99.99 percent Cisco, the CIP provides us with a `plug-and-play' opportunity. By using the CIP, we didn't have to modify any of our current routing, management, or access processes just to accommodate our mainframe solution."—Steven M. Lynch, Citicorp (steven.lynch@citicorp.com)
• "These tests conducted recently on Cisco's CIP technology have reaffirmed that the product is superior from a scaling, performance, and reliability standpoint. This is also why we chose to use Cisco's CIP product to migrate a large 50-plus FEP network to an all-APPN network in a very short time frame."—Joe Berkeley, MCI (joseph.w.berkeley@mci.com)
• "When you have a powerful network our size, it is critical to make sure the platform on which we build is the best. From a customer's perspective, these tests put an end to all uncertainty about how the CIP2 compares to other products in the industry."—Evan Escobedo, MCI (evan.escobedo@mci.com)
Table 1 presents a summary of the test results.
Conclusions
All the tests showed that the Cisco data center solutions are very scalable. Each test was performed with the purpose of saturating the CIP card. During these tests, the team observed that the main router CPU—that is, the Route Switch Processor (RSP)—was minimally utilized. This point is very important when considering a real-life solution, because the data center router must perform many tasks, not just the channel functions:
• IP routing—that is, Open Shortest Path First (OSPF), Enhanced Interior Gateway Routing Protocol (EIGRP), and so on
• Security (access lists, filtering, and so on)
• Data-link switching (DLSw) for SNA over IP
• Dependent Logical Unit Requester (DLUR) for Advanced Peer-to-Peer Networking (APPN) support
• Compression to optimize data transfer through limited bandwidth
In fact, the tests showed that multiple CIP processors could be installed in a single data center router and the RSP would still have ample capacity for nonchannel functions.
The CIP was also shown to be quite robust. Even under 100-percent load, the CIP was capable of maintaining all sessions during the tests. For example, during the SNA test, 6000 physical unit (PU) and 6000 logical unit (LU) sessions were supported and all PU and LU sessions were maintained. During the TN3270 server test, 16,000 TN3270 sessions were established, and every session was maintained—even under 100-percent CIP load.
The results presented in this report come from testing that consumed an entire week. To ensure that the tests would provide fair and reliable data, the UAT spent a great deal of time making sure that all bottlenecks were understood, including the tuning of the Virtual Telecommunications Access Method (VTAM), TCP/IP for MVS, and the application generation tools. The team is therefore extremely confident that the results obtained in this series of tests are a true reflection of real-world performance using the CIP, and that these tests are more accurate and meaningful than performance tests conducted without the benefit of this solid understanding of the product and the environment.
During the testing, MCI monitored the progress and gave significant input with respect to configuration of the IBM NetMarks application. After the testing was completed, MCI made available results achieved at its own site prior to the UAT testing. These results also show that the performance characteristics of the CIP far exceed those determined by the third-party test group. According to MCI's report, "The tests show that the (Cisco) 7513 delivers up to twice the SNA routing throughput (4K frames) and equivalent TCP/IP routing throughput of the (Cisco) 2216 router using a configuration that more closely matches the channel-attached router configuration used by IBM for recent benchmark testing between the (Cisco)7507 and the (Cisco) 2216 routers."
Special mention must be made about the TCP performance. During this test, the CIP was not fully saturated. In fact, it was only 30-percent utilized, as noted in the MCI report: "Note: the above TCP/IP numbers did not fully utilize the CIP card or the mainframe channel. In fact the CIP's utilization did not rise above 30 percent during the entire test period for these test runs. It is suspected that the client/server software used for the testing was not receiving adequate priority because it runs under a Time Sharing Option (TSO) address space on the MVS system."
If you would like a copy of the entire document, entitled "MCI Corporation CIP Benchmark Testing Results," please contact Joe Berkeley at joseph.w.berkeley@mci.com.
Application Solutions
This section applies the results of the UAT testing to typical, real-world applications and network scenarios—for example, backing up large amounts of data from UNIX servers to the mainframe via TCP/IP, converting thousands of SNA LUs to TN3270 sessions, and so forth.

Solution 1—Bulk Transfer of Database Information
Many companies have implemented a client/server environment by using the database applications available on distributed UNIX or NT servers. They also leverage the centralized nature of mainframes to perform backups of this distributed data. A typical example of an application used to perform this task is IBM's backup product, ADSTAR Distributed Storage Manager (ADSM). ADSM is used to store large amounts of data from distributed platforms on a central mainframe resource (either direct access storage device [DASD] or tape). Figure 1 shows the schematic for this scenario, in which four Sun servers are used for distributed database applications. Each night, the servers use ADSM to transfer 250 GB of data to the centralized Hitachi Data Systems (HDS) Skyline mainframe for backup during a three-hour window.

The UAT testing of the CIP in IP datagram mode determined that a single CIP processor can transfer 18.4 MBps across two Enterprise System Connection (ESCON) channels. Therefore, two CIP processors can transfer 36.8 MBps. In one hour, the data center router can transfer:
36.8 MB per second x 60 seconds per minute x 60 minutes per hour = approximately 133 GB per hour
Therefore, a Cisco 7507 with two CIP cards with dual ESCON interfaces (four ESCON channels) and two Asynchronous Transfer Mode (ATM) interface processors is capable of transferring 133 GB per hour. To determine the amount of time required to transfer the 250 GB of data in the bulk data transfer application example:
250 GB / 133 GB per hour = 1.88 hours, or approximately 113 minutes
As these calculations demonstrate, the Cisco data center router can support the required data transfer rate.
Figure 1 Bulk Data Transfer from Distributed UNIX Servers to Central Mainframe
Solution 2—Sysplex Complex for Large Number of Transactions
A very large Customer Information Control System (CICS)-based credit card authorization application processes more than 20 million transactions per hour. As shown in Figure 2, the network contains five 985-mips HDS Skyline mainframes in a Parallel Sysplex complex, and one data center router with two CIP processors and four ESCON channels (the second router serves as backup for the primary router).

UAT testing of CSNA determined that a single CIP processor can support 3200 transactions per second, or:

3200 transactions per second x 60 seconds per minute x 60 minutes per hour = 11.5 million transactions per hour
Therefore, across two CIP cards, the data center router can support 23 million CICS transactions per hour. The application requires 20 million CICS transactions per hour, so a single Cisco 7507 with two CIP processors and appropriate LAN/WAN interfaces can support the required transaction workload.
The second data center router shown in Figure 2 is configured with the same Media Access Control (MAC) address and can be used either to split the load with the primary router or as a hot standby.
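The shared MAC address is defined on each router's CIP internal LAN adapter. The following fragment is a minimal sketch under assumed values; the slot, ring numbers, and the MAC address are illustrative, not the tested configuration. Both routers present the same adapter MAC, so downstream SNA devices reach whichever router answers first:

! Identical fragment on the primary and the backup data center router
! (in practice, each router would typically use a different local ring number)
interface Channel1/2
 lan TokenRing 0
  source-bridge 100 1 500
  adapter 0 4000.0000.0001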
Solution 3—Replacement of Remote and Local FEPs
For many years, FEPs provided Synchronous Data Link Control (SDLC) connectivity for numerous companies. As the trend toward LAN-attached workstations accelerated, so did the need for LAN-attached gateways. The FEP accommodated this need well, but large SNA networks requiring both SNA and IP support soon outgrew its capacity.

This scenario presents a solution in which 10 FEPs supporting 30,000 LUs and approximately 3000 transactions per second (see Figure 3) are replaced by data center routers.
Figure 2 Sysplex Complex Scenario
Figure 3 Network Before FEP Replacement
The number of data center routers and CIP processors required to replace the FEPs can be calculated using the results of the UAT testing. These results showed that a single CIP was capable of supporting 16,000 LUs and switching approximately 3200 transactions per second. Therefore, two data center routers (one for backup), each with two CIP processors and four ESCON channels, can support up to 32,000 LUs per router and a total of 6400 transactions per second. This new configuration, shown in Figure 4, provides even higher performance than the original configuration using FEPs.

Figure 4 Network After FEP Replacement with Data Center Routers
Solution 4—Conversion from Native SNA 3270 to TN3270 Sessions
With the growing demand for TCP/IP-based communications and applications, many enterprises with large SNA networks are finding it increasingly necessary to provide mainframe access to IP clients. One solution is to replace the network's FEPs with data center routers running TN3270 Server and install an IP backbone.

In this scenario, shown in Figure 5, the network contains 70 FEPs, providing access to 500,000 LUs statically defined within the FEP or in VTAM. According to statistics gathered about the network, only 50 percent of the LUs are active at any given time.
Cisco's TN3270 Server offers a dynamic LU definition feature, which allows it to dynamically request LUs on demand. Because only 250,000 LUs are active in this scenario at any point in time, the TN3270 Servers will be configured to support up to 250,000 dynamically created LUs.
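As a rough illustration of how such dynamic definitions might be enabled, the sketch below combines the TN3270 Server's generic LU pool with an LU name seed on a PU. All names, addresses, and the idblk-idnum value are invented, and the exact subcommand syntax should be verified against the IOS release in use:

! Illustrative only: permit dynamic LUs from the generic pool,
! seeding generated LU names from a hypothetical mask
interface Channel1/2
 tn3270-server
  generic-pool permit
  pu PU1 05D00001 10.10.10.1 token-adapter 0 08 lu-seed LUPU1###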
As shown by the UAT testing, a single CIP can support 16,000 LUs and switch up to 855 transactions per second (TPS). However, to provide fail-safe redundancy, each CIP will be configured to support 8000 sessions. Therefore, to support 250,000 active LUs, the network will require 16 Cisco 7505 routers with a total of 32 CIP processors (250,000 / 8000 = 31.25, rounded up to 32 CIPs), as shown in Figure 6. If a failure occurs in any one router, the remaining routers will be able to support its share of sessions. In fact, almost half the routers could fail and the remaining routers would still support the total number of sessions.
Figure 5 Large SNA Network
Figure 6 SNA 3270 Migration to TN3270 Sessions
Solution 5—Mainframe-to-Mainframe Bulk Data Transfer
This scenario provides an application solution example that uses Cisco's High Performance Routing (HPR) over the MPC feature of the CIP, taking advantage of its bulk data throughput capability.

Companies with multiple data center sites require high-speed data transfer between the data centers, primarily for disaster recovery reasons. The application used to transfer this data could be IBM's NetView Distribution Manager (NDM) or XCOM. This scenario requires that the solution be capable of transferring 350 GB of data within six hours.
As shown by the UAT testing, a single CIP can support data transfer in excess of 10.85 MBps. This rate equates to:
10.85 MBps x 60 seconds per minute x 60 minutes per hour = 39 GB per hour
Therefore, two CIPs can provide a total of 78 GB per hour.
If we assume that the applications are efficient enough and that high-speed DASD is used, then bulk data transfer requirements can be addressed using two Cisco 7500 series routers, each with a CIP, two ESCON channels, and one ATM interface card, as shown in Figure 7.
To calculate the amount of time required for this router to transfer 350 GB of data:
350 GB of data / 78 GB per hour = 4.48 hours (a number that is well within the allotted six hours)
Figure 7 Mainframe-to-Mainframe Bulk Transfer
Test Plan—IP Datagram
The previous section discussed the testing results and highlighted areas in which these results show an impact on various networks. This section discusses in detail the test plan and the results of each test.

Test Overview
The IP Datagram support uses the Common Link Access for Workstations (CLAW) protocol to transmit IP frames across the channel to the mainframe (a minimal CLAW configuration sketch follows the list below). The team identified several factors that could affect the overall throughput performance of any channel-attached device:
• Average packet size and maximum transmission units (MTUs)
• Application window size (or acknowledgments)
• Input/output (I/O) requirements
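Because CLAW definitions tie the router interface to the host TCP/IP stack, it helps to see where they live. The following is a minimal sketch, not the tested configuration; the slot, path/device values, IP addresses, and CLAW names (MVSHOST, CIPROUTE) are assumptions and must match the DEVICE and LINK statements in the host's TCP/IP profile:

! CLAW definition on the CIP channel interface (all values illustrative)
interface Channel1/0
 ip address 192.168.10.2 255.255.255.0
 ! claw path device host-ip host-name device-name host-app device-app
 claw 0100 00 192.168.10.1 MVSHOST CIPROUTE TCPIP TCPIP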
The team's first objective was to find the throughput and processor limit of the CIP. To achieve this, the team set out to process as much data through a single CIP ESCON connection as possible, without concern for any system responses or acknowledgments. The ping function of the RSP4 was selected as the traffic generator and was configured to send one million 4-KB pings to the IP address of the mainframe without waiting for a response. A 4-KB packet size was selected because it is the maximum packet size from a Token Ring, and the team wanted to move as much data as possible with as little inter-packet delay as possible. A single ping process was unable to saturate the CIP or the ESCON channel, and the team had to start a second ping process to saturate the ESCON channel. Thus the channel limitation, not the performance of the CIP, was the limiting factor. The CIP was able to pass 12.3 MBps across the channel while using less than 40 percent of the capacity of the CIP.
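For reference, this kind of no-response flood can be driven from the router's extended ping dialog. The values below are assumptions for illustration (the target address is hypothetical); a timeout of 0 tells the router not to wait for replies:

Router# ping
Protocol [ip]:
Target IP address: 192.168.10.1
Repeat count [5]: 1000000
Datagram size [100]: 4096
Timeout in seconds [2]: 0
Extended commands [n]: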
The next two tests were designed to reflect reality, where applications define their own packet size, require acknowledgments, and perform I/O requests. Testing introduced the overhead associated with a typical TCP application across a routed network. In the first test, File Transfer Protocol (FTP) transfers were run from two Fast Ethernet-connected NT workstations, one Fiber Distributed Data Interface (FDDI)-connected UNIX workstation, and one FDDI-connected Windows workstation, which together acted as the traffic generators, writing to a null device on a CIP/ESCON-connected mainframe. All the workstation LANs were connected to the channel-attached Cisco 7513 router. This configuration was designed to test the interaction of the IP Datagram features of the CIP with the routing capabilities of the RSP4. The team was unable to saturate either the CIP or the ESCON channel by running multiple 100-MB FTPs from each of the workstations alone. To fully saturate the ESCON channel, the team had to start a single copy of the ping test along with the multiple FTPs.
The second real-world test was to pass TCP traffic between two mainframe logical partitions (LPARs), using both ESCON connections on a single CIP card. This test was designed to simulate bulk data transfers without the added overhead of any WAN/LAN connections. The team realized that in most enterprise applications the transfer would probably rely on WAN or LAN interconnections; however, for the purpose of identifying raw throughput, it was agreed that those factors would only add to the complexity of the testing procedures, so they were held constant. Furthermore, the team wanted to determine how much application data could be transferred through the CIP. An MVS-based TCP streaming application used by EDS served as the traffic generator and was configured to provide 12 client/server connections between the two LPARs on a single HDS 280-mips mainframe. With this test, the team was able to saturate both of the ESCON channels of the CIP before its processor limit was reached. In addition, the test revealed a drop in total throughput per ESCON channel compared to the ping test: an average rate of 9.2 MBps across each of the ESCON channels, versus the 12.3 MBps that the ping test generated. This result was anticipated, however, because the test added the overhead of the application back into the performance equation.
Knowing the results of these three tests, the team agreed that the CIP IP Datagram solution was by far the best-performing IP implementation the team members had used to date. Although the tests were not able to fully saturate either the CIP or the Cisco 7513, they did verify the effects of the previously defined limiting factors. Watching the throughput values of each of the test components, the team observed that throughput increased with larger packets, larger window sizes, and faster I/O devices, as expected.
Interestingly, in every test the limiting factor was the ESCON channel. This fact was clearly demonstrated by the third test, in which the CIP saturated both ESCON channels yet still maintained less than 45-percent CPU utilization. This test also demonstrated the effect of application overhead on total throughput. The drop in total data transferred was directly related to the server sending a large data packet and the client responding with a small acknowledgment packet. Each small acknowledgment sent incurred the same amount of system (that is, ESCON) overhead as a 4-KB packet.
The tests also showed that the performance of the CIP did not have a one-to-one effect on the overall performance of the Cisco 7513. Because the CIP has its own processor, it can run at higher than 40-percent utilization without causing the RSP4 utilization to increase by more than 3 to 4 percent.
The team was impressed by the ease of installation and use of the CIP when configured for IP Datagram; the team experienced no Cisco-related problems in building the testbed environments. In fact, the team agreed that a Cisco 7513 router with multiple CIPs is an IP platform whose capacity far exceeds what anyone would want to put through a single device.
Test Goal
The goal of the IP Datagram test was to determine the maximum amount of data throughput in MBps that a CIP can attain on a single ESCON port and on dual ESCON ports.

Test Setup
In order to produce the amount of traffic required to saturate a CIP, the test required various traffic-generating tools. Three methods were employed: using multiple workstations, using a simple ping (Internet Control Message Protocol [ICMP]) with no response, and using a host application that provided TCP server and client instances.

Figure 8 shows the basic setup for the mainframe-based TCP streaming application that was used to saturate the CIP. The test implemented the server component on LPAR1 and the client component on LPAR2, while the router passed multiple streams of TCP traffic from one channel to the other. Using this technique, with multiple server/client instances, the team was able to saturate two ESCON channels.
Figure 8 IP Datagram Testbed Layout for Mainframe-to-Mainframe TCP Streaming Test
Using the same setup, the team ran multiple ping tests, with two ping processes originating from the RSP4 to the mainframe and configured for no response. This configuration enabled the RSP4 to saturate the channel (in one direction) with 4-KB pings. This test demonstrated the raw performance characteristics of the CIP, without being limited by higher-layer applications.

Figure 9 shows the setup for the PC FTP test. As part of the performance tests, the team had four workstations capable of sending FTP traffic to the mainframe.
Figure 9 IP Datagram Testbed Layout for Client-to-Mainframe Test using Chariot
Test Procedure
The most difficult part of performance testing is understanding which component is the bottleneck. When testing the raw throughput of a switch or router, the result is read from the counter on the output LAN interface. This result is achieved by sending IP packets in one LAN interface and out the other. It is important to ensure that the traffic generator can, in fact, transmit enough packets, and that it can count the arriving packets fast enough. The rest of the task is understanding and configuring the device under test (that is, the switch or router).

Channel-attached router testing is very different. Many components need to be understood and analyzed:
• ESCON channel
• Mainframe mips
• MVS TCP profile configuration
• Traffic generation application (FTP or other)
• Device under test
• MTU sizes
• Maximum Segment Size (MSS)
• Client TCP configuration
• Client adapter card performance
• Segmentation
For this reason, the team ran different tests to show the impact of various testing methods.
Ping Test
The ping test was performed to overcome any application limitations. For example, various implementations of TCP and file transfer applications use different window and frame sizes. By using 4-KB pings with no response, the team tried to saturate the ESCON channel in one direction to see whether the CIP CPU and ESCON adapter could cope with the amount of traffic offered. Ping was the only method available for opening a large enough window and obtaining a sufficient packet-to-acknowledgment ratio; no application could meet these testing requirements.

TCP Streaming Application
To obtain a more realistic application-limited performance result, the team also tested the CIP performance using an MVS-based TCP streaming application supplied by one of the team members. The test application closely approximated TCP file transfer software typically used by enterprise customers for bulk data transfers.

In this test, the team configured various combinations of server and client instances across two MVS LPARs. This configuration resulted in multiple TCP streams of traffic being sent across the CIP. This traffic pattern is more realistic than the ping test because the application requires acknowledgments and, therefore, introduces some performance overhead. This effect can be seen when the single-channel performance results for the ping test are compared with those for the TCP streaming application.
Test Results
As an analogy for the ping test, think of a car that can travel forever at unlimited speed with no concern for cornering, braking, traffic, or fuel. In this test, the channel was transmitting packets as fast as possible with no regard for acknowledgments. The test was considered complete when the ESCON channel utilization reached 100 percent.

Figures 10-13, 16-20, and 22-24 show both CPU usage and the transfer rate for the various test conditions to illustrate the impact of the testing method. As shown in Figure 10, the maximum performance obtained was 12.383 MBps.
Figure 10 CIP IP Datagram Ping Test for One CIP
While transferring 12.383 MBps, the CIP CPU was only 31-percent utilized, indicating that it is certainly capable of sending 12.383 MBps across each ESCON channel, for a total throughput of 24.766 MBps.
Figure 11 shows the performance obtained when running 12 server and 12 client TCP streams from one LPAR to another LPAR on an HDS 280-mips mainframe. When this application overhead was introduced, the CIP was capable of transferring 18.4 MBps. This result represents a 25.7-percent decrease in performance due to application overhead (acknowledgments and so on), compared to the ping test at 24.766 MBps across two channels.
Note:
The decrease in performance was calculated as follows:
(24.766 MBps - 18.4 MBps) / 24.766 MBps = 25.7 percent

When testing performance across a channel-attached router, be aware of your application. The application can limit the total throughput by more than 25 percent if segmentation occurs and small window sizes are used.
Figure 11 IP Datagram Throughput Across Two ESCON Channels
The preceding tests evaluated the performance characteristics of the Cisco CIP with and without application overhead. What about the scalability of the total data center router? What was the impact on the main CPU of the router—in this case, an RSP4 on the Cisco 7513?

As shown in Figure 12, only 3 percent of the capacity of the main router CPU was used. Another performance metric to consider is the amount of bus bandwidth that the 18.4 MBps of traffic consumes. A Cisco 7513 router has two 1-Gbps buses, one to the left of the RSP4 and one to the right of the RSP4. In this case, 3 percent of the capacity of the RSP4 was used, and about 14.7 percent of the capacity of one of the buses was used.
Note:
Percentage use of the 1-Gbps bus was calculated as follows:
18.4 MBps x 8 bits per byte = 147.2 Mbps; 147.2 Mbps / 1 Gbps = 0.147, or 14.7 percent

Figure 12 IP Datagram Data Center Router Scalability
Conclusion
Testing in this environment is a very difficult task because many variables can indirectly affect the performance of the device. To ensure an adequate testing environment and to be confident of achieving correct results, the areas to watch include configuring applications and test tools, setting frame sizes, understanding protocol flows, and configuring and tuning the devices under test. Performance tests that do not properly take these variables into account will produce results that are correspondingly limited or flawed.

The results of the UAT's IP Datagram tests show that for TCP traffic, a single CIP with two ESCON channels can support 18.4 MBps of traffic throughput. In fact, with no application overhead, a single CIP with two ESCON channels can support 24.77 MBps.
As demonstrated by these results, the Cisco 7513 is far from being saturated. In fact, only a fraction of its power is used, leaving a significant resource for other router functions.
Extrapolating from the results, we can conclude that a single Cisco 7513 can support six fully loaded CIP interfaces, and five additional LAN/WAN interfaces (see Figure 13).
Figure 13 IP Datagram Scalability with Six CIP Cards
Test Plan—CSNA
Test Overview
The team was asked to develop a series of test scenarios that would provide baseline performance values for a CIP implemented in a production SNA environment. The team agreed that it would look at not only the raw data throughput numbers but also the total number of sessions (PUs/LUs) that the CIP is capable of supporting. The team also decided that it would observe the relationship between the raw data throughput and the number of active sessions, setting an upper-limit goal for the number of sessions to be tested. Performance limiting factors for this test included:
• Application delay
• PU-to-LU ratio
• Session generation and a consistent packets-per-second (pps) rate
The team decided that it would first validate Cisco's claim of supporting 6000 PUs with a test that would connect and maintain 6000 active PUs, each supporting one LU. A testbed was configured with 32 Wandel and Goltermann (W&G) Dominos, split over two 16-Mbps Token Rings. The Dominos were selected because each can support up to 500 sessions and generate approximately 500 frames per second, providing both a means of generating the required number of sessions and a known performance baseline. To overcome the application delays, the HDS mainframe was configured with an ITPECHO application that would support 600 PUs; the testbed therefore required 10 copies of the ITPECHO application running, providing a fixed amount of application overhead. The goal was to keep the built-in overhead related to the mainframe and session generators consistent throughout the testing procedure.
Because the CIP must allocate memory (from the memory pool of the CIP itself and not from the main memory of the Cisco 7513) for each Logical Link Control, type 2 (LLC2) session (PU) that it will simultaneously support, the Cisco default maximum for LLC2 sessions is 256, and this value had to be raised for the test. With the CIP properly configured and 12 Dominos generating traffic, the team was able to establish 6000 PUs, supporting a total of 6000 LUs and switching 6200 pps. The CPU of the CIP was running at 83-percent utilization, while the RSP4 in the Cisco 7513 was at 6 percent, demonstrating that the Cisco 7513 was more than capable of routing the LAN traffic from the two Token Rings to the CIP, even though the CIP was at its performance limit.
Test Goal
This test had four goals:
• To verify whether the CIP can support the documented 6000 PUs
• To examine the effect of a large number of LUs on the CIP performance
• To quantify the total throughput of a CIP when running CSNA for bulk traffic
• To determine the scalability of the data center router when implementing CSNA
Test Setup
To test a large number of PUs and LUs, W&G's Domino SNAGEN was used to simulate the PUs and LUs, and ITPECHO was run on the HDS mainframe. The Dominos were configured with a transaction profile, and ITPECHO responded to each request. Each Domino is capable of supporting 500 sessions and transmitting 500 frames per second.

Figure 14 shows the setup for the PU/LU interactive test. The HDS mainframe was an eight-way 280-mips processor. To distribute the ITPECHO load on the mainframe, the team used 10 ITPECHO applications rather than relying on one copy of ITPECHO to support the large number of PUs and LUs in the interactive tests.
Figure 14 CIP CSNA Interactive Testbed Layout
Figure 15 shows the setup for the throughput tests. In these tests, the team used the NetMarks application between two LPARs on the HDS mainframe to provide the necessary traffic load to saturate the CIP.
The NetMarks application provides a server and client component and can be configured to send any traffic pattern required. In these tests, the team used multiple instances of the server and client to achieve the desired traffic load. As far as the CIP cards were concerned, the traffic could have been originating from real remote PUs rather than between two LPARs. The advantage of this testing technique was that it leveraged the processing capacity of the large HDS mainframe, rather than requiring many remote PCs to act as remote PUs.
Test Procedure
To support 6000 PUs through a single CIP, two External Communications Adapter (XCA) major nodes were used. Each XCA supported 3000 PUs. For the 6000-PU test, only one LU per PU was configured in the switched major nodes.

Each W&G Domino was configured with 500 LUs and 500 PUs; therefore, 12 Dominos were required for this test. For the 16,000-LU test, each Domino supported 500 LUs across 100 PUs; therefore, 32 Dominos were required.
Figure 15 CIP CSNA Bulk Traffic Testbed Layout
The devices were configured as follows:

VTAM XCA Major Node Definition

CSNA Router Configuration

Switched Major Node (Partial)
Note:
Because of the size of the file and its repetition, the remainder is omitted for space. If you would like a copy of the entire file, please contact a member of the UAT.
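Although the actual definitions are omitted here, the one-LU-per-PU pattern in a VTAM switched major node is conventional. The following sketch is illustrative only, not the omitted test file; the node, PU, and LU names and the IDNUM values are invented:

SWND0001 VBUILD TYPE=SWNET
SWPU0001 PU    ADDR=01,IDBLK=05D,IDNUM=00001,PUTYPE=2
SWLU0001 LU    LOCADDR=2
SWPU0002 PU    ADDR=01,IDBLK=05D,IDNUM=00002,PUTYPE=2
SWLU0002 LU    LOCADDR=2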
Test Results
The goal of the test was to determine whether a single Cisco CIP card can support 6000 PUs, as documented. The following router command is used to set the maximum number of PUs to 6000:

max-llc2-sessions 6000
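For context, the following minimal sketch shows where this command might sit in a CSNA configuration; the slot, path/device values, ring numbers, and adapter MAC address are assumptions, not the tested values:

! CSNA on the physical channel interface; LLC2 settings and the internal
! Token Ring adapter on the CIP's virtual channel interface (illustrative)
source-bridge ring-group 500
!
interface Channel1/0
 csna 0100 00
!
interface Channel1/2
 max-llc2-sessions 6000
 lan TokenRing 0
  source-bridge 100 1 500
  adapter 0 4000.7513.0001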
The test results confirmed that a single CIP can support 6000 PUs and can switch 6200 pps, as shown in Figure 16. The transaction profile used was 100 bytes in, 800 bytes out.
Figure 16 CIP CSNA 6000 PU Test Result
It is important to note that the RSP was only 6-percent utilized, showing that multiple CIPs can be installed in a data center router and the RSP capacity will not be a restriction. This capability demonstrates the scalability of the Cisco CIP solution. For example, a failure may require that a single data center router support 15,000 PUs. This level of support can be achieved using a single Cisco 7507 router with three CIP cards.
Figure 17 CIP CSNA Scalability Result
Figure 17 illustrates a scenario in which a single router supports an extremely large number of sessions. This solution might not be desirable in a production network, but it certainly would be valid in case of a failure. In other words, multiple data center routers can be used for regular production traffic, but if a failure occurs, the Cisco CIP solution is capable of supporting the extra sessions while the original failure is identified and resolved.
The scalability of the CIP is also important in a multiprotocol scenario. Support for IP Datagram and CSNA in the same data center router is a common requirement. The Cisco data center router solution has the flexibility to provide this support in two ways:
• Mixed protocols on a single CIP
• Mixed protocols on the router, with single protocols on each CIP
Figure 18 Data Center Router Implementing IP Datagram and CSNA
The maximum number of PUs is one measurement of the capacity of a CIP. Another measurement is the number of LUs that a CIP can support; what matters about the LU count is the amount of traffic it generates. The number of PUs is limited to 6000 because the CIP must allocate memory for each LLC2 session (that is, for each PU). However, any number of LUs can be connected to each PU, and the LU count itself is irrelevant to the CIP. The real value to consider is the combined packets per second that all LUs place on the CIP.
As shown in Figure 19, the CIP was switching 6322 pps, and in this particular test, 16,000 LUs were generating this load (note that the RSP4 was only 7-percent utilized).
Figure 19 CIP CSNA 16,000 LU Test Result
Backing up large quantities of data from servers to a mainframe is typically viewed as "bulky" traffic (that is, large quantities of data that often constitute an entire subset of an application database). The previous CSNA tests dealt specifically with interactive traffic. In this test, the team tried to determine the maximum throughput that a CIP can attain when transferring bulky SNA traffic. It should be noted that this test was not completed because of time restrictions, and the results of the revised test will be published later. However, at 7.4 MBps, the CIP CPU was only 50-percent utilized (see Figure 20).
Figure 20 CIP CSNA Bulk Traffic Test Result
Conclusion
The CSNA testing has shown that a single CIP can support the advertised number of PUs—6000. Across these 6000 PUs, the CIP can support approximately 6200 pps. If the transaction profile is one packet in and one packet out (100 bytes in and 800 bytes out), these results equate to approximately 3100 transactions per second.

For bulky traffic, it was shown that at 50-percent CIP usage, the CIP forwarded 7.4 MBps of traffic across two ESCON channels.
Test Plan—TN3270 Server
Test Overview
The team was keenly interested in TN3270 server functions. Each company represented on the team has a customer support organization that is required to provide quick response to customer inquiries, and each team member is responsible for supporting multiple call centers whose traffic traverses a large wide-area network before reaching the mainframe applications. With that in mind, the team decided to focus its testing on the number of sessions and transactions per second that the CIP could provide. As before, the team first identified the limiting factors for this environment and then proceeded to test its theories. As with IP Datagram, the team believed that the following factors could affect overall throughput, with the addition of the TN3270 server itself and the access router that the TN3270 data must pass through:
• Average packet size and MTUs
• Application window size (or acknowledgments)
• I/O requirements
• TN3270 server
• Data center router
The team decided that its goal would be to generate enough TN3270 traffic for one of the measurable devices to reach 100-percent utilization. To achieve this goal, several tests were performed—one to verify the TN3270 server functionality of the CIP at 1000 LUs and another to verify functionality at 16,000 LUs. The team would then repeat the 16,000-LU test but with two CIPs running TN3270 server functions, for a total of 32,000 LUs within a single Cisco 7513 router. This final configuration would test both the scalability of the CIP and the capabilities of the Cisco 7513 to support multiple TN3270 servers.
The next task was to generate 16,000 LU sessions with a known traffic profile. The team acquired a TN3270 traffic generator capable of generating 16,000 sessions with a message size of 100 bytes in and 800 bytes out and a transaction rate of approximately 850 transactions per second. The traffic generator communicated with ITPECHO, which echoed the traffic back to the TN3270 generator. The team agreed that it would allow all the sessions to be established before taking any measurements to avoid any session startup and VTAM overhead that could skew the throughput numbers before a steady state was reached.
The first test examined peak performance of the TN3270 server in a common data center situation. The testbed included 1000 LU sessions connected via a WAN that required TN3270 access to the mainframe. The team verified via both the CIP and the mainframe that the traffic generator had established 1000 sessions before collecting performance data. In this test, the team found that the CPU of the CIP was operating at 100-percent utilization while passing 855 transactions per second. The CIP was the limiting factor, while the RSP4 was only operating at 2-percent utilization, even though it was being used to route all the test traffic from the generator to the CIP.
With these base results in hand, the team restarted the traffic generator, this time to establish 16,000 LU sessions. Again, the CIP was the limiting factor—operating at 100 percent CPU utilization—but this time the transaction rate dropped to 711 transactions per second. The team concluded that the decrease was probably due to the change in data stream between the traffic generator and the mainframe, because even though the CIP was supporting 16 times as many LU sessions, the ESCON utilization remained the same. This result could indicate that the CIP and mainframe were utilizing the channel the same as before, but with a different packet mix. This theory was also supported by the fact that the utilization of the RSP4 decreased to less than 2 percent with the higher number of sessions, again indicating that a change in the traffic flow rather than the increase in LU sessions was the major reason for the decrease.
The final test, which passed 32,000 LUs through a single router, was critical in verifying the real scalability of the CIP. The team's goal was to confirm that by using the processor of the CIP itself to handle the TN3270 server functions, the RSP4 of the Cisco 7513 would be free solely to route packets and would have the built-in capacity to support multiple TN3270 servers. Again, the limiting factor was the CPUs of the CIPs and not the RSP4. The results showed that by using the onboard CPU of the CIP, the team could easily add capabilities to the data center router and that the throughput of one CIP was not affected by the performance of the other CIP; instead, performance doubled. It is worth noting that the measurements were consistent across all the tests, indicating that the tests were operating at the limits of the TN3270 Server at all times.
With these results, the team was convinced that with its onboard CPU and memory, the CIP is by far the most flexible TN3270 solution available today. Additional users could be supported by installing another CIP into the data center router—without requiring additional routers or WAN support.
Test Goal
The goal was to test the performance of the Cisco TN3270 Server and obtain two performance measurements:
• Maximum transactions per second
• Maximum number of TN3270 LUs
Test Setup
The test was configured to verify the ability of a single CIP to perform TCP/IP downstream and SNA upstream. All tests used the DLUR feature of the TN3270 Server. Figure 21 shows the testbed layout. A TN3270 session generation tool was used to establish the desired number of TN3270 sessions, and the router under test performed the TN3270 server function. Two tests were run, one with 16,000 sessions to a single CIP, and another with 32,000 sessions across two CIPs. ITPECHO was used as the host application that echoed back the traffic presented by the TN3270 traffic generators.

Figure 21 TN3270 Testbed Layout
Test Procedure
To produce the large number of TN3270 sessions required, a TN3270 session generator was used. The generator was able to generate 16,000 LU sessions while controlling the traffic profile (that is, message size in, message size out, and transaction rate). The following configuration was used:
Note:
Because of the size of the file and its repetition, the remainder is omitted for space. If you would like a copy of the entire file, please contact a member of the UAT.
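Although the test file is omitted, the general shape of a TN3270 Server configuration with the DLUR feature can be sketched as follows. Everything below is illustrative, not the omitted configuration: the interface number, PU name, idblk-idnum, IP address, adapter and SAP numbers, and the CP/DLUS names are invented, and exact subcommand syntax varies by IOS release:

! TN3270 Server runs on the CIP's virtual channel interface (illustrative)
interface Channel1/2
 tn3270-server
  ! Direct PU: name, idblk-idnum, listener IP, internal adapter, local SAP
  pu PU1 05D00001 10.10.10.1 token-adapter 0 08
  ! DLUR: fully qualified CP name of the server and of the DLUS (VTAM)
  dlur NETA.TN3270S NETA.VTAM1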
Test Results
The first test determined the maximum transaction rate that a single CIP can support while performing the TN3270 server function. At a CIP CPU utilization of 100 percent, the CIP switched 855 transactions per second (one packet in, one packet out), as shown in Figure 22. The transaction profile used was 100 bytes in, 1000 bytes out. Note that because the CIP TN3270 Server performs almost all the processing in this scenario, the RSP4 was only 2-percent utilized.

Figure 22 CIP TN3270 Server Output for 1000 LUs
Figure 23 shows the impact of running a large number of TN3270 sessions. Cisco has documented that a single CIP can support 16,000 TN3270 sessions. This claim was tested and confirmed. Note that the maximum transactions per second dropped from 855 to 711 because of the increase in LUs (a 16.8-percent decrease, with an LU increase from 1000 to 16,000).
Figure 23 CIP TN3270 Server Test Results for 16,000 LUs
The final test examined the scalability of the CIP TN3270 Server. Two CIPs in the same router were fully utilized. Each CIP card supported 16,000 sessions and approximately 711 transactions per second, providing a total data center router performance of 32,000 TN3270 sessions at approximately 1400 transactions per second (see Figure 24).
Figure 24 CIP TN3270 Server Scalability
Conclusion
The TN3270 Server tests proved that a single CIP can support 16,000 TN3270 LUs and switch 711 transactions per second. When fewer TN3270 LU sessions are supported, as many as 855 transactions per second can be achieved on a single CIP.

Because of the architecture of the CIP, the TN3270 Server scales very well within a single data center router. As was shown in the scalability test, two fully configured CIP cards, each supporting 16,000 TN3270 LUs, can be supported by a single data center router. In fact, more than two fully configured CIPs could be supported, because the RSP still has processing capacity remaining.
Combining the results of these tests with those obtained from the IP Datagram and CSNA tests shows that the Cisco data center router is capable of supporting the following concurrently:
• One CIP running IP Datagram and transporting 18.4 MBps
• One CIP running CSNA and supporting 16,000 SNA LUs and 3200 transactions per second
• One CIP running TN3270 Server and supporting 16,000 TN3270 sessions and 711 transactions per second
A single data center router (for example, the Cisco 7507) can support these configurations because the major processing load for each of the CIP features is placed on the CIP CPU. The main router CPU, the RSP, simply switches the packets from LAN or WAN interfaces to the CIP.
Appendix
The following are the exact configuration files that were used to perform the tests—the TCP/IP for MVS profile for the IP Datagram test, the XCA major node definitions and NetMarks files for the CSNA and TN3270 Server tests, and the Cisco configuration used throughout the tests.
XCA Major Nodes
FMXCA30A VBUILD TYPE=XCA
FM30PRTA PORT ADAPNO=0,CUADDR=4100,SAPADDR=04,MEDIUM=RING,TIMER=30
*
FM30GRPA GROUP ANSWER=ON,                                              X
               AUTOGEN=(3000,F,J),                                     X
               CALL=INOUT,                                             X
               DIAL=YES,                                               X
               ISTATUS=ACTIVE