Pennsylvania State University's Academic Services and Emerging Technologies (ASET) unit of Information Technology Services (ITS) designs, develops and operates the information technology infrastructure that supports students, faculty and staff in their teaching, learning and research endeavors. Within ASET/ITS, the High Performance Computing (HPC) group helps researchers compute and manage data by developing and maintaining several state-of-the-art computational clusters in a central laboratory on campus.
The HPC group's latest cluster, LION-XO, provides students and faculty with cost-effective, general-purpose parallel computing cycles. To interconnect the 80-node cluster, the HPC group needed the predictable, non-blocking performance and reliability of the Force10 E600 switch/router.
According to Vijay Agarwala, director of the High Performance Computing and Visualization group, Penn State previously depended on mainframe-class machines to handle the demands of its research applications before turning to the cluster model to increase performance and scalability cost-effectively. The group's four clusters combine the computational power of a number of advanced computers to enable research using programming languages, numerical libraries, statistical packages and computational codes for several academic disciplines. The group also helps researchers with code optimization and parallelization. The clusters connect with other high-performance systems around the world for sharing data and solving even larger problems.
The newest cluster, LION-XO, currently has 80 computing nodes and is expected to grow to 160. The system features 80 Opteron processor-based Sun Microsystems Sun Fire servers, each with eight gigabytes (GB) of random access memory (RAM) and 73 GB of Ultra SCSI disk storage. To better meet the application needs of its users, the HPC group deployed two interconnect switches: an Infinicon Systems InfiniBand switch and a Force10 E600 Gigabit Ethernet switch.
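As a rough illustration of what the planned growth means in aggregate, the per-node figures above can be multiplied out for 80 and 160 nodes. This is a minimal sketch, not a description of the group's tooling: the names NodeSpec and cluster_totals are hypothetical, and the memory figure is read as 8 gigabytes per node, which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-node resources. ram_gb assumes 8 gigabytes per node (an assumption)."""
    ram_gb: int
    disk_gb: int


def cluster_totals(nodes: int, spec: NodeSpec) -> dict:
    """Return aggregate RAM and disk for a cluster of identical nodes."""
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    return {
        "nodes": nodes,
        "ram_gb": nodes * spec.ram_gb,
        "disk_gb": nodes * spec.disk_gb,
    }


# 73 GB of disk per node is taken from the article; 8 GB of RAM is assumed.
LION_XO_NODE = NodeSpec(ram_gb=8, disk_gb=73)

if __name__ == "__main__":
    for node_count in (80, 160):  # current size and expected size
        totals = cluster_totals(node_count, LION_XO_NODE)
        print(
            f"{totals['nodes']} nodes: "
            f"{totals['ram_gb']} GB RAM, {totals['disk_gb']} GB disk"
        )
```

Under these assumptions, doubling the node count doubles both totals, and every added node also needs its own port on each interconnect.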

Traditionally, the HPC group has deployed costly proprietary switching fabrics to circumvent the latency issues associated with Ethernet technology. With cost increasingly a concern, however, Gigabit Ethernet is an attractive alternative for applications that are not latency-sensitive.
"The cost/performance advantages of Gigabit Ethernet over proprietary solutions are significant, making it a cost-effective alternative for many applications that demand high throughput but can sustain small amounts of latency," said Agarwala. "In many cases, the superior performance of Gigabit Ethernet can be used to better utilize our resources."
Typically, computational clusters are fully utilized only 20 to 30 percent of the time, according to Agarwala. The group's clusters, however, are far busier, with utilization exceeding 90 percent, which makes it essential that the interconnecting switches be extremely reliable.
With such high usage and more than 200 students and faculty relying on the cluster as a resource, any unplanned downtime can be catastrophic and extremely costly in terms of lost or delayed research. At a time when research dollars are being stretched, reliable performance that enables high utilization results in savings that the HPC group can apply to other development efforts.
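To put the utilization figures in perspective, they can be converted into node-hours of work. The sketch below is illustrative only: the function names are hypothetical, the 25 percent figure is simply the midpoint of the 20 to 30 percent range quoted above, and the four-hour outage is an invented example rather than a reported event.

```python
def busy_node_hours(nodes: int, utilization: float, hours: float = 24.0) -> float:
    """Node-hours of useful work delivered over a period at a given utilization."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    if nodes < 0 or hours < 0:
        raise ValueError("nodes and hours must be non-negative")
    return nodes * utilization * hours


def lost_node_hours(nodes: int, utilization: float, outage_hours: float) -> float:
    """Work forgone if the whole cluster is unreachable for outage_hours."""
    return busy_node_hours(nodes, utilization, outage_hours)


if __name__ == "__main__":
    nodes = 80  # current LION-XO size, from the article

    typical = busy_node_hours(nodes, 0.25)  # midpoint of 20-30 percent (assumed)
    busy = busy_node_hours(nodes, 0.90)     # "exceeding 90 percent"
    print(f"Typical cluster:   {typical:.0f} node-hours per day")
    print(f"At 90% utilization: {busy:.0f} node-hours per day")

    # Hypothetical four-hour interconnect outage at 90 percent utilization.
    print(f"Lost in a 4-hour outage: {lost_node_hours(nodes, 0.90, 4):.0f} node-hours")
```

The busier the cluster, the more work each hour of interconnect downtime takes away, which is why reliability weighs so heavily at this level of utilization.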

The High Performance Computing and Visualization group at Penn State is intent on providing a world-class network that enables students and faculty to conduct research using cutting-edge technology. To meet that goal, the group required its cluster infrastructure to deliver not only non-blocking throughput, scalability and port density but also a high level of reliability to support nearly 100 percent utilization at times. The Force10 E600 delivered for the group on every count.
"For the LION-XO cluster, our requirements for Gigabit Ethernet forming the core of a scalable and highly available interconnect technology infrastructure made a lot of solutions unsuitable," said Agarwala. "The Force10 E600, however, met every one of our stringent demands, proving that we could indeed capture the cost advantages of Gigabit Ethernet without compromising the performance or availability of our cluster."
For the High Performance Computing and Visualization group, the reliability of the Force10 E600 was the deciding factor. Because the E600 is the interconnecting switch/router in the LION-XO cluster, a failure of the E600 would take down the entire cluster, preventing more than 200 students and faculty from using its processing power to conduct their research.
"It's all about achieving maximum network uptime, and the Force10 E-Series has several features that make it highly reliable, basically ensuring that it is never a point of failure in our cluster," said Agarwala. "Its superior scalability and resiliency support our high utilization times, and its non-blocking throughput always ensures maximum performance."
The Force10 E600 features a fully distributed architecture that separates switching, routing and management functions. With protected memory and processing power for each function, the E600 ensures predictable performance even in the face of denial of service attacks. Built-in redundancy of all key components, including switch fabrics, power supplies and route processor modules, together with Force10’s hitless failover technology, ensures that the Force10 E600 continues to forward traffic with zero packet loss in the event of a failure.
While the LION-XO cluster initially features 80 nodes, the HPC group expects to double its size in the near future. To avoid a costly forklift upgrade as the cluster expands, Agarwala needed the high port density of the Force10 E600 to build a Gigabit Ethernet infrastructure that could scale to accommodate the increase in processing power.
"In the past, our largest cluster has reached 176 nodes, which the Force10 E-Series can scale far beyond," said Agarwala. "The density advantages the Force10 E-Series delivers opens our minds to the possibilities as we continue to build new high-performance clusters with greater processing power."
To pursue cutting-edge research, the High Performance Computing group at Penn State required a high-performance network. The Force10 E600 delivered the performance, reliability and scalability the group demanded, enabling it to deploy Gigabit Ethernet in its newest, state-of-the-art cluster.