Oser Communications Group

Super Computer Show Daily Nov 20, 2014

Issue link: http://osercommunicationsgroup.uberflip.com/i/412890

Oser Communications Group, New Orleans. Thursday, November 20, 2014. AN INDEPENDENT PUBLICATION NOT AFFILIATED WITH SC.

SILICON MECHANICS HIGH PERFORMANCE COMPUTER CLUSTER ALLOWS PROJECTS AT BIOMEDICAL RESEARCH FACILITY TO THRIVE

By Art Mann, Education/Research/Gov't Vertical Group Manager, Silicon Mechanics

As Hurricane Sandy hit the East Coast in 2012, most people in the area were concerned with things like emergency food and water supplies and impending power outages. The laboratory director of a biomedical research facility in New York City probably worried about those things as well, but also had another thing in mind: the facility's high-performance compute cluster. For a facility devoted to catalyzing transformative changes in biomedicine through breakthrough computational methodological research, losing the cluster itself would bring research to a halt, and losing the data stored on the cluster's hard drives would be even more catastrophic. The flood waters advanced into the facility, and while the cluster itself was ultimately lost to storm damage, the invaluable data on the cluster's hard drives wasn't: the lab director unracked the drives and rescued them before the flood could destroy them.

Following the storm, the research center faced the challenge of getting its research back on track as quickly as possible. The center's work includes three very different High Performance Computing (HPC) use cases: researchers assembling a new genome, requiring compute-intensive resources with huge amounts of memory and many large-memory nodes; researchers studying disease transmission and developing statistical models, requiring up to thousands of CPU cores; and researchers collecting data coming out of DNA sequencers, requiring cores connected to the data with high I/O throughput. As a result, not only was a fast turnaround on a new HPC cluster crucial, but so was a custom design maximizing usability for each of these very different

Continued on Page 9
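The three use cases above pull a cluster design in different directions: genome assembly favors large-memory nodes, the statistical modeling favors sheer core count, and sequencer ingest favors I/O throughput. The article does not describe the center's actual scheduler or configuration, but as a rough sketch of how such heterogeneous profiles might be captured for a batch scheduler such as Slurm, the Python snippet below maps each workload to illustrative resource requests; all partition names, node counts and memory figures are assumptions.

```python
# Hypothetical sketch: expressing the three workload profiles described above
# as per-job resource requests. Partition names and numbers are illustrative
# assumptions, not the research center's actual configuration.

WORKLOAD_PROFILES = {
    # Genome assembly: few tasks, very large memory on a single node
    "genome_assembly": {
        "partition": "bigmem",        # assumed large-memory partition
        "nodes": 1,
        "cpus_per_task": 32,
        "mem": "1024G",
    },
    # Disease-transmission statistical models: embarrassingly parallel, many cores
    "epi_modeling": {
        "partition": "compute",
        "nodes": 64,
        "cpus_per_task": 1,
        "ntasks_per_node": 32,
    },
    # DNA sequencer ingest: modest core count, high I/O throughput to fast storage
    "sequencer_ingest": {
        "partition": "io",
        "nodes": 4,
        "cpus_per_task": 8,
        "constraint": "fast_scratch",  # assumed feature tag for I/O-optimized nodes
    },
}


def to_sbatch_flags(profile: dict) -> list[str]:
    """Turn a profile dict into Slurm-style sbatch flags (illustrative only)."""
    flag_names = {
        "partition": "--partition",
        "nodes": "--nodes",
        "cpus_per_task": "--cpus-per-task",
        "ntasks_per_node": "--ntasks-per-node",
        "mem": "--mem",
        "constraint": "--constraint",
    }
    return [f"{flag_names[key]}={value}" for key, value in profile.items()]


if __name__ == "__main__":
    for name, profile in WORKLOAD_PROFILES.items():
        print(name, " ".join(to_sbatch_flags(profile)))
```

A design that serves all three profiles well typically means distinct node types or partitions rather than one uniform configuration, which is what makes the custom build described above more than a commodity purchase.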
SERVER BAREBONE 3U8G-C602 HAS PASSED NVIDIA TESLA QUALIFICATION

Behold ASRock Rack's 3U8G-C602, a 3U rackmount server barebone with a 1200W Platinum redundant PSU (3+1)! This GPU-optimized machine has recently passed NVIDIA Tesla qualification, proving itself to be one of the most powerful systems for computationally intensive tasks. The 3U8G-C602 is built around Intel's C602 chipset and equipped with two LGA 2011 CPU sockets, 16 DDR3 DIMM slots, six SATA3 6.0 Gb/s connectors and dual Intel Gigabit LAN ports, plus an additional mezzanine card slot for adding 10G Ethernet. Moreover, the platform supports up to eight GPGPU cards via the four edge connectors on the side of the board and a separate system switch board, so customers get a clean system with all components sorted in an orderly fashion. And for anyone considering a Haswell-EP CPU upgrade, ASRock Rack has you covered with the Intel C612-based successor, the 3U8G-C612.

3U8G-C602 brief specifications: 3U rackmount with 1200W Platinum redundant PSU (3+1); dual Socket LGA 2011; 16 DDR3 1600/1333/1066 DIMM slots; support for eight GPGPU/MIC cards; dual Intel Gigabit LAN plus an additional mezzanine card for 10G Ethernet; and six SATA3 6.0 Gb/s ports (hot-swap bays). For more detailed information, please visit www.asrockrack.com/general/productdetail.asp?Model=3U8G-C602 and www.asrockrack.com/general/productdetail.asp?Model=3U8G-C612.

TEST BEDS AS A VEHICLE FOR DISCOVERY

An interview with Rod Wilson, Senior Director, External Research, Ciena.

SCSD: Why is it important for vendors to collaborate with the R&E community?

RW: More than ever, collaboration is crucial to ensuring survival in a very volatile telecom market. Collaboration drives the exchange of valuable resources like material goods, ideas and people, which in turn spurs innovation. This is what all research and development teams should strive for in any industry. This truth has led Ciena to pursue meaningful, mutual relationships with some of the world's premier researchers and educators.

SCSD: What does Ciena gain from collaborating with the R&E community?

RW: The benefits of collaborating with the R&E community and its people are clear: these partnerships give employees a bridge to high-quality researchers who can provide valuable new insights through highly specialized research. The R&E community is a great resource, as its members are often the first to test and qualify new technologies that are not yet commercially viable. The community is unique in that it can validate new ideas through proofs of concept and the development of demonstration networks and test beds that can lead to product trials. Collaboration with high-quality researchers and educators gives vendors opportunities that might otherwise be considered an inefficient use of resources, as these

Continued on Page 9

OSS: EXPANDING THE LIMITS OF HPC

One Stop Systems (OSS) designs and manufactures compute acceleration systems that hold GPUs or coprocessors, and flash storage arrays that hold flash storage cards, for a variety of HPC applications. OSS was first to market with PCIe over cable, the essential high-speed connection between host servers and expansion accelerators, and it is a leading producer of PCIe x16 3.0 cable adapters and expansion units.

At SC '14, OSS is showing its Compute Accelerators and Flash Storage Arrays. Its 3U High-Density Computation Accelerator (HDCA) holds up to 16 NVIDIA Tesla GPUs or Intel Phi coprocessors, pre-installed and tested. Its Flash Storage Array (FSA) is also a 3U rackmount unit and can hold up to 32 flash memory cards. When configured with the latest SanDisk flash cards, the FSA provides 200TB of high-speed storage with performance up to 17 million IOPS.

OSS partners with HPC industry leaders to bring these new products to market, which has the added advantage of full testing with the desired server before shipping. The company's partners include Intel, Supermicro, IBM, Exxact, Penguin and SGI. It is also working closely with NVIDIA, Intel and SanDisk to ensure full product compatibility with their add-in cards.

The primary alternative to the OSS external system approach is to add the GPUs or flash add-in cards directly into a server. For one or two card implementations this approach can be cost-effective. However, as additional add-in cards are needed, this internal approach often breaks down: servers tend to be constrained by the number of slots, power and cooling capacity. Pushing servers to the limits can result in less

Continued on Page 9
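Whether an external expansion approach like the HDCA works out in practice usually comes down to whether the host server actually enumerates every attached accelerator over the PCIe cable. As a minimal sketch only, assuming an NVIDIA driver stack and the pynvml Python bindings (installable via pip as nvidia-ml-py or pynvml), the following check lists the GPUs a host currently sees; the expected device count is an illustrative assumption, not an OSS-provided tool.

```python
# Illustrative sanity check, not an OSS utility: after cabling a PCIe expansion
# unit (or populating a GPU-dense host) confirm the host enumerates the expected
# number of accelerators. Requires the NVIDIA driver and the pynvml bindings.

import pynvml

EXPECTED_GPUS = 16  # e.g. a fully populated 3U HDCA; adjust for your setup


def list_visible_gpus() -> list[str]:
    """Return the names of all GPUs the NVIDIA driver currently exposes."""
    pynvml.nvmlInit()
    try:
        count = pynvml.nvmlDeviceGetCount()
        names = []
        for index in range(count):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            name = pynvml.nvmlDeviceGetName(handle)
            # Older pynvml versions return bytes rather than str
            names.append(name.decode() if isinstance(name, bytes) else name)
        return names
    finally:
        pynvml.nvmlShutdown()


if __name__ == "__main__":
    gpus = list_visible_gpus()
    print(f"Host sees {len(gpus)} GPU(s):")
    for name in gpus:
        print(" -", name)
    if len(gpus) < EXPECTED_GPUS:
        print(f"Warning: expected {EXPECTED_GPUS}; check cabling, power and BIOS settings.")
```

The same check applies to any multi-GPU host on this page, such as the eight-GPU 3U8G-C602 barebone, by adjusting the expected count.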
