Oser Communications Group

Super Computer Show Daily Nov 20 2013



PARALLEL PROCESSING USING SCALABLE LEARNING MEMORY HARDWARE
By Bill Nagel, Vice President, Cognimem Technologies Inc.

Alternative processing architectures exist for handling embarrassingly parallel algorithms. Of note is machine learning that makes use of kNN (k-Nearest Neighbor) and RBF (Radial Basis Function) nonlinear classifiers. These classifiers, commonly used in the industry, have proven useful and are the best approach for a wide range of pattern matching (e.g. SIFT/SURF/HOG vector analysis, data mining, general-purpose pattern recognition, cyber security, video analytics and more). These algorithms can be coded and executed on traditional hardware, as is commonly done, but they can also be reduced directly to silicon as memory-based processing elements that dramatically improve performance per watt and scalability by orders of magnitude. These natively implemented hardware classifiers avoid the system hardware and software complexities of parallel serial processing, while simultaneously eliminating the von Neumann processing/memory bottleneck.

This machine learning is available in a commercial component, and a 40K (40 devices of 1,024 parallel processing elements each) processing and storage system is available for proof of scalability and application-level benchmarking. More than 1 teraop of equivalent integer performance (compare, multiplex, subtract, accumulate, search and sort) is achieved at under 10 watts.

Each processing element learns by storing up to 256 bytes of digital data and compares these stored models against a broadcast vector to be matched, either fuzzily or exactly. The learning process can take place in situ in real time, be updated at a later date to include new knowledge, or be preloaded from previous offline training. This architecture is set up as a three-layer network.

A modular board incorporating four of these components connected through a dynamically reconfigurable Lattice XP2 FPGA was built. This architecture provides the customer with the flexibility to create user-defined interconnect topologies as required by application constraints. Local, fast, nonvolatile MRAM (magnetoresistive random access memory) is also provided for fast local loads of pre-trained datasets during system power-up. Conversely, the same local storage can be used to hold large datasets that are compared, at real-time speeds, against a single pattern loaded into the system at runtime. A vertical connection through a "spine" of up to 10 modules is also incorporated.

This flexible architecture can be configured to have all processing elements working together on a common problem (like finding one iris or fingerprint among 40K with very low latency) or on subsets of the same problem (like searching for an anomaly in an image). The processing element that finds the closest match alerts the user in a fixed time regardless of the number of data vectors being searched, pointing toward unprecedented low-latency performance for large-database applications.

Future PCIe-compatible boards are planned. Host communications for handling the throughput requirements can be provided by various standard computer peripheral technologies via the component's 16-bit parallel bus. Each module comes with a USB device connector and four LVDS cardinal connectors, providing countless connectivity options.
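To make the broadcast-and-compare behavior concrete, the following is a minimal software sketch of a kNN-style match over stored byte patterns. It is not the Cognimem SDK or hardware interface; the names ProcessingElement, learn and classify are purely illustrative, and the distances are computed serially here, whereas the silicon evaluates every element in parallel so the response time stays fixed no matter how many patterns are stored.

```python
# Illustrative software model of the learn / broadcast-and-compare flow
# described above. NOT the Cognimem SDK: names and structure are hypothetical,
# and the comparison loop is serial, unlike the parallel hardware.

from dataclasses import dataclass

MAX_PATTERN_BYTES = 256  # each processing element stores up to 256 bytes


@dataclass
class ProcessingElement:
    pattern: bytes   # the stored model ("learned" data)
    category: int    # label assigned when the pattern was committed

    def distance(self, vector: bytes) -> int:
        # L1 (Manhattan) distance between the stored pattern and the broadcast
        # vector; zero is an exact match, a small value is a fuzzy match.
        return sum(abs(a - b) for a, b in zip(self.pattern, vector))


def learn(network: list, pattern: bytes, category: int) -> None:
    """Commit a new pattern to the next free element (in-situ learning)."""
    assert len(pattern) <= MAX_PATTERN_BYTES
    network.append(ProcessingElement(pattern, category))


def classify(network: list, vector: bytes, k: int = 1):
    """Broadcast `vector` to all elements and return the k nearest matches."""
    ranked = sorted(network, key=lambda pe: pe.distance(vector))
    return [(pe.category, pe.distance(vector)) for pe in ranked[:k]]


if __name__ == "__main__":
    net = []
    learn(net, bytes([10, 20, 30, 40]), category=1)      # e.g. one stored template
    learn(net, bytes([200, 180, 160, 140]), category=2)  # another stored model
    print(classify(net, bytes([12, 19, 31, 42])))        # -> [(1, 6)]
```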
A full-featured SDK is available, with full source code examples for Android, C++, .NET, Java, Matlab and Python on both Windows and Linux platforms, to help accelerate the adoption of this technology.

During SC13, visit Cognimem at booth 3609. For more information, visit www.cognimem.com, call 916-358-9483 or email info@cognimem.com.

UNLOCKING BIG VALUE FROM BIG DATA
By Bill Dunmire, Sr. Director Product Marketing

Today's enterprise is inundated with data. From Internet traffic, sensors and credit card activity to social media, video monitors and more, data is arriving in greater volume, with higher velocity and in greater variety than ever experienced. What has forward-looking businesses and government agencies focused on "Big Data" is the insight that can be derived through analytics. Marketing can better understand buying behavior to secure more customers. Manufacturing can better understand product performance to improve quality. Finance can better identify fraud to prevent loss. Physicians can utilize genetic profiles to treat patients more effectively. From competitive advantage and top-line growth to saving lives, the potential gains surrounding Big Data are far reaching, but they also present new challenges: how to derive value at greater speed, scale and efficiency.

Designing an enterprise-class data analytics solution begins with enterprise-class infrastructure. The type of infrastructure that best meets the needs of the business generally depends on two principal metrics: data relationships and time to value.

Relationships Matter
Let's begin with relationships. Are they sufficiently understood that a data set can be broken into chunks and analyzed in parallel across a computer cluster? Take, for example, ranking the top-selling products over a given time period. Will parsing the data into an organized structure add value, such as facilitating further, ongoing analysis? Or should the data be analyzed all at once rather than in chunks, in order to discover hidden relationships?

So Does Speed
Time to value equates to speed of analysis. Do business needs allow hours to days for results? Would reducing that to minutes add value? Or are real-time results (microseconds) the desired objective?

With these two metrics defined, the appropriate High Performance Computing (HPC) infrastructure for Big Data analytics can be designed. For analysis in chunks with hours to results, a Hadoop cluster is ideal. To provide organizational structure and accelerate data analysis, deploy a cluster utilizing a NoSQL database. And for knowledge discovery and immediate time to value, use a robust shared-memory system.

Building on expertise developed during two decades of delivering several of the world's fastest supercomputers, and many years of experience helping customers efficiently manage high-volume data environments, SGI enables businesses and government agencies to perform Big Data analytics with faster and greater insight, to provide the extreme capacity and scale needed for Big Data storage, and to lower costs.

Learn more about SGI Big Data solutions at www.sgi.com/bigdata. Visit SGI at booth 2709 during SC13. For more information, visit www.sgi.com, call 800-800-7441 or email laura_clark@sgi.com.
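As a rough illustration of the infrastructure mapping described in the SGI piece above (chunked analysis with hours to results on Hadoop, organized structure on a NoSQL cluster, knowledge discovery with immediate time to value on a shared-memory system), here is a minimal decision helper. The function name, arguments and categories are hypothetical simplifications of the article's guidance, not an SGI product or sizing tool.

```python
# Simplified decision helper for the two metrics discussed above: data
# relationships (can the set be chunked? does it need organizing structure?)
# and time to value. A hypothetical sketch, not an SGI tool.

def recommend_infrastructure(chunkable: bool,
                             needs_structure: bool,
                             time_to_value: str) -> str:
    """Suggest a platform; time_to_value is 'hours', 'minutes' or 'realtime'."""
    if not chunkable or time_to_value == "realtime":
        # Hidden relationships or microsecond answers: analyze all at once.
        return "robust shared-memory system"
    if needs_structure or time_to_value == "minutes":
        # Organized structure and accelerated, ongoing analysis.
        return "cluster with a NoSQL database"
    # Well-understood relationships and hours to results: the classic Hadoop case.
    return "Hadoop cluster"


print(recommend_infrastructure(True, False, "hours"))      # Hadoop cluster
print(recommend_infrastructure(False, False, "realtime"))  # robust shared-memory system
```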
SILICON MECHANICS: CHANGING THE WORLD WITH HIGH-PERFORMANCE COMPUTING CLUSTERS
By Sue Lewis, Chief Marketing Officer, Silicon Mechanics

I don't think there are too many people out there who would disagree with the statement that there is an essential need for state-of-the-art computing in next-generation research. Silicon Mechanics has a longstanding commitment to supporting educational and research institutions in their quest to gain computing resources to help them work on their great research ideas.

One way we put this into practice is by awarding a high-performance computing cluster in Silicon Mechanics' 3rd Annual Research Cluster Grant Competition, officially launching at SC13. The competition is open to all U.S. and Canadian qualified post-secondary institutions, university-affiliated research institutions, non-profit research institutions and researchers at federal labs with university affiliations.

This year's generous sponsors are Intel, NVIDIA, HGST, Mellanox Technologies, Supermicro, Seagate, Kingston Technology, Bright Computing and LSI Logic. The HPC cluster, worth more than $118,000, contains eight compute nodes equipped with Intel Xeon CPUs, Intel Xeon Phi coprocessors, NVIDIA Tesla GPUs, HGST s800 Series Enterprise SSDs, Mellanox ConnectX-3 InfiniBand HBAs and a SwitchX-2 FDR switch, Seagate Savvio 10K.7 SAS hard drives, and an LSI MegaRAID SAS controller.

The Research Cluster Grant was the brainchild of Art Mann, the Education/Research/Government Vertical Group Manager at Silicon Mechanics. Art knows the typical challenges encountered by research grant applicants in competing for constrained resources. He designed the Research Cluster Grant application to resemble a more traditional grant application process, but much easier. From Art's point of view, this Silicon Mechanics program is meant to support researchers who may have ideas that just don't seem to fit the existing research mold for a given field, which frequently means they have limited or no access to the high-performance computing resources that could further their research.

Many of the grant applications included collaborations, either cross-departmental, cross-functional or across multiple institutions. For example, the winner of the 2011 research cluster grant competition, Saint Louis University, is using the cluster on seven separate research projects across ten different academic departments. Last year's grant was won by Tufts University and included collaboration among the biology, computer science, biomedical engineering and mathematics departments, as well as the Universidad de Sevilla, in Seville, Spain. Researchers at Tufts will be using their cluster as a key component of an exciting, multidisciplinary effort to transform the way biological pattern formation is investigated. Their long-term mission is to integrate computer science, molecular biology and biophysics to understand the processing of patterning information in living systems.

Information, applications and competition instructions for the 3rd Annual Research Cluster Grant Competition will be available in November 2013 at www.researchclustergrant.com. During SC13, visit Silicon Mechanics at booth 3126. For more information, visit www.siliconmechanics.com, call 425-424-0000, or email info@siliconmechanics.com.
