A Year in Review: HPC Success Stories
By Gilad Maayan
 

High performance computing (HPC) technologies can quickly perform parallel processing tasks. In the past, HPC resources were expensive to set up and maintain. However, once HPC moved to the cloud, it became more accessible and affordable. In this article, you will learn about three successful cloud HPC implementations.

What Is HPC?

High performance computing (HPC) technologies aggregate computing resources to perform parallel processing tasks. This allows users to distribute large, computationally intensive workloads across many machines. Organizations use HPC to run highly complex workloads, such as machine learning training and big data analyses. Because the work is split up and processed in parallel, HPC systems complete these workloads in much less time than a single conventional computer could.
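
To make the idea of distributing a workload concrete, here is a minimal sketch of the split, process, and combine pattern in Python. It is an illustration only: the function names are hypothetical, the "workload" is a toy sum of squares, and a local thread pool stands in for the separate nodes that a real HPC cluster would use.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into n_chunks contiguous pieces of near-equal size."""
    size, remainder = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `remainder` chunks each take one extra item.
        end = start + size + (1 if i < remainder else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def partial_sum_of_squares(chunk):
    """Work done independently by one worker on its own chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, n_workers=4):
    """Scatter chunks to workers, then gather and combine the partial results."""
    chunks = split_into_chunks(items, n_workers)
    # Assumption: a thread pool is used here only so the sketch runs anywhere;
    # an actual HPC job would spread the chunks across processes or nodes.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10)), n_workers=3))
```

The key property is that each chunk can be processed without knowing anything about the others, which is what lets the work scale out as more workers are added.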

HPC technologies leverage central processing units (CPUs) and graphics processing units (GPUs) to provide a high level of efficiency and performance. GPUs are specialized units that handle the highly parallel portions of a workload, supplementing the processing power of the CPUs. A typical HPC cluster contains between 16 and 64 nodes, each with two or more multi-core CPUs. Nodes share memory and storage resources across the system.

Three Clouds, Three HPC Success Stories

Traditionally, HPC solutions are hosted on-premises by enterprises with significant server space and resources. However, HPC hosted on cloud services is gaining in popularity. This is because cloud hosting can be accessed on an as-needed basis, making HPC more accessible to more organizations. Below are three examples of organizations that have taken advantage of HPC in the cloud.

CGG: Resource Hunting With Global Scalability

CGG is a geophysical services company that images the Earth’s subsurface for the oil and gas industries. CGG provides organizations with systems and sensors for subsurface monitoring onshore, offshore, on seabeds, and downhole. They also perform geophysical surveys and characterize reservoirs.

The challenge

Typically, CGG provides technologies that are hosted on-premises, but at least one client, an oil and gas company in Southeast Asia, wanted to use the cloud. This company partnered with CGG to optimize internationally used reservoir characterization technology. The two companies used Azure HPC resources to minimize hardware costs and enable greater distributed access.

Both CGG and this company were aware that cloud resources were being used by other organizations for oil exploration. However, it was unclear whether cloud resources would provide the same performance as on-premises.

Benchmarks and benefits

Typical geoscience analyses use around 15 GB of data and include millions of seismic data traces. When processed on-premises, a single analysis could take a day or two. The company wanted to increase throughput to 30 analyses per day by leveraging the larger number of computers that could be run in a clustered cloud system.

As an initial test, a single realization was run on one HPC node, and the result was returned in under 12 hours. They then ran eight simultaneous tests on eight nodes, which completed in roughly the same time. As a final test, they increased the load to 30 analyses on 30 nodes. The full batch finished in just over 12 hours.
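
The arithmetic behind these tests can be sketched in a few lines of Python. The exact timings are not given in the source, so the values below are assumptions chosen to match the descriptions: 12.0 hours stands in for "under 12 hours" and 12.5 hours for "just over 12 hours". The function names are hypothetical.

```python
def analyses_per_day(n_analyses, wall_clock_hours):
    """Throughput if batches of n_analyses run back to back around the clock."""
    return n_analyses * 24.0 / wall_clock_hours


def scaling_efficiency(base_hours, scaled_hours):
    """Ratio of single-node time to batch time when work grows with node count.

    A value near 1.0 means adding nodes (and an equal share of work)
    barely slowed the batch down.
    """
    return base_hours / scaled_hours


# Assumed timings, not measured values from the case study.
SINGLE_NODE_HOURS = 12.0   # stand-in for "under 12 hours"
THIRTY_NODE_HOURS = 12.5   # stand-in for "just over 12 hours"

if __name__ == "__main__":
    print(round(analyses_per_day(1, SINGLE_NODE_HOURS), 1))
    print(round(analyses_per_day(30, THIRTY_NODE_HOURS), 1))
    print(round(scaling_efficiency(SINGLE_NODE_HOURS, THIRTY_NODE_HOURS), 2))
```

Under these assumed timings, one 30-node batch already delivers 30 analyses in about half a day, which is why the near-constant run time across 1, 8, and 30 nodes was enough to meet the goal of 30 analyses per day.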

Based on these results, the company determined that Azure was able to provide the scalability and performance needed to meet their goals. By accessing cloud resources they were able to scale as needed without having to purchase new hardware and without sacrificing global accessibility.

Celgene: Medical Research at Life-Saving Speeds

Celgene is a biopharmaceutical research company developing drug therapies for inflammatory disorders and cancers. It is headquartered in the United States but has over 8,000 globally distributed employees. Celgene operates hundreds of clinical trials in major medical centers all over the world.

The challenge

Pharmaceutical research is incredibly expensive, with failed clinical trials leading to billions of dollars in losses. To avoid these losses, companies like Celgene must be able to accurately analyze compounds and identify those with a high probability of success. However, these analyses are highly complex and can be very slow to perform. Celgene’s goal was to automate some of these analyses and to increase the efficiency of computation. They used AWS-based HPC resources.

Benchmarks and benefits

Celgene started by migrating just two of its existing HPC workloads to AWS. As a result, they saw an improved ability for their teams to collaborate, increased scalability, and a decrease in the total cost of ownership. Based on these results, the company decided to try moving more workloads.

In total, Celgene was running over 1,000 applications on its on-premises resources. All applications needed to remain highly available, and data integrity was key. To accommodate these needs, the company chose to migrate its R&D and application data to Amazon Elastic File System (EFS). This included data for HPC workloads, such as genomics computing and computational chemistry. They then attached EFS to HPC nodes in AWS, scaling up to hundreds of nodes at a time.

As a result, Celgene was able to significantly increase the number of queries its researchers were able to perform. They were also able to ensure that analyses were performed on centrally available data and speed processes by increasing computational power and reducing latency. These improvements removed some of the research restrictions previously created by resource limitations, enabling more innovative research.

eSilicon: Partnering Machine Learning with Cloud HPC

eSilicon, which is now a part of Inphi, is a company that produces semiconductors and application-specific integrated circuits (ASICs), which are applied in a wide range of fields, including artificial intelligence, 5G infrastructure, and networking.

The challenge

eSilicon wanted to migrate their workflows for integrated circuit design and semiconductor engineering to the cloud. To do this, they needed to set up cloud resources that were globally distributed and accessible. They also wanted to increase computing power to allow for the integration of machine learning guidance for workloads. To accomplish this, eSilicon chose to move their operations to HPC resources in Google Cloud Platform (GCP).

Benchmarks and benefits

eSilicon started its migration by moving data and virtual machine (VM) infrastructures. This required the migration of complex electronic design automation (EDA) storage. It also involved moving their VMs to a mix of committed-use, preemptible, and sustained-use cores.

For networking, eSilicon needed distributed, low-latency access across nine countries. This was accomplished through partnerships with two Google-supported vendors, Citrix and Silver Peak. Citrix created the virtual desktop application framework for eSilicon engineers to work from, and Silver Peak optimized network and distribution models.

As a result, eSilicon’s total cost of ownership decreased, and their ability to collaborate increased. The company was also able to incorporate AI and machine learning code into its workflows, which it had not been able to do with on-premises resources. This integration enabled their engineers to better understand compute requirements for processes and to optimize jobs on a granular level.

Conclusion

Cloud HPC enables organizations to gain access to affordable processing resources for complex and resource-intensive operations. For a company like CGG, which provides geoscience analyses for the oil and gas industries, cloud HPC means cost savings and global availability.

Celgene, which is a biopharmaceutical research company, is leveraging the power of cloud HPC for analyses automation and reduced costs. For eSilicon, which produces semiconductors and ASICs, moving to cloud HPC meant improved collaboration and a decrease in total cost of ownership.

While each company used cloud HPC in a unique way, all three gained similar benefits: improved collaboration, better scalability, high processing capabilities, and cost savings. These success stories show that, if implemented well, cloud HPC can provide organizations with the resources needed to continually improve their services, increase productivity, and reduce costs.

——————–

Author Bio: Gilad David Maayan

Gilad David Maayan is a technology writer who has worked with over 150 technology companies including SAP, Imperva, Samsung NEXT, NetApp and Ixia, producing technical and thought leadership content that elucidates technical solutions for developers and IT leadership. Today he heads Agile SEO, the leading marketing agency in the technology industry.

LinkedIn: https://www.linkedin.com/in/giladdavidmaayan/