Partnership between HPC Park and RCloud by 3data

05 / 07 / 2023

In collaboration with HPC Park, RCloud by 3data has integrated a GPU-accelerated service for AI tasks into its cloud platform

The RCloud by 3data platform, in collaboration with HPC Park, has launched a new GPU service to speed up complex computations in the cloud. With the new service, the cloud aggregator's customers will be able to increase the processing speed of large data arrays by an order of magnitude.

The HPC Park Cloud Service is a container-based platform that can handle complex tasks in high-performance computing (HPC), machine learning (ML/DL) and artificial intelligence (AI). The containers are backed by Nvidia Tesla A100 server accelerators. The platform's hardware is located in a Moscow data center with redundant communication channels and backup power.

“The RCloud by 3data cloud platform offers a unique stack of solutions for automating business IT infrastructure, which we are constantly expanding with new services. We see that more and more companies are embracing machine learning and AI, so we believe that specialists now need to consider whether it makes sense to introduce GPU computing into their business processes,” said Valentin Sokolov, IT director of the RCloud by 3data cloud platform.

HPC Park Cloud Service provides containers with a ready-made software environment and popular ML frameworks, PyTorch and TensorFlow, so that Data Science and Big Data specialists can get started quickly in an environment they already know.
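
As an illustration of a first step inside such a container, the short Python sketch below checks whether a GPU is visible to PyTorch. It is a minimal sketch, not part of the HPC Park service: the function name `describe_gpu` is hypothetical, and the code assumes only the standard PyTorch calls `torch.cuda.is_available()` and `torch.cuda.get_device_name()`.

```python
def describe_gpu() -> str:
    """Return a one-line summary of the GPU visible to PyTorch, if any.

    Hypothetical helper for illustration; it degrades gracefully when
    PyTorch is missing or no CUDA device is exposed to the container.
    """
    try:
        import torch  # preinstalled in the ready-made ML containers, per the article
    except ImportError:
        return "PyTorch is not installed in this environment"

    if not torch.cuda.is_available():
        return "PyTorch is installed, but no CUDA device is visible"

    # Device 0 is the first accelerator exposed to the container.
    return f"CUDA device 0: {torch.cuda.get_device_name(0)}"


print(describe_gpu())
```

Run from a Jupyter notebook cell or an SSH session, the function reports the accelerator name when one is attached and a plain message otherwise.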


The platform gained Nvidia CUDA support in the latest release. This option provides a minimal environment with no frameworks preinstalled, which customers can extend with any software they prefer. Customers can also attach network storage and move it between containers, and can combine containers into a single network for horizontal scaling. Containers are accessed through JupyterLab or SSH.

Container state is preserved by mounting Ceph volumes, which eliminates the need to upload and download saved containers. The file system is mounted directly from network storage on any node of the cluster.

A distinctive feature of HPC Park Cloud Service is working MIG technology, a form of virtualization of the physical GPU card. Server GPUs such as the A100 and H100 support Multi-Instance GPU (MIG), which partitions one physical GPU into up to seven independent instances. Each instance is fully isolated and has its own high-speed memory, cache and computing cores. As part of the new GPU service, customers can order containers with 1/7, 2/7, 3/7 and so on, up to 7/7, of a whole physical Tesla A100 card. Fractions of a card are used to reduce costs or to run less resource-intensive tasks, such as those for which gaming-class accelerators are commonly used.
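
The arithmetic behind these fractional containers can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `gpu_share` and `smallest_allocation` are hypothetical, and the code makes the simplifying assumption that a k/7 container receives exactly k/7 of the card's compute, which may not match how memory is divided on a real card.

```python
from fractions import Fraction

TOTAL_SLICES = 7  # per the article: up to seven independent units in one GPU


def gpu_share(slices: int) -> Fraction:
    """Share of the physical card covered by a container with `slices` units."""
    if not 1 <= slices <= TOTAL_SLICES:
        raise ValueError(f"slices must be between 1 and {TOTAL_SLICES}")
    return Fraction(slices, TOTAL_SLICES)


def smallest_allocation(required_percent: int) -> int:
    """Smallest number of slices covering `required_percent` of a full card.

    Assumes (for illustration) that compute scales linearly with slices.
    Uses integer ceiling division to avoid floating-point rounding.
    """
    if not 1 <= required_percent <= 100:
        raise ValueError("required_percent must be between 1 and 100")
    return (required_percent * TOTAL_SLICES + 99) // 100


# List every fraction named in the article with its approximate percentage.
for k in range(1, TOTAL_SLICES + 1):
    print(f"{gpu_share(k)} of the card is about {float(gpu_share(k)):.0%}")
```

Under these assumptions, a task that needs roughly 40% of a full card would fit in a 3/7 container, since `smallest_allocation(40)` evaluates to 3.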

“This service is already available on the RCloud by 3data cloud platform, where existing customers can manage it in a single control panel together with other services. Resources (GPU, “Storage”, “Networks”) can be scaled from the Management Console at any time, and combined with additional services such as virtual machines and S3 Storage within the RCloud by 3data platform. Any customer can be given exclusive free trial access and a team of specialists dedicated to their task,” said Andrey Selikhov, Sales Director of HPC Park.
