How AI Simplifies Corporate Work. Experience of HPC Park and "Neiroseti"
16 / 12 / 2024
In the latest issue of CIS magazine, we shared the experience of the company "Neiroseti" with the HPC Park Cloud Service platform.
The successful collaboration between HPC Park and "Neiroseti" involves the Index 5 service, one of whose features is the automatic generation of concise, structured reports from audio recordings or transcripts of meetings. Index 5 is an advanced neural network model for video conferences that analyzes participants' attentiveness, engagement, emotionality, fatigue, and distractibility. Such tools integrate into the corporate digital ecosystem, significantly simplifying document management and saving employees' time.
For text vectorization and the deployment of speech models, "Neiroseti" tested 1/7 and 3/7 MIG (Multi-Instance GPU) instances of the NVIDIA A100 on the HPC Park platform. The tests confirmed that MIG enables efficient resource utilization and high performance, which is particularly important for multitasking machine learning workloads and makes the technology ideal for tasks that do not require the compute of a full GPU. For example, a single 3/7 instance of the NVIDIA A100 handled several complex operations simultaneously: it processed audio data, computed embeddings, and ran ranking operations.
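To make the idea of working on a fractional accelerator more concrete, here is a minimal sketch of one common way to pin a process to a single MIG instance: setting the CUDA_VISIBLE_DEVICES environment variable to the MIG device identifier. The article does not say how "Neiroseti" or HPC Park configure their containers, so the helper name env_for_mig_instance and the placeholder UUID are assumptions made for illustration only.

```python
import os


def env_for_mig_instance(mig_uuid, base_env=None):
    """Return a copy of the environment that exposes only one MIG device.

    Hypothetical helper for illustration; not part of any NVIDIA tooling.
    """
    if not mig_uuid.startswith("MIG-"):
        raise ValueError("expected a MIG device identifier starting with 'MIG-'")
    env = dict(os.environ if base_env is None else base_env)
    # CUDA only enumerates the devices listed in this variable, so the
    # process sees the MIG slice as its single GPU.
    env["CUDA_VISIBLE_DEVICES"] = mig_uuid
    return env


# Placeholder identifier (an assumption); real ones are listed by `nvidia-smi -L`.
PLACEHOLDER_UUID = "MIG-00000000-0000-0000-0000-000000000000"
worker_env = env_for_mig_instance(PLACEHOLDER_UUID, base_env={"PATH": "/usr/bin"})
print(worker_env["CUDA_VISIBLE_DEVICES"])
```

The returned dictionary can be passed as the env argument of subprocess.Popen, so that separate workers (for example audio processing, embedding, and ranking) each start with a view restricted to the intended slice.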
Embeddings are computed with a GPT-2-based model. Vectorization transforms text into compact numerical representations (vectors), which greatly simplifies further processing and analysis and lets models work effectively with text by capturing semantic relationships and context.
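As an illustration of what vectorization produces, the sketch below averages per-token vectors into a single text embedding (mean pooling) and compares embeddings with cosine similarity. The pooling strategy is an assumption, since the article does not describe how the GPT-2-based embeddings are built, and the three-dimensional token vectors are made-up stand-ins for the model's hidden states.

```python
import math


def mean_pool(token_vectors):
    """Average per-token vectors into one fixed-size text embedding."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    dim = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(vec[i] for vec in token_vectors) / count for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy token vectors (assumed values, not real GPT-2 outputs).
meeting_notes = mean_pool([[1.0, 0.0, 1.0], [1.0, 2.0, 1.0]])
similar_text = mean_pool([[2.0, 2.0, 2.0]])
unrelated_text = mean_pool([[1.0, -1.0, 0.0]])

print(cosine_similarity(meeting_notes, similar_text))
print(cosine_similarity(meeting_notes, unrelated_text))
```

Texts whose embeddings point in a similar direction score close to 1, while unrelated texts score close to 0. This is what allows later stages to compare and search text numerically.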
Ranking operations use the ReRank model, a lightweight architecture with approximately one billion parameters designed to optimize search results. Its main task is to rebalance queries and evaluate the relevance of search results, which improves alignment with user queries and enhances overall system efficiency in real-world conditions.
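The re-ranking step can be sketched as follows: candidates from a first-stage search are scored against the query and reordered by relevance. The scoring function here is a simple token-overlap placeholder, not the ReRank model, whose interface the article does not describe. The names rerank and token_overlap_score are hypothetical.

```python
def token_overlap_score(query, document):
    """Placeholder relevance score: share of query tokens present in the document.

    Stands in for a learned relevance model; an assumption for illustration.
    """
    query_tokens = set(query.lower().split())
    doc_tokens = set(document.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & doc_tokens) / len(query_tokens)


def rerank(query, candidates, score_fn=token_overlap_score, top_k=None):
    """Return (score, document) pairs ordered from most to least relevant."""
    scored = [(score_fn(query, doc), doc) for doc in candidates]
    # The sort is stable, so ties keep their first-stage retrieval order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored if top_k is None else scored[:top_k]


candidates = [
    "quarterly budget review notes",
    "weekly meeting agenda",
    "meeting report with action items",
]
for score, doc in rerank("meeting action items", candidates, top_k=2):
    print(f"{score:.2f}  {doc}")
```

Because the scorer is passed in as score_fn, a model-based relevance function could replace the placeholder without changing the reordering logic.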
In conclusion, containers backed by fractional slices of an accelerator are an ideal solution for quickly testing applications, deploying a model, or processing multiple tasks simultaneously.