
KAYTUS presents enhanced MotusAI to accelerate LLM delivery

Optimized inference performance, tool compatibility, resource scheduling, and system stability to accelerate the deployment of large AI models.

KAYTUS, a leading provider of end-to-end AI and liquid cooling systems, today announced the release of the latest version of its MotusAI AI DevOps platform at ISC High Performance 2025. The enhanced MotusAI delivers significant improvements in large-model inference performance and is broadly compatible with the open-source tools that span the entire large-model lifecycle. Designed for unified and dynamic resource scheduling, the platform markedly improves resource utilization and operational efficiency in the development and deployment of large-scale AI models. This latest version of MotusAI will further accelerate AI adoption and drive business innovation in critical sectors such as education, finance, energy, automotive, and manufacturing.

As large-scale AI models are increasingly integrated into real-world applications, companies are deploying them at scale to generate tangible value across many different sectors. Yet, many organizations continue to face significant challenges in adopting AI, including long deployment cycles, strict stability requirements, fragmented open-source tool management, and low compute resource utilization. To address these pain points, KAYTUS has launched the latest version of its MotusAI AI DevOps platform, specifically designed to streamline AI deployment, improve system stability, and optimize the efficiency of AI infrastructure for running large models.

Improved inference performance to ensure service quality

Deploying AI inference services is a complex undertaking that spans provisioning, management, and continuous health monitoring. These tasks demand rigorous model and service governance, performance optimization through acceleration frameworks, and long-term service stability, all of which typically require significant investments in personnel, time, and technical expertise.

The updated MotusAI offers robust capabilities for deploying large models, balancing transparency and performance. By integrating optimized inference frameworks such as SGLang and vLLM, MotusAI delivers high-performance distributed inference services that enterprises can deploy quickly and reliably. Designed to support models with very large parameter counts, MotusAI uses intelligent resource and network affinity scheduling to cut startup time while maximizing hardware utilization. Built-in monitoring spans the entire stack, from hardware and platform to pods and services, providing automatic fault diagnosis and rapid service recovery. MotusAI also scales inference workloads dynamically based on real-time usage and resource monitoring, ensuring stable service quality.
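
For illustration, the sketch below shows how such a distributed inference service is typically stood up with vLLM, one of the frameworks MotusAI integrates. The model name, tensor-parallel degree, and sampling settings are placeholder assumptions, not MotusAI defaults.

```python
# Minimal sketch of vLLM-based distributed inference of the kind MotusAI
# integrates. Model name and settings are illustrative, not MotusAI defaults.
from vllm import LLM, SamplingParams

# tensor_parallel_size shards the model across GPUs on one node; in MotusAI,
# the platform's affinity scheduler would decide placement.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct", tensor_parallel_size=2)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Explain network affinity scheduling in one paragraph."], params)
print(outputs[0].outputs[0].text)
```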

Comprehensive support for tools to accelerate AI adoption

As AI model technologies advance rapidly, the ecosystem of supporting development tools is growing ever more complex. Developers need an optimized, unified platform to efficiently select, deploy, and operate these tools.

The updated MotusAI offers comprehensive support for a range of leading open-source tools, enabling enterprise users to configure and manage their model development environments as needed. With integrated tools such as LabelStudio, MotusAI speeds up data annotation and cross-category data synchronization, improves data processing efficiency, and shortens model development cycles. MotusAI also provides an integrated toolchain covering the entire AI model lifecycle: LabelStudio and OpenRefine for data annotation and management, LLaMA-Factory for fine-tuning large models, Dify and Confluence for building large-model applications, and Stable Diffusion for text-to-image generation. Together, these tools let users deploy large models quickly and greatly increase development productivity.
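
As a concrete example of one link in that toolchain, the following hedged sketch uses the open-source Label Studio Python SDK to create an annotation project and import a task. The server URL, API key, and labeling config are placeholders; how MotusAI wires the tool in internally is not public.

```python
# Hedged sketch: driving Label Studio (one of the integrated tools) via its
# open-source Python SDK. URL, API key, and label config are placeholders.
from label_studio_sdk import Client

ls = Client(url="http://labelstudio.example.com", api_key="YOUR_API_KEY")
project = ls.start_project(
    title="LLM fine-tuning corpus",
    label_config="""
    <View>
      <Text name="prompt" value="$text"/>
      <Choices name="quality" toName="prompt">
        <Choice value="keep"/>
        <Choice value="discard"/>
      </Choices>
    </View>
    """,
)
# Import raw samples for annotation; each dict maps to the $text variable above.
project.import_tasks([{"text": "Example prompt to be reviewed."}])
```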

Hybrid training-inference scheduling on the same node to maximize resource efficiency

Efficient use of computing resources remains a high priority for AI startups and small to medium-sized enterprises in the early stages of AI adoption. Traditional AI clusters typically allocate compute nodes separately for training and inference tasks, limiting the flexibility and efficiency of resource scheduling for both types of workloads.

The updated MotusAI overcomes these limitations by enabling hybrid scheduling of training and inference workloads on a single node, allowing seamless integration and dynamic orchestration of the two task types. MotusAI incorporates advanced GPU scheduling capabilities and supports on-demand resource allocation, letting users distribute GPU resources efficiently according to each task's workload. MotusAI also offers multi-dimensional GPU scheduling, including fine-grained partitioning and support for Multi-Instance GPU (MIG), covering a wide range of use cases in model development, debugging, and inference.
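
For a sense of what fine-grained GPU partitioning looks like in practice, the sketch below requests a single NVIDIA MIG slice for an inference pod via the standard Kubernetes Python client. The MIG profile, container image, and namespace are assumptions that depend on how the cluster's GPU operator is configured; MotusAI's own scheduler interface is not public.

```python
# Hedged sketch: requesting a fine-grained GPU slice (NVIDIA MIG) for an
# inference pod with the Kubernetes Python client. The MIG profile name,
# image, and namespace are placeholders, not MotusAI specifics.
from kubernetes import client, config

config.load_kube_config()
pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="mig-inference-demo"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="inference",
                image="vllm/vllm-openai:latest",
                resources=client.V1ResourceRequirements(
                    # One 1g.10gb MIG slice instead of a whole GPU.
                    limits={"nvidia.com/mig-1g.10gb": "1"},
                ),
            )
        ],
    ),
)
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```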

MotusAI’s improved scheduler significantly outperforms the community version, delivering a 5x increase in task throughput and a 5x reduction in latency for large-scale pod deployments. It enables rapid environment startup and deployment across hundreds of pods while supporting dynamic workload scaling and tidal scheduling between training and inference. These capabilities enable seamless task coordination in a wide variety of real-world AI scenarios.
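
Tidal scheduling itself reduces to a simple feedback rule: shift GPUs toward inference when serving traffic peaks and back to training when it ebbs. The toy sketch below illustrates that logic only; the thresholds, pool model, and metric source are assumptions, not MotusAI internals.

```python
# Purely illustrative sketch of "tidal" scheduling: move GPUs between an
# inference pool (busy daytime) and a training pool (idle nighttime).
# Thresholds and pool structure are assumptions, not MotusAI internals.
from dataclasses import dataclass

@dataclass
class PoolState:
    inference_gpus: int
    training_gpus: int

def rebalance(state: PoolState, inference_util: float) -> PoolState:
    """Move one GPU toward whichever pool is under pressure."""
    if inference_util > 0.8 and state.training_gpus > 0:
        # Daytime tide: reclaim a GPU from training for serving traffic.
        return PoolState(state.inference_gpus + 1, state.training_gpus - 1)
    if inference_util < 0.3 and state.inference_gpus > 1:
        # Nighttime tide: hand an idle GPU back to training jobs.
        return PoolState(state.inference_gpus - 1, state.training_gpus + 1)
    return state

print(rebalance(PoolState(inference_gpus=4, training_gpus=4), inference_util=0.9))
```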

