MPI Support

What is MPI?

MPI (Message Passing Interface) is a standardized, portable message-passing specification that lets programs run in parallel on distributed-memory architectures. The standard defines bindings for C and Fortran; C++ programs use the C bindings, and Python programs typically use a wrapper such as mpi4py. It is the de facto standard for parallel programming in high-performance computing (HPC).

Common Use Cases for MPI: MPI is widely used in scientific and engineering fields for computationally intensive tasks, including:

  • Scientific Simulations: Climate modeling, molecular dynamics, fluid dynamics, and astrophysical simulations.
  • Numerical Analysis: Solving large systems of equations, matrix operations, and optimization problems.
  • Data Processing: Large-scale data analysis, image processing, and distributed machine learning (for example, data-parallel training that relies on collective operations such as all-reduce).
  • High-Performance Computing (HPC): Any workload requiring tight coupling between processes and efficient communication across multiple compute nodes.

Running MPI Workloads with Shoc Platform

Shoc Platform provides robust support for running your MPI workloads seamlessly on your attached Kubernetes clusters. This enables you to leverage the scalability and resource management capabilities of Kubernetes for your tightly coupled parallel applications.

Prerequisite: MPI Operator

To orchestrate MPI workloads through Shoc Platform, your Kubernetes cluster must have the MPI Operator installed. The MPI Operator is a Kubernetes-native controller from the Kubeflow ecosystem that makes it easier to run MPI jobs.

Kubernetes Cluster Requirement: Ensure your Kubernetes cluster is online and correctly configured with Shoc Platform as described in the Clusters page before proceeding with the MPI Operator installation.

MPI Operator Installation

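You can install the MPI Operator on your Kubernetes cluster by applying its manifest from the kubeflow/mpi-operator GitHub repository with kubectl. The command below applies the manifest from the master branch, which tracks the latest development version; to install a specific release instead, replace master in the URL with that release's tag.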

kubectl apply --server-side -f https://raw.githubusercontent.com/kubeflow/mpi-operator/master/deploy/v2beta1/mpi-operator.yaml

This command will deploy the necessary Custom Resource Definitions (CRDs) and controller for the MPI Operator into your Kubernetes cluster, allowing Shoc Platform to submit and manage MPI-specific job types.
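
After applying the manifest, it can be useful to confirm that the operator is actually running before submitting jobs. The following is a minimal sketch, not an official procedure. It wraps the install command above so the Git ref can be pinned, then checks for the CRD and the controller. The function names are hypothetical helpers. The CRD name mpijobs.kubeflow.org, the deployment name mpi-operator, and the namespace mpi-operator are assumptions based on the upstream manifest and should be checked against the version you install.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative and not part of Shoc Platform.

# Build the manifest URL for a Git ref (branch or release tag).
# Assumption: the manifest sits at the same path on that ref as on master.
mpi_operator_manifest_url() {
  local ref="${1:-master}"
  echo "https://raw.githubusercontent.com/kubeflow/mpi-operator/${ref}/deploy/v2beta1/mpi-operator.yaml"
}

# Apply the manifest for the given ref, mirroring the command shown above.
install_mpi_operator() {
  kubectl apply --server-side -f "$(mpi_operator_manifest_url "${1:-master}")"
}

# Check that the MPIJob CRD exists and the controller has rolled out.
# Assumptions: CRD "mpijobs.kubeflow.org", deployment "mpi-operator",
# default namespace "mpi-operator" (override with the first argument).
verify_mpi_operator() {
  local namespace="${1:-mpi-operator}"

  # Without the CRD, the cluster cannot accept MPIJob resources at all.
  if ! kubectl get crd mpijobs.kubeflow.org >/dev/null 2>&1; then
    echo "MPIJob CRD not found" >&2
    return 1
  fi

  # Wait up to two minutes for the controller deployment to become ready.
  if ! kubectl rollout status deployment/mpi-operator \
        --namespace "$namespace" --timeout=120s >/dev/null 2>&1; then
    echo "MPI Operator controller is not ready in namespace $namespace" >&2
    return 1
  fi

  echo "MPI Operator is installed and ready"
}
```

After sourcing these functions, running install_mpi_operator master followed by verify_mpi_operator reproduces the install above and reports whether the controller is ready. Passing a release tag instead of master pins the installation to that release.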

Further Documentation

For detailed information on the MPI Operator, its features, and advanced usage, please refer to the official Kubeflow documentation:

For a deeper understanding of MPI programming and its concepts, you may find the following resources helpful:
