Phoenix v1.6

Alexander Fandos
Software Engineer @ Midokura

We are pleased to announce the v1.6 release of Phoenix.

Overview

The updated Operator Reference sheet and release notes are included with this message. They describe the revised steps, configuration details, and changes for provisioning and managing a Phoenix cluster under the new release.

The new Kubernetes cluster creation feature is still being finalized and has not yet gone through fully automated end-to-end testing. It has been tested manually, however, and a demonstration video of a successful run is available:

https://drive.google.com/file/d/1ElYFXWPpnZtXQFolbgORZI4WPTb2K3z_/view

Thank you to the entire team for making this release possible!

Have a great week!

Features

Fully automated Hedgehog provisioning

Network provisioning for Hedgehog is now fully automated through the setup-platform script using the --bootstrap option. This process installs Hedgehog on the bastion machine and automatically configures the required network fabric, removing the need for several previously manual steps.
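
As a minimal sketch, the bootstrap run could be wrapped in a small Python helper. The script location (./setup-platform) and the helper names are assumptions made for illustration; only the --bootstrap option comes from these release notes.

```python
import subprocess
from pathlib import Path

# Assumed location of the script; adjust to where it lives in your environment.
SETUP_SCRIPT = Path("./setup-platform")


def build_bootstrap_command(script: Path = SETUP_SCRIPT) -> list[str]:
    """Return the argument list for a bootstrap run of setup-platform."""
    # Resolve to an absolute path so the script is not looked up on PATH.
    return [str(script.resolve()), "--bootstrap"]


def run_bootstrap(script: Path = SETUP_SCRIPT) -> int:
    """Run the bootstrap and return the script's exit code."""
    if not script.is_file():
        raise FileNotFoundError(f"setup-platform script not found at {script}")
    # check=False leaves the handling of a non-zero exit code to the caller.
    return subprocess.run(build_bootstrap_command(script), check=False).returncode
```

Returning the exit code rather than raising lets the operator decide how to react to a failed bootstrap, for example by collecting logs before retrying.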

Kubernetes cluster creation with GPU support

Kubernetes clusters can now be created on GPU-enabled hypervisors, allowing clusters to run on infrastructure that provides GPU resources.

Operator reference

This is the reference sheet for Phoenix v1.6, an end-to-end solution to operate private, multi-tenant AI factories. Below, operators will find an overview of the materials, infrastructure, and other requirements, along with an entry point to the procedure for provisioning and configuring the system.

Please contact support@midokura.com for more information.

System requirements

Note: the documentation files referenced here are provided in the downloadable Documentation bundle described in the Environment setup section.

  • Before proceeding, operators are expected to ensure that the underlying infrastructure meets the system requirements listed below.
  • Operating system requirements for the OpenStack control nodes are available in the documentation file ./service-operator/OS_REQUIREMENTS.md.
    • The base operating system for the OpenStack controllers should be Ubuntu 24.04.
  • Operators are expected to set up their hardware according to our official Blueprint, specifically with regard to network configuration and port and interface assignment.
  • Storage. Operators are expected to provide a Ceph cluster, integrated into the infrastructure as defined in the Blueprint. See the Environment setup section for more details.
  • Set up a new Google Application that will be used as an SSO provider for the IaaS service. To follow this process, consult the ./service-operator/GOOGLE_SSO_SETUP.md file in the documentation bundle described below.
  • Set up credentials for the private registry at ghcr.io/midokura. We will provide you with this token via secure means, and it will be required during the control plane installation process (see ./service-operator/GHCR_AUTHENTICATION.md for more information).
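
The base operating system requirement can be checked with a short script. The sketch below is not part of the Phoenix tooling: it assumes the standard /etc/os-release file found on Ubuntu hosts, and the function names are hypothetical.

```python
def parse_os_release(text: str) -> dict[str, str]:
    """Parse the KEY=value lines of an os-release file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def is_expected_base_os(text: str) -> bool:
    """Return True when the os-release content describes Ubuntu 24.04."""
    fields = parse_os_release(text)
    return fields.get("ID") == "ubuntu" and fields.get("VERSION_ID") == "24.04"
```

On a control node, the content to check would typically be read with `Path("/etc/os-release").read_text()` and passed to `is_expected_base_os`.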

Overview

The sections below provide references to the materials required for the provisioning process, which is run from the Bastion node shown in the blueprint. At a high level, the process is based on a bundle of Ansible playbooks that install and configure all components in the control plane.

Environment setup

To install the Phoenix cluster, the Operator will work from the bastion node shown in the blueprint. The materials below must be available on the node before proceeding with the installation.

  1. Create a new directory ./phoenix. It will store the artefacts and playbooks. All commands and paths in this document are relative to this directory.
  2. Download and extract the Documentation bundle. Other sections of this document refer to the documentation files it contains.
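
The two steps above can be sketched in Python. The archive format (a tar archive) and the helper name are assumptions, since this document does not state how the Documentation bundle is packaged; adapt the extraction call if the bundle is delivered in another format.

```python
import tarfile
from pathlib import Path


def prepare_workspace(bundle: Path, workdir: Path = Path("./phoenix")) -> Path:
    """Create the working directory and extract the bundle into it."""
    workdir.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter where this Python version supports it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(bundle) as archive:
        archive.extractall(workdir, **kwargs)
    return workdir
```

Calling `prepare_workspace(Path("<path to the downloaded bundle>"))` creates ./phoenix if needed and leaves an existing directory in place, so the step can be repeated safely.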

Control plane installation

  • Prepare the Ceph cluster by following the steps explained in the documentation file ./service-operator/CEPH_SETUP.md.
  • Download and extract the Ansible playbooks.
  • Use the included inventory.example.yml as the base for the configuration specific to your cluster.
  • Execute the playbooks following the instructions in ./service-operator/DEPLOYMENT.md.
  • To configure the switches, follow the instructions in ./service-operator/NETWORK_CONTROL_NODE_SETUP.md, starting from step 4.
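
One way to start from inventory.example.yml without losing earlier edits is to copy it to a working file only when that file does not exist yet. This is a sketch under assumptions: the target file name (for example inventory.yml) and the helper name are hypothetical, and ./service-operator/DEPLOYMENT.md remains the authoritative description of the inventory to use.

```python
import shutil
from pathlib import Path


def create_inventory(example: Path, target: Path) -> bool:
    """Copy the example inventory to target unless target already exists.

    Returns True when a new file was created and False when an existing
    inventory was left untouched.
    """
    if target.exists():
        return False
    shutil.copyfile(example, target)
    return True
```

Refusing to overwrite an existing file keeps cluster-specific values safe if the step is run a second time.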

IaaS Console - Tenant and User configuration

To create additional admin users and to register tenants and tenant users, refer to the instructions in ./service-operator/IAAS_CONSOLE_CONFIGURATION.md.
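
Since every procedure in this sheet points to a file in the documentation bundle, it can help to confirm that all of them are present before starting. The sketch below lists the files named in this document, relative to the ./phoenix directory; the function name is hypothetical and the check is not part of the Phoenix tooling.

```python
from pathlib import Path

# Documentation files referenced in this reference sheet.
REFERENCED_DOCS = [
    "service-operator/OS_REQUIREMENTS.md",
    "service-operator/GOOGLE_SSO_SETUP.md",
    "service-operator/GHCR_AUTHENTICATION.md",
    "service-operator/CEPH_SETUP.md",
    "service-operator/DEPLOYMENT.md",
    "service-operator/NETWORK_CONTROL_NODE_SETUP.md",
    "service-operator/IAAS_CONSOLE_CONFIGURATION.md",
]


def missing_docs(root: Path) -> list[str]:
    """Return the referenced documentation files not found under root."""
    return [name for name in REFERENCED_DOCS if not (root / name).is_file()]
```

Running `missing_docs(Path("."))` from inside ./phoenix returns an empty list when the bundle was extracted completely.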