mathworks/simulation-platform

Simulation Platform

The simulation-platform repository contains documentation on how to set up and manage a simulation cluster on your cloud provider.

Introduction

The simulation cluster provides the capability to run large-scale simulations on cloud infrastructure using a Kubernetes® cluster and cloud-native components. Once the cluster is set up, users can submit simulation jobs to it using the Large-scale cloud simulation for Simulink® support package for the MATLAB® software as a client.

Using your cloud-provider account, the simulation cluster provisions a group of cloud resources, open-source components and default configuration to stand up the necessary infrastructure.

mwcsim

mwcsim is a command line tool to easily manage simulation clusters on a cloud provider. It provides convenience functionality to create, delete and manage a cluster. Advanced users are encouraged to modify the Terraform® file and Kubernetes manifests underlying the cluster and apply them to get the customized cluster you need, especially for massive simulation needs or custom integration requirements with existing systems.

Prerequisites

Currently the simulation cluster is only supported on the Amazon Web Services (AWS)™ cloud provider. The mwcsim CLI is only supported on the Linux® operating system.

Before proceeding, you can test the prerequisites with the following commands, which exercise the AWS CLI, the OpenTofu CLI (tofu), and Docker. In particular, confirm that you can access your AWS account with the AWS CLI.

aws configure
tofu version
docker version
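
Before running the commands above, it can help to confirm that each tool is on your PATH at all. The following is a minimal sketch, not part of mwcsim; the function name missing_tools is hypothetical and chosen for illustration only.

```shell
# Hypothetical helper (not part of mwcsim): print every tool named in the
# arguments that cannot be found on PATH, one per line.
# Returns 0 when all tools are found and 1 when at least one is missing.
missing_tools() {
  status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# Example call (commented out so the sketch has no side effects):
# missing_tools aws tofu docker || echo "install the tools listed above" >&2
```

This only checks that the executables exist. It does not verify credentials, so running aws configure and confirming account access is still needed.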

AWS security requirements

Cluster creation from scratch

If the simulation cluster is being created from scratch, we recommend a dedicated clean AWS account to minimize potential conflicts with existing resources or quota limits. The AWS managed PowerUserAccess security policy is required for the user creating the cluster. More details about the AWS managed policies for job function can be read here.

Cluster creation on existing resources

This feature is currently unavailable and is coming soon. It will allow a simulation cluster to be created with reduced permission requirements using existing resources.

Installation

The easiest method is to pull the csim-mwcsim container, which brings all the configuration files along with the CLI.

docker pull containers.mathworks.com/simulation-platform/csim-mwcsim:v0.5.32

Install the CLI with the following commands. Note that the mounted directory should remain /tmp/mwcsim for a smooth installation. The install.sh command may prompt for sudo access.

mkdir -p /tmp/mwcsim
docker run --mount type=bind,source=/tmp/mwcsim,target=/mount containers.mathworks.com/simulation-platform/csim-mwcsim:v0.5.32
/tmp/mwcsim/install.sh

This installs the CLI into /usr/local/bin and the configuration files to $HOME/.mwcsim, which is the default working directory of mwcsim. You can test the installation by checking the version.

mwcsim version

Getting started

This section outlines quick introductory setups to get a cluster ready. Advanced usage documentation is further below.

For all commands below, my-cluster, my-username, mydomain.example.com, and loadbalancer-endpoint are placeholders that should be replaced by your desired values. If you encounter any issues, check the known-issues page as well. Many problems are caused by resource provisioning delays, and a simple retry after a few minutes may resolve them.

Manually provisioned domain

This approach requires access to DNS and the ability to assign a domain to the cluster. In the example below, mydomain.example.com is the domain where the cluster will be reachable by its clients. The domain needs to be known beforehand, but the DNS records are set midway through the setup, after the cluster load balancer endpoint is provisioned.

STEP 1

Make sure AWS access is set up.

aws configure

STEP 2

Create the stable components of the cluster first, with support for MATLAB version R2024bUpdate2 clients. If you encounter issues, you can safely rerun the create command. Occasionally, resource provisioning limits or delays cause an error.

mwcsim create cluster my-cluster --domain mydomain.example.com -v R2024bUpdate2 -x stableblock

STEP 3

Obtain the load balancer endpoint, referred to below as loadbalancer-endpoint.

mwcsim get cluster my-cluster -o yaml | grep loadbalancer

STEP 4

Manually provision the domains with your DNS provider.

# the following records are necessary (all 4 point to the same loadbalancer endpoint)
#   CNAME mydomain.example.com -> loadbalancer-endpoint
#   CNAME idp.mydomain.example.com -> loadbalancer-endpoint
#   CNAME oauth.mydomain.example.com -> loadbalancer-endpoint
#   CNAME registry-access.mydomain.example.com -> loadbalancer-endpoint

STEP 5

Create the second half of the cluster components, which depend on the domain. If you encounter issues, you can safely rerun the create command. Occasionally, DNS propagation and HTTPS certificate delays cause an error.

mwcsim create cluster my-cluster -x dynamicblock

STEP 6

Create a first user. This returns an auto-generated password. Note the username and password.

mwcsim create user my-username -c my-cluster

STEP 7

Get the cluster endpoint and note it down.

mwcsim get cluster my-cluster

Using the cluster

Once the cluster is ready, use the MATLAB client documentation to send large-scale simulation jobs to the cluster, using the cluster endpoint and user credentials from above.
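
The four CNAME records listed in STEP 4 of the manually provisioned domain procedure all follow the same pattern, so they can be generated from the domain and the load balancer endpoint. The sketch below only prints the records in the notation used in this document. It does not talk to any DNS provider, and the function name cname_records is hypothetical.

```shell
# Hypothetical helper (not part of mwcsim): print the four CNAME records
# required for a manually provisioned domain.
#   $1: cluster domain, for example mydomain.example.com
#   $2: load balancer endpoint reported by mwcsim
cname_records() {
  domain=$1
  target=$2
  # The empty prefix yields the record for the cluster domain itself.
  for prefix in "" idp. oauth. registry-access.; do
    printf 'CNAME %s%s -> %s\n' "$prefix" "$domain" "$target"
  done
}

# Example call (commented out so the sketch has no side effects):
# cname_records mydomain.example.com loadbalancer-endpoint
```

The printed lines can be used as a checklist when entering the records with your DNS provider.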

Automatically provisioned domain (beta)

mwcsim can automatically provision domains as an experimental feature. This feature is only supported on versions R2024bUpdate3 and later. Automatically provisioned domains don't require DNS access or expertise. Contact us if you're interested in learning more and using it.

STEP 1

Make sure AWS access is set up.

aws configure

STEP 2

Create the cluster.

mwcsim create cluster my-cluster

STEP 3

Create a first user. This returns an auto-generated password. Note the username and password.

mwcsim create user my-username -c my-cluster

STEP 4

Get the cluster endpoint and note it down.

mwcsim get cluster my-cluster

Using the cluster

Once the cluster is ready, use the MATLAB client documentation to send large-scale simulation jobs to the cluster, using the cluster endpoint and user credentials from above.

Create and destroy simulation clusters

For all commands below, my-cluster and my-region are placeholders that should be replaced by your desired values.

Get help on the mwcsim command

You can get help on the root mwcsim command or any of its sub-commands as shown below.

mwcsim -h
mwcsim create -h
mwcsim create cluster -h

Create a simulation cluster

Create a cluster in the default region (us-east-1) with a given name.

mwcsim create cluster my-cluster

Create a cluster in a specific region.

mwcsim create cluster my-cluster -r my-region -v R2024bUpdate2

Note the -v flag for the cluster version. The currently supported versions are R2024bUpdate2 and R2024bUpdate3. The version is relevant as the cluster only supports the corresponding MATLAB clients of the same version or later.
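
If cluster creation is scripted, the value passed to -v can be checked against the two versions listed above before any cloud resources are touched. This is a minimal sketch with a hypothetical function name; the list of versions is copied from this document and would need updating as new versions are supported.

```shell
# Hypothetical helper (not part of mwcsim): succeed only when $1 is one of
# the cluster versions this document lists as currently supported.
is_supported_version() {
  case "$1" in
    R2024bUpdate2|R2024bUpdate3) return 0 ;;
    *) return 1 ;;
  esac
}

# Example call (commented out so the sketch has no side effects):
# is_supported_version R2024bUpdate2 && mwcsim create cluster my-cluster -r my-region -v R2024bUpdate2
```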

Partially create a cluster

mwcsim create cluster my-cluster -x <operation>

The supported operations are:

"init"          initialize cluster in mwcsim backend
"tfplan"        run terraform plan
"tfapply"       run terraform apply
"kustomize"     build kustomization files
"kubeapply0"    apply wave 0 kubernetes resources
                  after "kubeapply0", the loadbalancer endpoint is available for DNS records
"domain"        provision domain automatically if not manually provisioned
"kubeapply1"    apply wave 1 kubernetes resources
"configure"     configure kubernetes resources
                  after "configure", the cluster should be ready to use
  ---
"all"           run all operations above (top > bottom) except tfplan
                  only successful with auto-provisioned domain
"stableblock"   run "init", "tfapply", "kustomize" and "kubeapply0" operations
                  this block of operations do not require a valid domain
"dynamicblock"  run "domain", "kubeapply1", "configure" operations
                  this block of operations require a valid domain, if manually provisioned
(default "all")

All operations above can be safely rerun if there are any issues during setup, so you can remedy the issue (such as exceeding max number of VPC or IP slots available) on the cloud-provider and retry where you left off. The creation process proceeds from the top to bottom until the cluster is ready. You can resume an operation at any point.

The partial setup is especially useful if you want to modify cluster terraform files or kubernetes files midway, after certain resources are provisioned or details are available.
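
Because the operations can be safely rerun, a simple retry wrapper is one way to ride out provisioning delays. The sketch below is an assumption about how you might script this, not something mwcsim provides; the function name retry and the attempt count and delay in the example are illustrative.

```shell
# Hypothetical helper (not part of mwcsim): run a command until it succeeds
# or the attempt limit is reached.
#   $1: maximum number of attempts
#   $2: seconds to sleep between failed attempts
#   remaining arguments: the command to run
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    echo "attempt $n failed, retrying in ${delay}s" >&2
    sleep "$delay"
    n=$((n + 1))
  done
}

# Example call (commented out so the sketch has no side effects):
# retry 3 300 mwcsim create cluster my-cluster -x stableblock
```

A retry only helps with transient delays. If the failure is a quota limit, such as the VPC limit mentioned above, fix it on the cloud provider before rerunning.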

Get simulation clusters

Get a list of current simulation clusters.

mwcsim get cluster

# example output
NAME        ENDPOINT                  REGION     STORAGE                   STATE       VERSION
my-cluster  mycluster.example.com     us-east-2  csim-my-cluster-3iaopg0e  ITK0D--(-)  R2024bUpdate2
cluster2    ecgyp7msgpj7r1.mwcsim.io  us-east-1  csim-cluster2-bzdxbn4f    ITK0D1C(R)  R2024bUpdate3

Get a specific cluster.

mwcsim get cluster my-cluster

Output details in JSON or YAML.

mwcsim get cluster -o json
mwcsim get cluster my-cluster -o yaml
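
For scripting, a single field can be pulled out of the tabular listing shown in the example output above. The sketch below assumes the table layout matches that example (a header row starting with NAME and whitespace-separated columns with no spaces inside a value). The function name cluster_field is hypothetical. The JSON or YAML output is likely a more robust source, but its schema is not shown in this document, so it is not used here.

```shell
# Hypothetical helper (not part of mwcsim): read a cluster table on stdin and
# print one column for one cluster.
#   $1: cluster name as shown in the NAME column
#   $2: column header, for example ENDPOINT or STATE
cluster_field() {
  awk -v name="$1" -v field="$2" '
    # Header row: remember which column number carries the requested header.
    $1 == "NAME" { for (i = 1; i <= NF; i++) if ($i == field) col = i; next }
    # Data row for the requested cluster: print that column.
    col && $1 == name { print $col }
  '
}

# Example call (commented out so the sketch has no side effects):
# mwcsim get cluster | cluster_field my-cluster ENDPOINT
```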

Meaning of the state field

The state field is a shorthand for the current state of the cluster. You can use it as a guide to understand which operations to perform when partially creating or destroying a cluster. The states are summarized as follows:

(I)nitialized      mwcsim knows about the cluster and tracks it, but no cloud provider resources are provisioned
(T)erraform        analogous to "terraform apply", cloud resources are provisioned 
(K)ustomize        similar to step before "kustomize build", where the cluster specific kustomization files are generated
KubernetesWave(0)  analogous to "kubectl apply", for certain subset of objects that are domain independent
(D)omain           step where a cluster domain is provisioned, this can be automatically or manually done
KubernetesWave(1)  analogous to "kubectl apply", for the rest of objects that are domain dependent
(C)onfigured       configure identity provider and other microservices
(R)eady            cluster is ready for use

So a cluster with state I------(-) signifies that it's only in the initialized state, with no terraform or kubernetes resources applied (hence no cloud resources provisioned, except for the backend storage of mwcsim itself). However, if the tfapply operation was attempted but failed, this state can still show while some partial resources are provisioned, as the (T)erraform state is only set after successful completion of the tfapply operation.

A cluster in the ITK0D1C(R) state is fully provisioned and ready to use.

As another example, a cluster in the ITK----(-) state has the terraform resources applied and the cluster-specific kustomization files generated, but no kubernetes resources are created or configured yet.

Note that the cluster state field is a guideline only. It is not the source of truth for the cluster state. mwcsim uses the actual cluster resources as a source of truth. So the state field might be inaccurate depending on actions or circumstances outside of the mwcsim usage (for example, someone deleted a resource manually).
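
The state string can also be read programmatically to suggest where to resume a partial creation. The sketch below assumes that the seven flag positions line up one-to-one with the creation operations init, tfapply, kustomize, kubeapply0, domain, kubeapply1 and configure. That mapping is inferred from the two lists in this document rather than stated by mwcsim, and the function name next_operation is hypothetical.

```shell
# Hypothetical helper (not part of mwcsim): print the first creation
# operation whose flag is still "-" in a state string such as "ITK0D--(-)".
# Prints "none" when no "-" is found in the seven flag positions.
next_operation() {
  state=$1
  i=1
  for op in init tfapply kustomize kubeapply0 domain kubeapply1 configure; do
    # Take the i-th character of the state string.
    flag=$(printf '%s' "$state" | cut -c "$i")
    if [ "$flag" = "-" ]; then
      echo "$op"
      return 0
    fi
    i=$((i + 1))
  done
  echo none
}

# Example call (commented out so the sketch has no side effects):
# mwcsim create cluster my-cluster -x "$(next_operation 'ITK0D--(-)')"
```

Since the state field is a guideline only, treat the result as a hint and check the actual cluster resources when in doubt.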

Delete a simulation cluster

Delete a cluster and all associated resources, except the data.

mwcsim delete cluster my-cluster

If the storage location is not empty, the command above exits with an error at the end, leaving the storage location as the one remaining resource.

If you want to delete the data, you can do so with a flag.

mwcsim delete cluster my-cluster --delete-data

WARNING: The data is deleted permanently and is unrecoverable.

Partial destruction of a cluster

As with the creation commands, you can partially delete the cluster as below.

mwcsim delete cluster my-cluster -x <operation>

The supported operations are:

specify granular operations to perform, must be one of:
  "tfplan"         run terraform plan
  "kubedelete1"    delete wave 1 kubernetes resources
  "releasedomain"  release domain if auto-provisioned
  "kubedelete0"    delete wave 0 kubernetes resources
  "deletedata"     delete storage contents (--delete-data flag also required)
  "tfdestroy"      run terraform destroy
  "deinit"         de-initialize cluster from mwcsim backend
  "all"            run all operations (top > bottom) except tfplan
(default "all")

All operations above can be rerun if there are any issues during deletion, so you can remedy the issue (such as cannot remove non-empty storage) and retry where you left off.

Configure the simulation cluster

Get the simulation cluster endpoint

Once a cluster is created, you can use the endpoint returned by mwcsim to submit jobs and access the cluster administration tools. To provide your own domain, follow the manually provisioned domain steps in the Getting started section above.

mwcsim get cluster my-cluster

Output:

# example output
NAME        ENDPOINT                  REGION     STORAGE                   STATE       VERSION
my-cluster  mycluster.example.com     us-east-2  csim-my-cluster-3iaopg0e  ITK0D--(-)  R2024bUpdate2

The main endpoint is used to submit jobs to the cluster from clients. The admin endpoint is on another URL similar to the endpoint and is used to manage the cluster and users. The admin endpoints are different depending on the version of the cluster, as outlined below.

#admin endpoint
FORMAT           EXAMPLE                     VERSION
idp.<endpoint>   idp.mycluster.example.com   R2024bUpdate2
<endpoint>/auth  mycluster.example.com/auth  R2024bUpdate3
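
The table above can be expressed as a small lookup so that scripts derive the admin endpoint from the cluster endpoint and version. This is a minimal sketch with a hypothetical function name. It covers only the two versions listed in the table and reports an error for anything else rather than guessing.

```shell
# Hypothetical helper (not part of mwcsim): print the admin endpoint for a
# cluster, following the format table in this document.
#   $1: cluster endpoint, for example mycluster.example.com
#   $2: cluster version, for example R2024bUpdate2
admin_endpoint() {
  endpoint=$1
  version=$2
  case "$version" in
    R2024bUpdate2) echo "idp.$endpoint" ;;
    R2024bUpdate3) echo "$endpoint/auth" ;;
    *) echo "unknown cluster version: $version" >&2; return 1 ;;
  esac
}

# Example call (commented out so the sketch has no side effects):
# admin_endpoint mycluster.example.com R2024bUpdate3
```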

Admin credentials

After the cluster is successfully created, the admin credentials can be obtained as below.

mwcsim get user admin -c my-cluster --credential

Note that the automatically created admin user can only be used for cluster management. You'll need to create users to actually use the cluster.

Create users

A convenience function is provided to easily create a user.

mwcsim create user my-username -c my-cluster

For full user management features, access the admin endpoint. The cluster uses the Keycloak® software by default to manage users and authentication. You can manage users under the mwcsim realm created for the cluster using the Keycloak instructions here. Make sure to create and manage users in the mwcsim realm and not the master realm. Users created in the master realm will not work.

Optional single sign-on (SSO)

If you want to connect to an external identity provider and use SSO to log into the simulation cluster, you can use the Keycloak admin interface to set it up. Instructions can be found in the Keycloak documentation.

Using the simulation cluster

Currently, only the MATLAB client is supported for working with the cluster. New clients will be added in the future.

Connect to the simulation cluster

Once you have the cluster endpoint and a user, you can connect to the cluster from MATLAB. Follow the instructions in the documentation for the MATLAB support package "Large-scale cloud simulation for Simulink" to run large-scale simulations.

Advanced Configuration

Instructions on advanced configuration of the underlying Terraform files and Kubernetes manifests can be found here.

Security

The default simulation cluster is publicly accessible by authenticated users via encrypted HTTPS. The simulation and model data are stored in a private S3™ bucket, which the workers read input data from and write output data to. It is recommended that you modify the Terraform files, following the advanced configuration instructions above, to set the VPC to a private peered network or to set a security group that restricts access to your VPN IP ranges.

Additionally, you can set up comprehensive security monitoring services, such as the AWS GuardDuty™ service, to monitor the cluster.

Support

Known issues and troubleshooting

Known issues and troubleshooting details can be found here.

Request enhancements

To suggest additional features or capabilities, see Request Reference Architectures.

Get technical support

If you require assistance, contact MathWorks Technical Support.

License

MATHWORKS CLOUD REFERENCE ARCHITECTURE LICENSE © 2024 The MathWorks, Inc.
