ConMan is used for connecting to remote consoles and collecting console logs. These node logs can then be used for various administrative purposes, such as troubleshooting node boot issues.
ConMan runs on the system in a set of containers within Kubernetes pods named cray-console-operator and cray-console-node.
The cray-console-operator and cray-console-node pods determine which nodes they should monitor by checking with the Hardware State Manager (HSM) service. They do this once when they start. If HSM has not discovered some nodes when the pods start, then HSM is unaware of those nodes and therefore so are the cray-console-operator and cray-console-node pods.
Verify that all nodes are being monitored for console logging and connect to them if desired.
See ConMan for other procedures related to remote consoles and node console logging.
This procedure can be run from any member of the Kubernetes cluster to verify node consoles are being managed by ConMan and to connect to a console.
NOTE: This procedure has changed since the CSM 0.9 release.
- (ncn-mw#) Find the cray-console-operator pod.
OP_POD=$(kubectl get pods -n services \
        -o wide|grep cray-console-operator|awk '{print $1}')
echo $OP_POD
Example output:
cray-console-operator-6cf89ff566-kfnjr
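The grep in the step above can match no pod or more than one pod (for example during a rollout), which would leave OP_POD empty or holding several names. A minimal sketch of a guard follows. The helper name `require_single_pod` is hypothetical and not part of CSM; it only inspects the text it is given on standard input.

```shell
# Hypothetical helper (not part of CSM): read pod names on stdin and
# succeed only if exactly one non-empty line is present.
require_single_pod() {
    # Drop blank lines; "|| true" keeps an empty input from aborting
    # the function when the caller uses "set -e".
    pods=$(grep -v '^[[:space:]]*$' || true)
    count=$(printf '%s\n' "$pods" | grep -c . || true)
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one pod name, found $count" >&2
        return 1
    fi
    printf '%s\n' "$pods"
}

# Possible usage with the pipeline from the step above:
# OP_POD=$(kubectl get pods -n services -o wide \
#     | grep cray-console-operator | awk '{print $1}' \
#     | require_single_pod) || echo "check the pod list by hand"
```

If the guard fails, listing the pods by hand with the kubectl command from the step shows which names matched.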
- (ncn-mw#) Find the cray-console-node pod that is connected to the node.
Be sure to substitute the actual component name (xname) of the node in the command below.
XNAME=<xname>
NODEPOD=$(kubectl -n services exec $OP_POD -c cray-console-operator -- sh -c "/app/get-node $XNAME" | jq .podname | sed 's/"//g')
echo $NODEPOD
Example output:
cray-console-node-2
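A common mistake in this step is leaving the `<xname>` placeholder unsubstituted. The sketch below checks the value before it is passed to the operator pod. The helper name `is_node_xname` is hypothetical, and the pattern is an assumption inferred from the node names shown in this procedure (such as x9000c0s1b0n0), not an official definition of the xname format.

```shell
# Hypothetical helper (not part of CSM): succeed only if the argument
# looks like a node xname of the form x<N>c<N>s<N>b<N>n<N>.
# The pattern is inferred from the example names in this procedure.
is_node_xname() {
    printf '%s\n' "$1" | grep -Eq '^x[0-9]+c[0-9]+s[0-9]+b[0-9]+n[0-9]+$'
}

# Possible usage before running the NODEPOD command:
# is_node_xname "$XNAME" || echo "XNAME does not look like a node xname: $XNAME" >&2
```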
- (ncn-mw#) Log into the cray-console-node container in this pod:
kubectl exec -n services -it $NODEPOD -c cray-console-node -- bash
Example output:
cray-console-node#
- Check the list of nodes being monitored.
conman -q
Output looks similar to the following:
x9000c0s1b0n0
x9000c0s20b0n0
x9000c0s22b0n0
x9000c0s24b0n0
x9000c0s27b1n0
x9000c0s27b2n0
x9000c0s27b3n0
- Compute nodes or UANs are automatically added to this list a short time after they are discovered.
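When many nodes are expected, comparing the list by eye is error-prone. The following is a minimal sketch of a check that reports which expected xnames are absent from the monitored list. The helper name `missing_consoles` is hypothetical; it assumes the list is supplied on standard input as whitespace-separated xnames, which is how the `conman -q` output appears in this procedure.

```shell
# Hypothetical helper (not part of CSM): print every expected xname
# (given as arguments) that does not appear in the monitored list
# read from stdin. Prints nothing when all are present.
missing_consoles() {
    # Normalize the list to one xname per line, whatever the spacing.
    monitored=$(tr -s '[:space:]' '\n')
    for xname in "$@"; do
        # -F: fixed string, -x: whole-line match, -q: no output.
        printf '%s\n' "$monitored" | grep -Fxq -- "$xname" \
            || printf '%s\n' "$xname"
    done
}

# Possible usage from within the cray-console-node container:
# conman -q | missing_consoles x9000c0s1b0n0 x9000c0s20b0n0
```

A node reported as missing may simply not have been discovered yet, so rerunning the check after a short wait is reasonable before investigating further.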
- To access the node's console, run the following command from within the pod. Again, remember to substitute the actual component name (xname) of the node.
conman -j <xname>
NOTE: The console session can be exited by entering &.