BOS v2 provides a components endpoint where BOS tracks the status of individual components. This includes information on the desired state and some information on the current state, status, and any session the component is part of.
Component records are created automatically and will include any components found in the Hardware State Manager (HSM).
Stores information on what BOS believes is the current boot artifacts the component is booted with. This is updated by the BOS state reporter, which runs on the booted node.
For more information, see BOS state reporter.
See Options for more information on setting the component_actual_state_ttl
option, which controls how long this data is valid if it is not updated.
Information stored in other locations, such as the current configuration which is stored in CFS, is not included here.
Stores information on the desired boot artifacts and configuration for a component. If the component is enabled, BOS will try to make the actual state match the desired state.
Stores information on the eventual desired boot artifacts and configuration for a component. This has no immediate impact, but is moved to the desired state when the applystaged
endpoint is called for the component.
If the node is enabled (enabled == True), BOS will take action to make the actual state match the desired state. If a node is disabled (enabled == False), BOS will take no action against a node. This is an internal state that BOS uses for tracking whether it has finished working on a node. Typically, users should never enable a node (enabled == True). BOS handles this during session creation. Even if the BOS session is deleted, the nodes remain enabled in BOS, and BOS will continue to take action to make the nodes' actual states match their desired states. Because of this, if an administrator wishes to stop BOS from taking such actions on a node, then they must disable it (enabled == False). Thus, while it is still uncommon, it is more likely that users will disable nodes than enable them.
Even if a node is disabled in BOS, if it is booted, then BOS may receive status updates for it. However, BOS will not issue power commands to the nodes while they are disabled. These status updates come from the bos-reporter
, small program running
periodically on the nodes, that updates the nodes' actual status.
Both BOS and the Hardware State Manager (HSM) use the term disabled, but not in a consistent fashion. When the HSM says a node is disabled, it is out of service. This definition should not be confused with BOS' definition of disabled.
If the most recent action taken has failed or BOS is unable to continue operating on a component, an error message will be stored here.
These fields store data from the period of time when BOS begins operating on a component until the component either reaches its desired state or hits its retry limit and is marked failed. The stored data includes the number of power on and power off attempts that have been tried for the component.
"event_stats": {
"power_off_forceful_attempts": 3,
"power_off_graceful_attempts": 1,
"power_on_attempts": 1
}
The number of attempts is limited by the default_retry_policy
found in BOS options endpoint. See Options for more information.
Tracks information on the most recent action that BOS took on a component. This can be a power action, such as powering_on
, or a management action such as newly_discovered
.
Stores the session ID of the session that is currently tracking the component.
This collection of fields stores status information that the BOS operators and other users can query to determine the status of the component. Status fields should generally not be manually updated and should be left to BOS. These fields include:
phase
- Describes the general phase of the boot process the component is currently in, such aspowering_on
,powering_off
andconfiguring
.status
- A more specific description of where in the boot process the component is. This can be more detailed phases, such aspower_on_pending
,power_on_called
, as well as final states such asfailed
.on_hold
is a special value that indicates BOS is re-evaluating the status of the component, such as when a component is re-enabled and BOS needs to collect new information from other services to determine the state of the component.status_override
- A special status field that is used to overridestatus
when BOS would be unable to determine the status of the node with its current information. This includes theon_hold
status.
To find the API versions of any commands listed, add -vvv
to the end of the CLI command, and the CLI will print the underlying call to the API in the output.
List all BOS components with the following command:
(ncn-mw#
)
cray bos v2 components list --format json
Example output:
[
{
"actual_state": {
"boot_artifacts": {
"initrd": "",
"kernel": "",
"kernel_parameters": ""
},
"bss_token": "",
"last_updated": "2022-08-22T09:17:16"
},
"desired_state": {
"boot_artifacts": {
"initrd": "",
"kernel": "",
"kernel_parameters": ""
},
"bss_token": "",
"configuration": "",
"last_updated": "2022-08-22T09:17:16"
},
"enabled": false,
"error": "",
"event_stats": {
"power_off_forceful_attempts": 0,
"power_off_graceful_attempts": 0,
"power_on_attempts": 0
},
"id": "x3000c0s13b0n0",
"last_action": {
"action": "newly_discovered",
"last_updated": "2022-08-22T09:17:16"
},
"session": "",
"staged_state": {
"last_updated": "2022-08-22T09:17:16"
},
"status": {
"phase": "",
"status": "stable",
"status_override": ""
}
}
]
Get details for a BOS session using the component xname
.
(ncn-mw#
):
cray bos v2 components describe <XNAME> --format json
Example output:
{
"actual_state": {
"boot_artifacts": {
"initrd": "",
"kernel": "",
"kernel_parameters": ""
},
"bss_token": "",
"last_updated": "2022-08-22T09:17:16"
},
"desired_state": {
"boot_artifacts": {
"initrd": "",
"kernel": "",
"kernel_parameters": ""
},
"bss_token": "",
"configuration": "",
"last_updated": "2022-08-22T09:17:16"
},
"enabled": false,
"error": "",
"event_stats": {
"power_off_forceful_attempts": 0,
"power_off_graceful_attempts": 0,
"power_on_attempts": 0
},
"id": "x3000c0s13b0n0",
"last_action": {
"action": "newly_discovered",
"last_updated": "2022-08-22T09:17:16"
},
"session": "",
"staged_state": {
"last_updated": "2022-08-22T09:17:16"
},
"status": {
"phase": "",
"status": "stable",
"status_override": ""
}
}
Update a BOS component using xname
. While most fields can be updated manually, users should restrict themselves to updating the desired_state
and enabled
. Altering other fields such as status
or last_action
may result in unintended behavior.
See the enabled
section for cautions about updating a component's enabled
state.
(ncn-mw#
):
cray bos v2 components update <XNAME> --enabled False --format json
Example output:
{
"actual_state": {
"boot_artifacts": {
"initrd": "",
"kernel": "",
"kernel_parameters": ""
},
"bss_token": "",
"last_updated": "2022-08-22T09:17:16"
},
"desired_state": {
"boot_artifacts": {
"initrd": "",
"kernel": "",
"kernel_parameters": ""
},
"bss_token": "",
"configuration": "",
"last_updated": "2022-08-22T09:17:16"
},
"enabled": true,
"error": "",
"event_stats": {
"power_off_forceful_attempts": 0,
"power_off_graceful_attempts": 0,
"power_on_attempts": 0
},
"id": "x3000c0s13b0n0",
"last_action": {
"action": "newly_discovered",
"last_updated": "2022-08-22T09:17:16"
},
"session": "",
"staged_state": {
"last_updated": "2022-08-22T09:17:16"
},
"status": {
"phase": "",
"status": "stable",
"status_override": ""
}
}