Add option to check if host-side reset needed #52

Open
KarmaPoliceT2 opened this issue Oct 24, 2024 · 1 comment
Labels: enhancement, internal_report

Comments

@KarmaPoliceT2

Use case: a user is running a VM with a card attached via PCIe passthrough (KVM/QEMU). We would like tt-smi to be able to perform an "alive check" on a card and report its status. This would let users who may not have host-level access know when to alert their administrators that intervention is required to get their card operational again.

Ideally, the intervention does not require rebooting the whole host, as other tenants may be using other cards on that system concurrently in their own VMs.
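
To make the request concrete, here is a minimal sketch of what a non-intrusive "alive check" could look like. It is not tt-smi code, and every name in it (`classify_header`, `alive_check`, the status strings) is hypothetical. It relies on one general PCIe behavior: config-space reads from a device that has stopped responding typically return all ones. Whether that signal is visible from inside a guest depends on how the hypervisor emulates config space, so treat this as an assumption to verify rather than a guarantee.

```python
from pathlib import Path

# Reads from a PCIe device that no longer responds usually come back as all ones.
ALL_ONES = b"\xff\xff\xff\xff"


def classify_header(header: bytes) -> str:
    """Classify the first four config-space bytes (vendor ID + device ID)."""
    if len(header) < 4:
        return "unknown"        # short read: cannot tell either way
    if header[:4] == ALL_ONES:
        return "unresponsive"   # likely needs a reset or admin intervention
    return "responsive"


def alive_check(bdf: str, sysfs_root: str = "/sys/bus/pci/devices") -> str:
    """Read the start of config space for the device at `bdf` (e.g. "0000:01:00.0").

    `sysfs_root` is a parameter only so the function can be exercised
    against a fake directory tree in tests.
    """
    config = Path(sysfs_root) / bdf / "config"
    try:
        with config.open("rb") as f:
            header = f.read(4)
    except FileNotFoundError:
        return "missing"        # device is not enumerated at all
    except OSError:
        return "unknown"        # permission or I/O error
    return classify_header(header)
```

A status of `unresponsive` or `missing` would be the cue for a tenant to contact their administrator; `responsive` only says the device answers config reads, not that it is fully healthy.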

@milank94 milank94 added the enhancement New feature or request label Oct 30, 2024
@TTDRosen
Contributor

Could you be more specific about the workflow you have in mind? In general, the easiest way to know whether the host needs a reboot is to attempt a reset. If the reset fails, the next step would be to request admin intervention. Are you looking for a way to check the status of the cards that doesn't require a reset to be issued?
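
The reset-first workflow described above could be sketched roughly as follows. This is an illustration only: `reset_or_escalate` and its return values are hypothetical, the reset command is passed in by the caller rather than hard-coded, and the sketch assumes the reset tool exits with a non-zero status when the reset fails.

```python
import subprocess
from typing import Sequence


def reset_or_escalate(reset_cmd: Sequence[str], timeout_s: float = 120.0) -> str:
    """Attempt a reset; report whether admin intervention is the next step.

    `reset_cmd` is whatever command performs the reset in your environment.
    Assumption: it returns exit code 0 on success and non-zero on failure.
    """
    try:
        result = subprocess.run(
            list(reset_cmd),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,
        )
    except (subprocess.TimeoutExpired, OSError):
        # The tool hung, is not installed, or could not be executed.
        return "needs_admin"
    return "recovered" if result.returncode == 0 else "needs_admin"
```

The trade-off raised in the question still applies: this approach always issues a reset, so it cannot serve as a passive status check.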

5 participants