Custom Script Extensions timeout and I end up having to re-image the VM #26

Open
andwie opened this issue Aug 15, 2017 · 10 comments

@andwie

andwie commented Aug 15, 2017

We keep all our settings in ARM templates, and each time we change a setting (adding an NSG rule, for example) we run New-AzureRmResourceGroupDeployment to create a new deployment and propagate the change to Test, Demo and then Production. We sometimes get CSE (Custom Script Extension) timeouts on our VMs during this process, especially since the release of version 1.9 of the Custom Script Extension for Windows. See the comments here for other folks encountering CSE issues:
https://docs.microsoft.com/en-us/azure/virtual-machines/windows/extensions-customscript

When this happens I have to delete the CSE via the Portal and re-run the deployment. The problem is that you can't delete CSE objects for VM Scale Sets in the Azure Portal. I tried running the PowerShell below:

$rpVMSS = Get-AzureRmVmss -ResourceGroupName "" -VMScaleSetName ""
$rpVMSS = Remove-AzureRmVmssExtension -VirtualMachineScaleSet $rpVMSS -Name DscClientExtension
Update-AzureRmVmss -ResourceGroupName "<resourcegroupname>" -Name "<scalesetname>" -VirtualMachineScaleSet $rpVMSS

But it errors out saying:
Update-AzureRmVmss : Long running operation failed with status 'Failed'.
ErrorCode: VMExtensionProvisioningError
ErrorMessage: Multiple VM extensions failed to be provisioned on the VM. Please see the VM extension instance view for details.
StartTime: 8/15/2017 11:52:17 AM
EndTime: 8/15/2017 11:52:20 AM
OperationID:
Status: Failed
At line:4 char:8

If I'm dealing with a regular VM I can easily delete the extension, but if it is a scale set VM I have to reimage the entire VM.
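
The error message above points at the VM extension instance view. As a minimal sketch of how to read it for one scale set instance with the Azure CLI (assuming the CLI is installed and logged in; the function name and the example arguments are hypothetical and not from the original report):

```shell
# Hypothetical helper: print name, status code and message for every
# extension reported in the instance view of one scale set VM.
show_extension_status() {
  local rg="$1" vmss="$2" instance_id="$3"
  az vmss get-instance-view \
    --resource-group "$rg" \
    --name "$vmss" \
    --instance-id "$instance_id" \
    --query "extensions[].{name:name, code:statuses[0].code, message:statuses[0].message}" \
    --output table
}

# Example call (placeholder names):
# show_extension_status my-resource-group my-scale-set 0
```

The query only looks at the first status entry of each extension, which is usually enough to see which one failed or timed out; drop the `--query` argument to see the full instance view.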

@andwie
Author

andwie commented Aug 15, 2017

One other related issue is that the Remove-AzureRmVmssExtension delete operation never seems to finish. If I try to add the CSE back to the scale set via the Add-AzureRmVmssExtension command I get this error:

Update-AzureRmVmss : Operation 'PUT' is not allowed on VM extension 'DscClientExtension' since it is marked for deletion. You can only retry the Delete operation (or wait for an ongoing one to complete).
ErrorCode: OperationNotAllowed
ErrorMessage: Operation 'PUT' is not allowed on VM extension 'DscClientExtension' since it is marked for deletion. You can only retry the Delete operation (or wait for an ongoing one to complete).
StatusCode: 409
ReasonPhrase: Conflict

@AayushBhatt

Hi @andwie, did you find a solution to this issue?
I am getting the same error while trying to remove the OMS VM extension. Any guidance would be a great help. Thanks.

@gatneil
Contributor

gatneil commented Apr 19, 2018

What is your upgrade policy?

@AayushBhatt

Upgrade policy for SF cluster is set to automatic.
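
For anyone who wants to check the same thing on their own scale set: the upgrade policy decides whether changes to the scale set model, such as removing an extension, reach existing instances automatically or only after an explicit upgrade. A rough sketch with the Azure CLI, assuming it is installed and logged in (the function names are hypothetical):

```shell
# Hypothetical helper: print the upgrade policy mode of a scale set
# (Automatic, Manual or Rolling).
get_upgrade_policy() {
  az vmss show --resource-group "$1" --name "$2" \
    --query "upgradePolicy.mode" --output tsv
}

# Hypothetical helper: show, per instance, whether the latest scale set
# model has been applied to it.
list_model_state() {
  az vmss list-instances --resource-group "$1" --name "$2" \
    --query "[].{instance:instanceId, latestModelApplied:latestModelApplied}" \
    --output table
}

# Example calls (placeholder names):
# get_upgrade_policy my-resource-group my-scale-set
# list_model_state my-resource-group my-scale-set
```

Instances that report `latestModelApplied` as false are still running an older model, so an extension removed from the model may still be present on them.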

@gatneil
Contributor

gatneil commented Apr 26, 2018

This looks like a bug to me. Are you able to submit a support request?

@dirkslab

Hi. Similar issue here. I renamed the IaaSDiagnostics extension in the ARM template. The extension under the old name is now causing CI/CD deployments to fail, because the deployment always times out:
"message": "Provisioning of VM extension 'VMDiagnosticsVmExt_vmNt1Name' has timed out. Extension installation may be taking too long, or extension status could not be obtained."

Trying to change the name back fails with
"message": "Operation 'PUT Extension' is not allowed on VM extension 'VMDiagnosticsVmExt_vmNt1Name' since it is marked for deletion. You can only retry the Delete operation (or wait for an ongoing one to complete)."

Get-AzureRmVmss shows the newly named extension "VMDiagnosticsVmExt_vmNodeType0Name" as the only extension associated with the scale set.
Remove-AzureRmVmssDiagnosticsExtension cannot find the "stuck" extension VMDiagnosticsVmExt_vmNt1Name.

@adeelilyas

Remove the existing extension and add it back with a new name.

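# Capture the settings of the existing extension before removing it.
# Note: index [0] assumes the extension to replace is the first one in the profile.
$vmss = Get-AzureRmVmss -ResourceGroupName $ResourceGroupName -VMScaleSetName $vmssResource.Name
$settings = $vmss.VirtualMachineProfile.ExtensionProfile.Extensions[0].Settings

# Remove the old extension from the scale set model and push the change.
$vmss = Remove-AzureRmVmssExtension -VirtualMachineScaleSet $vmss -Name "ExistingExtName"
Update-AzureRmVmss -ResourceGroupName $ResourceGroupName -Name $vmssResource.Name -VirtualMachineScaleSet $vmss

# Re-read the model, then add the extension back under a new name.
# Publisher/Type below are the Linux Custom Script extension; for Windows,
# adjust Publisher, Type and TypeHandlerVersion (Microsoft.Compute / CustomScriptExtension).
$vmss = Get-AzureRmVmss -ResourceGroupName $ResourceGroupName -VMScaleSetName $vmssResource.Name
$vmss = Add-AzureRmVmssExtension -VirtualMachineScaleSet $vmss `
    -Name "NewExtName" `
    -Publisher "Microsoft.Azure.Extensions" `
    -Type "CustomScript" `
    -TypeHandlerVersion "2.0" `
    -Setting $settings `
    -AutoUpgradeMinorVersion $true

# Push the updated model.
Update-AzureRmVmss -ResourceGroupName $ResourceGroupName -Name $vmssResource.Name -VirtualMachineScaleSet $vmss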

@avibha28

avibha28 commented Nov 4, 2020

VMSS instances do not update after you uninstall the CSE. Run the command below and then add the CSE again:
az vmss update-instances --instance-ids '*' \
  --resource-group $CLUSTER_RESOURCE_GROUP \
  --name $SCALE_SET_NAME
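
Put together, the sequence suggested here (remove the extension, upgrade the instances, add the extension back) might look roughly like the sketch below. It assumes the Azure CLI is installed and logged in; the function name and all argument values are placeholders, and it has not been verified against the stuck "marked for deletion" state described earlier in this thread.

```shell
# Hypothetical helper: remove an extension from a scale set, bring all
# instances up to the latest model, then add the extension back.
reset_vmss_extension() {
  local rg="$1" vmss="$2" ext_name="$3" publisher="$4" version="$5" settings="$6"

  # 1. Remove the extension from the scale set model.
  az vmss extension delete --resource-group "$rg" --vmss-name "$vmss" \
    --name "$ext_name" || return 1

  # 2. Upgrade every instance to the latest model so the removal reaches the VMs.
  az vmss update-instances --instance-ids '*' \
    --resource-group "$rg" --name "$vmss" || return 1

  # 3. Add the extension back to the model.
  #    "$settings" is a JSON string or a path to a JSON file.
  az vmss extension set --resource-group "$rg" --vmss-name "$vmss" \
    --name "$ext_name" --publisher "$publisher" --version "$version" \
    --settings "$settings" || return 1
}

# Example call (placeholder values):
# reset_vmss_extension my-rg my-scale-set CustomScriptExtension Microsoft.Compute 1.9 ./cse-settings.json
```

Each step stops the function on failure so a failed delete is not followed by an attempt to re-add. If the upgrade policy is Manual, the re-added extension would presumably need another `update-instances` run before it reaches existing VMs.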

@ashinzekene

Any update on this? Still having the issue when deploying with ARM templates

@brianacraig

The solution @avibha28 posted worked for me. After I ran that update-instances command I was able to reinstall the extension (via the portal in my case).


8 participants