Hello @Vipul Om
Thanks for the detailed write-up. You've given us exactly what we need to diagnose this. Good news: the root cause is clear, and there's a structured recovery path that preserves your data.
Root Cause: Two Related Issues
The Serial Console output tells the whole story:
1. `Kernel panic - not syncing: Attempted to kill init!` This is the primary cause of the outage. The init process (systemd, PID 1) died during boot, which is why SSH, HTTP, and Run Command all fail: the OS never fully starts. This is commonly triggered by a missing or corrupted initramfs, a failed kernel/package update, or corrupted core system libraries.
2. WALinuxAgent `[Errno 2] No such file or directory: 'python3'`. This is a symptom, not a separate problem. A missing python3 binary points to broader system file corruption, consistent with a bad package update or an incomplete OS upgrade that left critical binaries in a broken state.
Since Redeploy + Reapply didn't help (expected — those don't fix OS-level corruption), the right path is an offline disk repair.
Recovery Plan — Start Here (Fastest Path)
Step 1: Try ALAR Automated Repair First
Microsoft's Azure Linux Auto Repair (ALAR) tool can regenerate a missing initramfs and the GRUB configuration unattended, with no manual disk swap required. It runs through the `az vm repair` Azure CLI extension; the exact invocation is covered in the article linked below.
The repair run creates a temporary repair VM, fixes the initramfs/GRUB config on a copy of your OS disk, swaps the repaired disk back, and cleans up the repair VM automatically. Your data is not touched.
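As a hedged sketch of what such an ALAR run can look like with the `az vm repair` extension: the resource names and credentials are placeholders, and the `run` helper with its `DRY_RUN` switch is a hypothetical safety wrapper added here, not part of the Azure CLI. By default it only prints the commands so you can review them first.

```shell
#!/usr/bin/env bash
# Sketch only. Placeholders: RG, VM, REPAIR_USER, REPAIR_PASS.
RG="${RG:-<YourResourceGroup>}"
VM="${VM:-<YourVMName>}"
REPAIR_USER="${REPAIR_USER:-repairadm}"
REPAIR_PASS="${REPAIR_PASS:-<YourStrongPassword>}"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# The vm-repair extension provides the `az vm repair` command group.
run az extension add --name vm-repair --upgrade

# 1. Create a repair VM with a copy of the broken OS disk attached.
run az vm repair create --verbose -g "$RG" -n "$VM" \
  --repair-username "$REPAIR_USER" --repair-password "$REPAIR_PASS"

# 2. Run the ALAR "initrd" action against the disk copy on the repair VM.
run az vm repair run --verbose -g "$RG" -n "$VM" \
  --run-id linux-alar2 --parameters initrd --run-on-repair

# 3. Swap the repaired disk back onto the original VM.
run az vm repair restore --verbose -g "$RG" -n "$VM"
```

Set `DRY_RUN=0` once the printed commands look right for your environment.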
Recover Azure Linux VM from kernel-related boot issues
https://learn.microsoft.com/en-us/troubleshoot/azure/virtual-machines/linux/kernel-related-boot-issues
Step 2: If Step 1 Doesn't Resolve It — ALAR Kernel Repair
If the issue is specifically a broken kernel (not just a missing initramfs), use ALAR's kernel rollback option, which replaces the broken kernel with the previously installed working version.
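A minimal sketch of the kernel rollback, assuming the same `az vm repair` workflow as any other ALAR run: only the action name passed to `--parameters` changes. The names are placeholders, and `run`/`DRY_RUN` are a hypothetical print-first wrapper, not Azure CLI features.

```shell
#!/usr/bin/env bash
# Sketch only. Placeholders: RG, VM.
RG="${RG:-<YourResourceGroup>}"
VM="${VM:-<YourVMName>}"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = print only, 0 = execute

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Run this between `az vm repair create` and `az vm repair restore`.
# The "kernel" action makes the previously installed kernel the default.
run az vm repair run --verbose -g "$RG" -n "$VM" \
  --run-id linux-alar2 --parameters kernel --run-on-repair
```

This only helps if an older kernel is still installed on the disk; if the update removed it, the manual repair path is the fallback.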
Use Azure Linux Auto Repair (ALAR) to fix a Linux VM
https://learn.microsoft.com/en-us/troubleshoot/azure/virtual-machines/linux/repair-linux-vm-using-alar
Step 3: If ALAR Doesn't Fully Resolve It — Manual Repair VM
If the corruption goes deeper (missing core system libraries, which the python3 error hints at), you'll need to go in manually via a repair VM and chroot:
- Create a repair VM with the OS disk attached:
az vm repair create -g <YourResourceGroup> -n <YourVMName> \
  --repair-username repairadm --repair-password '<YourStrongPassword>'
- SSH into the repair VM, enter the chroot environment on the attached copy of the OS disk, and investigate.
- Reinstall missing packages if identified (inside chroot), regenerate initramfs, and update GRUB:
# Regenerate initramfs (Ubuntu example)
update-initramfs -u -k all
update-grub
# Regenerate initramfs (RHEL/CentOS example)
dracut -f --regenerate-all
# On UEFI (Gen2) VMs grub.cfg usually lives under /boot/efi/EFI/<distro>/ instead
grub2-mkconfig -o /boot/grub2/grub.cfg
- Restore the disk back to the original VM:
az vm repair restore -g <YourResourceGroup> -n <YourVMName>
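The chroot step in the list above could be sketched as follows, run as root on the repair VM. The device name `/dev/sdc1` and the mount point `/rescue` are assumptions (confirm the partition with `lsblk`), and `run`/`DRY_RUN` are a hypothetical print-first wrapper.

```shell
#!/usr/bin/env bash
# Sketch only. DISK_PART and MNT are assumptions; verify with lsblk first.
DISK_PART="${DISK_PART:-/dev/sdc1}"   # root partition of the attached OS disk copy
MNT="${MNT:-/rescue}"
DRY_RUN="${DRY_RUN:-1}"               # hypothetical switch: 1 = print only

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Identify the attached disk and its partitions.
run lsblk

# Mount the broken root filesystem.
# For XFS with a duplicate UUID, add: -o nouuid
run mkdir -p "$MNT"
run mount "$DISK_PART" "$MNT"
# If /boot or /boot/efi are separate partitions, mount them under $MNT too.

# Bind the pseudo-filesystems the chroot needs.
for fs in proc sys dev dev/pts run; do
  run mount --bind "/$fs" "$MNT/$fs"
done

# Enter the chroot; package and initramfs commands now act on the broken OS.
run chroot "$MNT" /bin/bash
```

After leaving the chroot, unmount everything in reverse order before running the restore command. Images that use LVM need the volume group activated and the logical volume mounted instead of a plain partition.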
Repair a Linux VM by using the Azure Virtual Machine repair commands
https://learn.microsoft.com/en-us/troubleshoot/azure/virtual-machines/linux/repair-linux-vm-using-azure-virtual-machine-repair-commands
Troubleshoot a Linux VM by attaching the OS disk to a recovery VM
https://learn.microsoft.com/en-us/troubleshoot/azure/virtual-machines/linux/troubleshoot-recovery-disks-portal-linux
After the VM Boots: Fix WALinuxAgent
Once the OS is back up, fix the python3/WALinuxAgent issue so the agent can start again. The article below covers agent troubleshooting in detail.
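One way to approach this, as a sketch, is to reinstall python3 and the agent package and then restart the agent. The package and service names below are the usual ones for Ubuntu and RHEL-family images and are assumptions for your distribution; `detect_pkg_manager`, `run`, and `DRY_RUN` are hypothetical helpers.

```shell
#!/usr/bin/env bash
# Sketch only. Package/service names are assumptions; check your distro.
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = print only, 0 = execute

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Report which package manager is available on this system.
detect_pkg_manager() {
  if command -v apt-get >/dev/null 2>&1; then echo apt
  elif command -v dnf >/dev/null 2>&1; then echo dnf
  elif command -v yum >/dev/null 2>&1; then echo yum
  else echo unknown
  fi
}

case "$(detect_pkg_manager)" in
  apt)
    run sudo apt-get update
    run sudo apt-get install --reinstall -y python3 walinuxagent
    run sudo systemctl restart walinuxagent
    ;;
  dnf|yum)
    pm="$(detect_pkg_manager)"
    run sudo "$pm" install -y python3 WALinuxAgent
    run sudo systemctl restart waagent
    ;;
  *)
    echo "Unrecognized package manager; reinstall python3 and the agent manually."
    ;;
esac
```

Afterwards, check the agent status in the portal or with `systemctl status` to confirm it reports as ready.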
Troubleshoot the Azure Linux Agent
https://learn.microsoft.com/en-us/troubleshoot/azure/virtual-machines/linux/linux-azure-guest-agent
Last Resort: Restore from Your Snapshot
You mentioned a snapshot was taken before troubleshooting. If all repair attempts fail, you can restore from that snapshot to get the VM back quickly, then investigate the root cause separately on a copy.
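If the snapshot is a managed-disk snapshot of the OS disk, a restore could be sketched like this: create a new disk from the snapshot and swap it in as the OS disk. The snapshot and disk names are placeholders, the snapshot is assumed to be in the same resource group, and `run`/`DRY_RUN` are a hypothetical print-first wrapper.

```shell
#!/usr/bin/env bash
# Sketch only. Placeholders: RG, VM, SNAPSHOT, NEW_DISK.
RG="${RG:-<YourResourceGroup>}"
VM="${VM:-<YourVMName>}"
SNAPSHOT="${SNAPSHOT:-<YourSnapshotName>}"
NEW_DISK="${NEW_DISK:-restored-osdisk}"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = print only, 0 = execute

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# 1. Create a managed disk from the snapshot.
run az disk create -g "$RG" -n "$NEW_DISK" --source "$SNAPSHOT"

# 2. Deallocate the VM before swapping its OS disk.
run az vm deallocate -g "$RG" -n "$VM"

# 3. Swap the OS disk to the restored disk, then start the VM.
run az vm update -g "$RG" -n "$VM" --os-disk "$NEW_DISK"
run az vm start -g "$RG" -n "$VM"
```

The original OS disk is left in the resource group after the swap, so it remains available for root-cause analysis.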
How to restore Azure VM data in Azure portal
https://learn.microsoft.com/en-us/azure/backup/backup-azure-arm-restore-vms
Recommended Order of Actions
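Start with Step 1. It handles the majority of kernel panic scenarios automatically and is the least invasive option. Move on to Step 2, then Step 3, only if the VM still fails to boot, and keep the snapshot restore as the fallback.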
The data on the disk is safe throughout all these steps.
Hope this gets you back online quickly!
Thanks,
Manish.