Cause of the Error

The Proxmox VM migration failed with the error “Permission denied (publickey,password)” because of a change I had made to my cluster. I previously had four nodes in my Proxmox cluster. For some testing I removed three of the physical nodes, and once the testing was complete I needed to join them back to the cluster.

Before being added back, the nodes were rebuilt from scratch. Adding them to the cluster worked without any issue. However, the new SSH authorized keys were appended below the existing keys, so the stale entries conflicted with the new ones. Removing the offending key and then adding the new key with the node's alias name resolves the issue.
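As a rough way to spot such stale entries, the sketch below lists key comments that occur more than once in an authorized_keys file. It assumes each entry ends with the default ssh-keygen comment (for example root@pve4); the function name and the example path are illustrative and not part of the original procedure.

```shell
# List key comments (the last field, e.g. root@pve4) that appear more than
# once in an authorized_keys-style file. Blank lines and "#" lines are skipped.
find_duplicate_key_comments() {
    awk 'NF && $1 !~ /^#/ { print $NF }' "$1" | sort | uniq -d
}

# Example, run on the destination node (path is an assumption):
# find_duplicate_key_comments /root/.ssh/authorized_keys
```

Any comment printed here belongs to at least two entries, and the older one is a candidate for removal.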

[Screenshot: Proxmox VM migration failed with publickey,password]

The Actual Error

When migrating a VM from the graphical console, the task fails with the error below.

root@192.168.0.14: Permission denied (publickey,password).
TASK ERROR: command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve4' root@192.168.0.14 pvecm mtunnel -migration_network 192.168.0.11/24 -get_migration_ip' failed: exit code 255
[Screenshots: Proxmox VM migration Permission denied, Output tab and Status tab]

First, verify that DNS is configured on the first node in your Proxmox cluster.

# cat /etc/resolv.conf

Or check in the GUI under Datacenter -> pve1 -> DNS. The DNS entries should appear in the right-hand pane.

[Screenshot: Proxmox DNS entry]

In my case, the DNS entry was missing. I added it and restarted the networking service.
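The DNS check can also be scripted. The sketch below prints the nameserver entries of a resolv.conf-style file and returns a non-zero status when none are configured. The function name is hypothetical, and the commented commands at the end are assumptions about a typical Debian-based Proxmox node rather than steps shown in the original.

```shell
# Print the nameserver addresses from a resolv.conf-style file.
# Returns non-zero if the file contains no nameserver line.
list_nameservers() {
    awk '$1 == "nameserver" { print $2; found = 1 } END { exit !found }' \
        "${1:-/etc/resolv.conf}"
}

# Example on the first node:
# list_nameservers || echo "no DNS server configured"
# getent hosts pve4              # assumption: check that the node name resolves
# systemctl restart networking   # assumption: restart networking after adding the entry
```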

Remove the offending key

SSH to the destination server to remove the duplicate offending key.

# ssh root@192.168.0.14

Remove the old authorized keys from .ssh/authorized_keys

root@pve4:~# vim .ssh/authorized_keys

Save and exit with :wq!.
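If you prefer not to edit the file by hand, the following is a minimal sketch of the same cleanup done non-interactively. It assumes the stale entries can be identified by their trailing comment. The function name and the comment value in the example are placeholders, so check which entries are actually stale on your node before running it.

```shell
# Remove every entry whose trailing comment equals $2 from the key file $1.
# A backup copy is kept next to the file as <file>.bak.
remove_keys_by_comment() {
    file=$1
    comment=$2
    cp -p "$file" "$file.bak" || return 1
    # Keep only the lines whose last field differs from the given comment.
    awk -v c="$comment" '$NF != c' "$file.bak" > "$file"
}

# Example on the destination node (placeholder comment value):
# remove_keys_by_comment /root/.ssh/authorized_keys root@pve1
```

This drops all entries carrying that comment, so it fits the order used here: clean up first, then copy the new key over.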

Copy the new Key

From the first PVE node, copy the key using the -o option to provide an alias name for the specific IP.

root@pve1:~# ssh-copy-id -o 'HostKeyAlias=pve4' root@192.168.0.14
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@pve4's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh -o 'HostKeyAlias=pve4' 'root@192.168.0.14'"
and check to make sure that only the key(s) you wanted were added.
root@pve1:~#

Verify SSH Connection

Now verify the SSH connection once again.

root@pve1:~# ssh root@pve4
Linux pve4 5.4.34-1-pve #1 SMP PVE 5.4.34-2 (Thu, 07 May 2020 10:02:02 +0200) x86_64
The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
Last login: Sat Aug 1 12:47:26 2020 from 192.168.0.11
root@pve4:~#
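The interactive login above can still succeed by falling back to a password prompt. To test the exact path the migration task uses, the sketch below reuses the options from the failing command (BatchMode=yes and HostKeyAlias) and runs a harmless remote command. The function name is hypothetical.

```shell
# Try a key-only login with the same ssh options the migration task uses.
# BatchMode=yes disables password prompts, so success means key auth works.
check_migration_ssh() {
    alias_name=$1   # node name used as HostKeyAlias, e.g. pve4
    address=$2      # node IP address, e.g. 192.168.0.14
    ssh -e none -o BatchMode=yes -o "HostKeyAlias=$alias_name" "root@$address" true
}

# Example, run on the source node:
# check_migration_ssh pve4 192.168.0.14 && echo "key login OK"
```

If this still prints "Permission denied (publickey,password)", the key on the destination node is not yet the right one.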

Migrate and Verify

Now try migrating a VM from pve1 to pve4. I will repeat the same steps for my remaining nodes afterwards.
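The migration was started from the GUI here. As an alternative, a live migration with local disks can be started from the shell of the source node with qm migrate. The wrapper function below is only an illustrative sketch, and the VM ID and target are taken from the log in this post.

```shell
# Start an online (live) migration of a VM, including its local disks.
migrate_vm() {
    vmid=$1     # VM ID, e.g. 109
    target=$2   # target node name, e.g. pve4
    qm migrate "$vmid" "$target" --online --with-local-disks
}

# Example, run on the source node:
# migrate_vm 109 pve4
```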

[Screenshot: Proxmox VM migration success]

The long output below is truncated.

2020-08-01 12:53:11 use dedicated network address for sending migration traffic (192.168.0.14)
2020-08-01 12:53:11 starting migration of VM 109 to node 'pve4' (192.168.0.14)
2020-08-01 12:53:11 found local disk 'local-lvm:vm-109-disk-0' (in current VM config)
2020-08-01 12:53:11 copying local disk images
2020-08-01 12:53:11 starting VM 109 on remote node 'pve4'
2020-08-01 12:53:13 start remote tunnel
2020-08-01 12:53:13 ssh tunnel ver 1
2020-08-01 12:53:13 starting storage migration
2020-08-01 12:53:13 scsi0: start migration to nbd:unix:/run/qemu-server/109_nbd.migrate:exportname=drive-scsi0
drive mirror is starting for drive-scsi0 
all mirroring jobs are ready 
2020-08-01 12:56:18 volume 'local-lvm:vm-109-disk-0' is 'local-lvm:vm-109-disk-0' on the target
2020-08-01 12:56:18 starting online/live migration on unix:/run/qemu-server/109.migrate
2020-08-01 12:56:18 set migration_caps
2020-08-01 12:56:18 migration speed limit: 8589934592 B/s
2020-08-01 12:56:18 migration downtime limit: 100 ms
2020-08-01 12:56:18 migration cachesize: 536870912 B
2020-08-01 12:56:18 set migration parameters
2020-08-01 12:56:18 start migrate command to unix:/run/qemu-server/109.migrate
2020-08-01 12:56:30 migration speed: 20.79 MB/s - downtime 27 ms
2020-08-01 12:56:30 migration status: completed
drive-scsi0: transferred: 21475295232 bytes remaining: 0 bytes total: 21475295232 bytes progression: 100.00 % busy: 0 ready: 1 
all mirroring jobs are ready 
drive-scsi0: Completing block job...
drive-scsi0: Completed successfully.
drive-scsi0 : finished
2020-08-01 12:56:31 # /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve4' root@192.168.0.14 pvesr set-state 109 \''{}'\'
2020-08-01 12:56:32 stopping NBD storage migration server on target.
  Logical volume "vm-109-disk-0" successfully removed
2020-08-01 12:56:36 migration finished successfully (duration 00:03:26)
TASK OK

That’s it, it works. We have resolved the Proxmox VM migration failure with “Permission denied (publickey,password)”.

Conclusion

If you reinstall any node in a Proxmox cluster, make sure to clear the old authorized keys and use the correct alias for each node. Copying the key from the first node to the other nodes with the alias fixes the Proxmox VM migration failure.
