ESXi
Beacon Probing Deep-Dive
With three or more uplinks in a team, beacon probing can pinpoint the failure of a single uplink. With only two uplinks, it can detect a downstream link failure, but it cannot tell which uplink is good and which is bad.
- ESXi with two NICs
With three or more NICs the mechanism is straightforward, but what happens with only two NICs?
In this situation, ESXi cannot determine whether the problem is caused by the sending NIC or the receiving NIC.
A fallback mechanism is used in this case. Since ESXi knows that one of the NICs is affected but not which one, it simply starts duplicating frames on both NICs.
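The difference between the two cases can be sketched as a small decision function. This is a simplified model, not how ESXi implements beacon probing: the name classify_uplinks, the "receiver:sender" input format, and the assumption that a broken uplink neither sends nor receives beacons are all illustrative.

```shell
# classify_uplinks NICS HEARD  (hypothetical helper, simplified model)
#   NICS:  space-separated uplink names, e.g. "vmnic0 vmnic1 vmnic2"
#   HEARD: space-separated "receiver:sender" pairs, one per beacon that arrived
classify_uplinks() {
    nics=$1
    heard=$2
    total=0
    silent=""
    silent_count=0
    for nic in $nics; do
        total=$((total + 1))
        seen=0
        # An uplink counts as alive if it took part in any successful beacon.
        for pair in $heard; do
            rx=${pair%%:*}
            tx=${pair##*:}
            if [ "$rx" = "$nic" ] || [ "$tx" = "$nic" ]; then
                seen=1
            fi
        done
        if [ "$seen" -eq 0 ]; then
            silent="${silent:+$silent }$nic"
            silent_count=$((silent_count + 1))
        fi
    done
    if [ "$silent_count" -eq 0 ]; then
        echo "ok"
    elif [ "$silent_count" -eq "$total" ]; then
        # Nobody heard anybody: the bad uplink cannot be singled out.
        echo "ambiguous: $silent"
    else
        echo "failed: $silent"
    fi
}

# Three uplinks, vmnic2 is cut off: the other two still hear each other.
classify_uplinks "vmnic0 vmnic1 vmnic2" "vmnic0:vmnic1 vmnic1:vmnic0"
# Two uplinks, one is cut off: no beacon arrives anywhere.
classify_uplinks "vmnic0 vmnic1" ""
```

With three uplinks the healthy pair keeps exchanging beacons, so the silent one stands out. With two uplinks a single failure silences both, which is the case where the frames are duplicated on both NICs.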
iSCSI
esxcli iscsi adapter discovery sendtarget list
Grub boot ESXi6.5u2
- grub.cfg add ESXi6
set root=(hd0,gpt1)
search --set=root --file /efi/VMware/safebt64.efi
chainloader /efi/VMware/safebt64.efi
Build Customizer ISO
https://www.v-front.de/p/esxi-customizer-ps.html#download
ESXi-Customizer-PS-v2.6.0.ps1 -v65 -pkgDir c:\net-r816X -izip update-from-esxi6.5-6.5_update02.zip
Show Memory
smbiosDump | grep -A 12 -B 1 'Location: "DIMM' | egrep 'Location:|Bank:|Part Number:|Size:|Speed:|--'
ESXi 5.0
show arp
esxcli network ip neighbor list
show FRU
Get the source from sourceforge & unpack it
# curl "http://liquidtelecom.dl.sourceforge.net/project/ipmitool/ipmitool/1.8.15/ipmitool-1.8.15.tar.bz2" > ipmi.tar.gz
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 729k 100 729k 0 0 260k 0 0:00:02 0:00:02 --:--:-- 260k
# bzip2 -dc ipmi.tar.gz | tar -xf -
compile
# cd ipmitool-1.8.15/
# ./configure CFLAGS=-m32 LDFLAGS=-static
checking build system type... x86_64-unknown-linux-gnu
checking host system type... x86_64-unknown-linux-gnu
. . .
# make
Scp the file and run tool from ESX host
# scp src/ipmitool root@192.168.39.12:~
RUN IT
esxcfg-scsidevs -l | egrep -i 'display name|vendor|size' | head -20
/vmfs/volumes/view-2/ipmitool fru | head
References
To power on a virtual machine from the command line:
List the inventory ID of the virtual machine with the command:
vim-cmd vmsvc/getallvms |grep <vm name>
Note: The first column of the output shows the vmid.
Check the power state of the virtual machine with the command:
vim-cmd vmsvc/power.getstate <vmid>
Power-on the virtual machine with the command:
vim-cmd vmsvc/power.on <vmid>
To check the power state of every registered virtual machine:
for vmid in `vim-cmd vmsvc/getallvms | tail -n+2 | awk '{print $1}' | grep -o '[0-9]*'`; do vim-cmd vmsvc/power.getstate $vmid; done
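The loop above depends on pulling the numeric vmid out of the getallvms listing. A minimal sketch of that extraction step on its own, assuming the listing has a header row and that annotations may spill onto extra lines; the helper name extract_vmids and the sample listing are hypothetical.

```shell
# extract_vmids: read `vim-cmd vmsvc/getallvms` output on stdin and
# print one vmid per line (hypothetical helper).
extract_vmids() {
    # Skip the header row and keep only lines whose first field is all digits,
    # which drops continuation lines of multi-line annotations.
    awk 'NR > 1 && $1 ~ /^[0-9]+$/ { print $1 }'
}

# Hypothetical sample listing; the second VM has a two-line annotation.
extract_vmids <<'EOF'
Vmid   Name    File                           Guest OS             Version   Annotation
1      web01   [datastore1] web01/web01.vmx   centos64Guest        vmx-11
12     db01    [datastore1] db01/db01.vmx     oracleLinux64Guest   vmx-11    Database server
owner: dba team
EOF
```

On a host this would be fed from the real command, for example `vim-cmd vmsvc/getallvms | extract_vmids`. An annotation line that happens to start with a number would still be misread as a vmid.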
Troubleshooting
ESXi network card driver
esxcli network nic get -n vmnic0
scratch warning
- Creating a persistent scratch location for ESXi 4.x/5.x/6.x
# mkdir /vmfs/volumes/VNX5700_NLSAS_EsxiLog_LUN8/192.168.201.214/scratch
[root@esxi214:~] vim-cmd hostsvc/advopt/view ScratchConfig.ConfiguredScratchLocation
[root@esxi214:~] /bin/vim-cmd hostsvc/advopt/update ScratchConfig.ConfiguredScratchLocation string /vmfs/volumes/VNX5700_NLSAS_EsxiLog_LUN8/192.168.201.214/scratch
[root@esxi214:~] vim-cmd hostsvc/advopt/view ScratchConfig.ConfiguredScratchLocation
(vim.option.OptionValue) [
(vim.option.OptionValue) {
key = "ScratchConfig.ConfiguredScratchLocation",
value = "/vmfs/volumes/5e63c57f-a1fbc30f-6ed8-1866daf9bcd6/192.168.201.214/scratch"
}
]
force umount device
[root@esxi211:~] df -h | grep VNX5700_NLSAS_APPs_LUN8
VMFS-6 0.0B 0.0B 0.0B 0% /vmfs/volumes/VNX5700_NLSAS_APPs_LUN8
[root@esxi211:~] esxcli storage vmfs extent list | grep VNX5700_NLSAS_APPs_LUN8
VNX5700_NLSAS_APPs_LUN8 5e4fb6bd-d067ca78-7d9c-e04f43074bfc 0 naa.6006016011103300fca6fbf39554ea11 1
[root@esxi211:~] esxcli storage core device set --state=off -d naa.6006016011103300fca6fbf39554ea11
Unable to set device's status. Error was: Vmfs volume mounted on device with name naa.6006016011103300fca6fbf39554ea11
[root@esxi211:~] esxcli storage core device set --state=off -d naa.6006016011103300fca6fbf39554ea11 --force
Unable to set device's status. Error was: Unable to change device state, the device is marked as 'busy' by the VMkernel.: Sysinfo error: Busy
See VMkernel log for details.
Monitor a Migration
# vim-cmd vmsvc/getallvms | grep -i vmname
# vim-cmd hbrsvc/vmreplica.getState vmid
# vim-cmd hbrsvc/vmreplica.queryReplicationState vmid
firmware
Connect over SSH (for example with PuTTY) to any ESXi host in the vSAN cluster and run:
esxcli vsan debug controller list
Match Linux SCSI Devices (sdX) to Virtual Disks in VMware
$vm="vcsa6.virten.lab"
$vmview = Get-View -ViewType VirtualMachine -Filter @{"Name" = $vm}
foreach ($VirtualSCSIController in ($vmview.Config.Hardware.Device | where {$_.DeviceInfo.Label -match "SCSI Controller"})) {
    # List every device attached to this controller as "SCSI (bus:unit) label".
    foreach ($VirtualDiskDevice in ($vmview.Config.Hardware.Device | where {$_.ControllerKey -eq $VirtualSCSIController.Key})) {
        Write-Host "SCSI ($($VirtualSCSIController.BusNumber):$($VirtualDiskDevice.UnitNumber)) $($VirtualDiskDevice.DeviceInfo.Label)"
    }
}
First
root># blkid | awk '{ print $1","$2","$3 }' | sed 's#:##' | tee /tmp/disk_info_1.csv
/dev/sda1,UUID="3a2d9a4b-ac76-44e8-824c-1331fa186bf4",TYPE="xfs"
/dev/sda2,UUID="RB4aJL-Db3o-2MJk-1QCV-x5mQ-VZq8-BAQprW",TYPE="LVM2_member"
/dev/sde1,LABEL="NCDBDATA21",TYPE="oracleasm"
/dev/sdf1,LABEL="NCDBDATA22",TYPE="oracleasm"
/dev/sdg1,LABEL="NCDBDATA23",TYPE="oracleasm"
/dev/sdh1,LABEL="NCDBDATA24",TYPE="oracleasm"
/dev/sdj1,LABEL="NCDBDATA26",TYPE="oracleasm"
/dev/sdk1,LABEL="NCDBDATA27",TYPE="oracleasm"
/dev/sdn1,LABEL="NCDBDATA30",TYPE="oracleasm"
/dev/sdo1,LABEL="NCDBDATA31",TYPE="oracleasm"
/dev/sdq1,LABEL="NCDBCRS1",TYPE="oracleasm"
/dev/sdm1,LABEL="NCDBDATA29",TYPE="oracleasm"
/dev/sdl1,LABEL="NCDBDATA28",TYPE="oracleasm"
/dev/sdr1,LABEL="NCDBCRS2",TYPE="oracleasm"
/dev/sdp1,LABEL="NCDBDATA32",TYPE="oracleasm"
/dev/sds1,LABEL="NCDBCRS3",TYPE="oracleasm"
/dev/sdi1,LABEL="NCDBDATA25",TYPE="oracleasm"
/dev/mapper/ol-root,UUID="04e67898-9f0b-4dc4-9801-9b2538db094d",TYPE="xfs"
/dev/mapper/ol-swap,UUID="78826dc4-ea7a-4ba9-ade4-1183521a3403",TYPE="swap"
/dev/mapper/ol-oracle,UUID="71906eb7-6c37-4378-9722-dab9aa5fc82c",TYPE="xfs"
Second
root># ls -d /sys/block/sd*/device/scsi_device/* |awk -F '[/]' '{print $4,"- SCSI",$7}' | awk '{ print "/dev/"$1"1,"$3"_"$4 }' | tee /tmp/disk_info_2.csv
/dev/sda1,SCSI_1:0:0:0
/dev/sdb1,SCSI_4:0:0:0
/dev/sdc1,SCSI_4:0:1:0
/dev/sdd1,SCSI_4:0:2:0
/dev/sde1,SCSI_4:0:3:0
/dev/sdf1,SCSI_4:0:4:0
/dev/sdg1,SCSI_4:0:5:0
/dev/sdh1,SCSI_4:0:6:0
/dev/sdi1,SCSI_4:0:8:0
/dev/sdj1,SCSI_4:0:9:0
/dev/sdk1,SCSI_4:0:10:0
/dev/sdl1,SCSI_4:0:11:0
/dev/sdm1,SCSI_4:0:12:0
/dev/sdn1,SCSI_4:0:13:0
/dev/sdo1,SCSI_4:0:14:0
/dev/sdp1,SCSI_4:0:15:0
/dev/sdq1,SCSI_5:0:0:0
/dev/sdr1,SCSI_5:0:1:0
/dev/sds1,SCSI_5:0:2:0
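The two CSV files written above share the device name as their first column, so they can be merged into one line per device. A minimal sketch, assuming the column layout shown in the listings; the helper name join_disk_info is hypothetical.

```shell
# join_disk_info BLKID_CSV SCSI_CSV  (hypothetical helper)
#   BLKID_CSV: device,attribute,attribute   (as written to disk_info_1.csv)
#   SCSI_CSV:  device,SCSI_h:c:t:l          (as written to disk_info_2.csv)
# Prints device,SCSI address,first blkid attribute for every device present in both.
join_disk_info() {
    awk -F, '
        NR == FNR    { scsi[$1] = $2; next }          # first file: remember SCSI address
        ($1 in scsi) { print $1 "," scsi[$1] "," $2 } # second file: emit matches only
    ' "$2" "$1"
}

# Usage with the files created above:
# join_disk_info /tmp/disk_info_1.csv /tmp/disk_info_2.csv
```

For the listings above this would give lines such as /dev/sde1,SCSI_4:0:3:0,LABEL="NCDBDATA21". Because the second command appends "1" to every disk name, only first partitions are matched; entries like /dev/sda2 or the /dev/mapper volumes are skipped.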
CVE detail
find lock file/VMDK
[root@localhost:~] lsof | grep Tenable-Core-Nessus-20190624-flat
5067761 vmx FILE 79 /vmfs/volumes/578f35f1-ba1c2571-27b3-a44c11de9a2c/陈美林-Tenable-Core-Nessus-20190624/陈美林-Tenable-Core-Nessus-20190624-flat.vmdk
[root@localhost:~] esxcli vm process list | grep -A 4 -B 3 5067761
陈美林_20191009-Tenable-Core-Nessus
   World ID: 5067762
   Process ID: 0
   VMX Cartel ID: 5067761
   UUID: 42 38 6c 18 18 b4 6d e0-fd 2f f2 b5 80 63 41 26
   Display Name: 陈美林_20191009-Tenable-Core-Nessus
   Config File: /vmfs/volumes/578f35f1-ba1c2571-27b3-a44c11de9a2c/陈美林-Tenable-Core-Nessus-20190624/陈美林-Tenable-Core-Nessus-20190624.vmx
# esxcli vm process kill -t force -w 5067762
Recover password
mount /dev/mmcblk0p5 /mnt/zip
cd /mnt/zip
cp state.tgz /tmp; cd /tmp
tar xvf state.tgz
tar xvf local.tgz
cd /tmp/etc
delete old password
- vi shadow
root:$6$haVeTYl9$GonmIzEA3w1ke0uTo1DZn01Nvb5qoqJ/fREzuXDNJw2yAyJFCEu./f/elmIkQ9XxBnCGUbFJ7f1iHkRPp/Qc90:13358:0:99999:7:::
nobody:*:13358:0:99999:7:::
nfsnobody:!!:13358:0:99999:7:::
dcui:*:13358:0:99999:7:::
daemon:*:13358:0:99999:7:::
vpxuser:$6$a2mQch.xhPePCI/s$hLRwlsDWR5j/QfQGPu9ev8HgKwFyqpIHeHtPFH7j0s7Uzt4DH6XE2U8g.QcJ8EN9OapuiWRd7fyH0lx5ginmA0:18124:0:99999:7:::
cat shadow
root::13358:0:99999:7:::
nobody:*:13358:0:99999:7:::
nfsnobody:!!:13358:0:99999:7:::
dcui:*:13358:0:99999:7:::
daemon:*:13358:0:99999:7:::
vpxuser:$6$a2mQch.xhPePCI/s$hLRwlsDWR5j/QfQGPu9ev8HgKwFyqpIHeHtPFH7j0s7Uzt4DH6XE2U8g.QcJ8EN9OapuiWRd7fyH0lx5ginmA0:18124:0:99999:7:::
cd ..
tar czf local.tgz etc
tar czf state.tgz local.tgz
cp state.tgz /mnt/zip/
cp: overwrite '/mnt/zip/state.tgz'? y
https://www.virten.net/2017/01/vsphere-6-5-component-password-recovery-vcenter-sso-and-esxi/
Capture VM network
[root@localhost] net-stats -l | grep -i oel
50332673 5 9 DvsPortset-0 00:50:56:b0:de:9e 陈美林-OEL6.9_with_OracleDB-11.2.0.4-TEMPLATE.eth0
pktcap-uw --switchport 50332673 --dir 0 --outfile /tmp/50332671_OEL_25.222-in.pcap
pktcap-uw --switchport 50332673 --dir 1 --outfile /tmp/50332671_OEL_25.222-out.pcap
# For 6.7 or above
pktcap-uw --switchport 50332673 --dir 2 --outfile /tmp/50332671_OEL_25.222-INandOUT.pcap
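The port number passed to --switchport is the first column of the matching net-stats -l line. A minimal sketch of picking it out by client name, assuming the column order shown in the listing above; the helper name port_id_for and the sample line are hypothetical.

```shell
# port_id_for PATTERN  (hypothetical helper)
# Read `net-stats -l` output on stdin and print the port number (first column)
# of every line that contains PATTERN, ignoring case.
port_id_for() {
    awk -v pat="$1" 'index(tolower($0), tolower(pat)) { print $1 }'
}

# Hypothetical sample line in the same layout as the listing above.
echo '50332673 5 9 DvsPortset-0 00:50:56:b0:de:9e OEL6.9-TEMPLATE.eth0' | port_id_for oel

# On a host the result could feed the capture directly, for example:
# pktcap-uw --switchport "$(net-stats -l | port_id_for oel)" --dir 0 --outfile /tmp/in.pcap
```

If the pattern matches more than one client, several port numbers are printed and the capture command needs one of them chosen by hand.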
Permanent Device Loss (PDL) and All-Paths-Down (APD) in vSphere 5.x and 6.x (2004684)
Permanent Device Loss (PDL):
A datastore is shown as unavailable in the Storage view
A storage adapter indicates the Operational State of the device as Lost Communication
All-Paths-Down (APD):
A datastore is shown as unavailable in the Storage view.
A storage adapter indicates the Operational State of the device as Dead or Error.
To clean up an unplanned PDL:
All running virtual machines from the datastore must be powered off and unregistered from the vCenter Server.
From the vSphere Client, go to the Configuration tab of the ESXi host, and click Storage.
Right-click the datastore being removed, and click Unmount.
The Confirm Datastore Unmount window displays. When the prerequisite criteria have been passed, the OK button appears.
If you see this error when unmounting the LUN:
Call datastore refresh for object <name_of_LUN> on vCenter server <name_of_vCenter> failed
You may have a snapshot LUN presented. To resolve this issue, remove that snapshot LUN on the array side.
Perform a rescan on all of the ESXi hosts that had visibility to the LUN.
Note: If there are active references to the device or pending I/O, the ESXi host still lists the device after the rescan. Check for virtual machines, templates, ISO images, floppy images, and raw device mappings which may still have an active reference to the device or datastore.
If the LUN is still being used and available again, go to each host, right-click the LUN, and click Mount.
Note: One possible cause of an unplanned PDL is that the LUN ran out of space, causing it to become inaccessible.
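The PDL and APD symptom lists above differ only in the Operational State reported by the storage adapter, which can be captured as a quick lookup. A minimal sketch; the helper name storage_condition is hypothetical, and the state strings are taken from the symptom lists as written.

```shell
# storage_condition STATE  (hypothetical helper)
# Map the Operational State shown for a device to the condition it suggests.
storage_condition() {
    case "$1" in
        "Lost Communication")           echo "PDL" ;;
        "Dead"|"Error"|"Dead or Error") echo "APD" ;;
        *)                              echo "unknown" ;;
    esac
}

storage_condition "Lost Communication"
storage_condition "Dead or Error"
```

In both cases the datastore itself shows as unavailable in the Storage view, so the adapter state is the distinguishing symptom.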
“Unable to open MKS: Internal Error” when opening virtual machine console
https://kb.vmware.com/s/article/2116542
http://noor2122.blogspot.com/2015/12/open-vm-console-unable-to-connect-to.html
Failed to log into NFC server
While looking for causes I found these possibilities:
- Not enough space in the datastore
- Port 901/902 not open between the VI Client and the source of the file (and/or the VC Server; I could not find out whether that is also a factor)
- DNS configuration for the host servers
As a possible solution I also found that restarting the VC Server service could help.
Deprecated VMFS volume(s) found on the host. Please consider ....
services.sh restart
how to upgrade
ESXi5.5 to ESXi6.0 issue
- Regarding the VIB VMware_locker_tools-light issue
~ # time esxcli software profile update -d /vmfs/volumes/585e0bc8-cc50bcc0-99dd-1866daecba16/VMware-VMvisor-Installer-6.0.0-2809209.x86_64-Dell_Customized-A02.zip -p Dell-ESXi-6.0.0-2809209-A02
[DependencyError]
VIB VMware_locker_tools-light_10.3.10.12406962-14320388 requires esx-version >= 6.6.0, but the requirement cannot be satisfied within the ImageProfile.
Please refer to the log file for more details.
Command exited with non-zero status 1
mv /store /store.tmp
vim-cmd hostsvc/maintenance_mode_enter
esxcli --server=server_name software vib update --depot=/path_to_vib_ZIP/ZIP_file_name.zip
vim-cmd hostsvc/maintenance_mode_exit
How to upgrade ESXi 6.0 to ESXi 6.5 via Offline Bundle – The Steps:
# esxcli software sources profile list -d /vmfs/volumes/NEW-HX-TEST01/HX-Vmware-ESXi-6.5U1-5969303-Cisco-Custom-6.5.1.1-Bundle.zip
Name                                               Vendor  Acceptance Level
-------------------------------------------------  ------  ----------------
Vmware-ESXi-6.5.0-HX-5969303-Custom-Cisco-6.5.1.1  CISCO   PartnerSupported
[root@BGY-HX-13:/vmfs/volumes/b897e8ce-f4ee56e5] esxcli software profile update -d /vmfs/volumes/NEW-HX-TEST01/HX-Vmware-ESXi-6.5U1-5969303-Cisco-Custom-6.5.1.1-Bundle.zip -p Vmware-ESXi-6.5.0-HX-5969303-Custom-Cisco-6.5.1.1
Update Result
   Message: The update completed successfully, but the system needs to be rebooted for the changes to be effective.
   Reboot Required: true
esxcli software profile update -p ESXi-6.5.0-4564106-standard -d /vmfs/volumes/your_datastore/VMware-ESXi-6.5.0-4564106-depot.zip
Interpreting SCSI sense codes in VMware ESXi and ESX
- SCSI events that can trigger ESX server to fail a LUN over to another path
vim-cmd how to
For HBA information only:
vmware-vim-cmd hostsvc/summary/hba
For datastore information only:
vmware-vim-cmd hostsvc/summary/fsvolume
esxcfg-scsidevs -l
show hba controller
- Dell PERC
/opt/lsi/perccli/perccli /c0 show
Find vmx by displayName
find /vmfs/volumes/BGYBH-HX-PRD/ -iname "*.vmx" -type f -print0 | xargs -0 egrep -il "displayName.*irmq02" 2>/dev/null
NTP
ntpq -p localhost
ntpq -pn
tcpdump-uw -v -c 5 -n -i vmk0 host ntp_server_ip_address and port 123
ProFTP
https://www.testipenkki.com/?p=96
https://vibsdepot.v-front.de/wiki/index.php/ProFTPD
[root@localhost:~] esxcli software vib install --no-sig-check -d /tmp/ProFTPD-1.3.3-8-offline_bundle.zip
Installation Result
   Message: Operation finished successfully.
   Reboot Required: false
   VIBs Installed: VFrontDe_bootbank_ProFTPD_1.3.3-8
   VIBs Removed:
   VIBs Skipped:
[root@localhost:~] vmware -l
VMware ESXi 6.5.0 Update 2
