http://vsphere-land.com/vsphere-links/vsan-links.html

VSAN how to

http://tsmith.co/2014/vsan-installation/

Step 1: Enable the VSAN service on a VMKernel port.

* Create your virtual switches, and create at least 1 VMKernel port that will be used for VSAN traffic.
* Be sure to keep the switch names the same across hosts!
* Edit the VMKernel port and check the box for Virtual SAN Traffic.
* Save the port settings, repeat for each host.
* Time Saver: Use host profiles!

Step 2: Enable VSAN on the Cluster

Step 3: Add Disks to Disk Groups

Since I chose Manual mode, I will need to add my disks into Disk Groups. A Disk Group is a collection of one SSD and one or more HDDs. You can have multiple disk groups per host if capacity allows.

Alternatively, select each host, and manually create Disk Groups per host.

Step 4: Start building Virtual Machines!

Yes, it is that easy. You now have a datastore called vsanDatastore.

My DL360 G6 server can also access this datastore, since I enabled VSAN on the VMKernel port groups, even though it’s not providing any resources to the VSAN cluster.

Supported network topologies for VSAN stretched cluster

http://cormachogan.com/2015/09/10/supported-network-topologies-for-vsan-stretched-cluster/

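Fault domain construct

You must define at least three fault domains, each of which may contain one or more hosts. Fault domain definitions must reflect the physical hardware constructs that could represent a potential failure domain, such as an individual computing rack.

If possible, use at least four fault domains. With three fault domains, certain evacuation modes are not allowed, and Virtual SAN cannot reprotect data after a failure. In that case you need an additional fault domain with spare capacity for rebuilding, which a three-domain configuration cannot provide.

If fault domains are enabled, Virtual SAN applies the active virtual machine storage policy to the fault domains instead of to the individual hosts.

Calculate the number of fault domains in the cluster based on the "number of failures to tolerate" attribute in the storage policies that you plan to assign to virtual machines.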

number of fault domains = 2 * number of failures to tolerate + 1

If a host is not a member of a fault domain, Virtual SAN treats it as a separate domain.
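As a quick check of the formula above, here is a minimal Python sketch. The function names and the host-to-rack mapping are made up for illustration; the only rules taken from these notes are `2 * failures to tolerate + 1` and that a host outside any fault domain counts as its own domain.

```python
def required_fault_domains(failures_to_tolerate: int) -> int:
    """Minimum fault domains for a given 'number of failures to tolerate'."""
    if failures_to_tolerate < 0:
        raise ValueError("failures_to_tolerate must be non-negative")
    return 2 * failures_to_tolerate + 1


def effective_fault_domains(host_to_domain: dict) -> int:
    """Count fault domains; a host mapped to None counts as its own domain."""
    named = {domain for domain in host_to_domain.values() if domain is not None}
    standalone = sum(1 for domain in host_to_domain.values() if domain is None)
    return len(named) + standalone


if __name__ == "__main__":
    # Hypothetical layout: hosts 3 and 4 share a rack, as in the rack example below.
    hosts = {"host1": "rack1", "host2": "rack2", "host3": "rack3", "host4": "rack3"}
    needed = required_fault_domains(1)
    available = effective_fault_domains(hosts)
    print(f"needed={needed} available={available} ok={available >= needed}")
```

With four fault domains instead of three, the same check would leave one spare domain for rebuilding after a failure.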

http://cormachogan.com/2015/04/20/vsan-6-0-part-8-fault-domains/

If rack 1 fails (containing host 1), do I still have a full copy of the data? The answer is Yes.

If rack 2 fails (containing host 2), do I still have a full copy of the data? The answer is Yes.

If rack 3 fails (containing hosts 3 & 4), do I still have a full copy of the data? The answer is still Yes.

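Delete existing partitions from a disk

Use the partedUtil tool. First, confirm which disks have been detected.

esxcli storage core device list
# Get the device ID from the output above, then list its partitions
partedUtil get /vmfs/xxx/xxx
1 2048 3xxx 0 0

# Delete the partition (the last argument is the partition number)
partedUtil delete /vmfs/xxx/xxx 1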
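When a disk carries several partitions, the delete commands can be generated from the `partedUtil get` output. The sketch below is Python and only builds command strings; it does not run anything. It assumes the first output line is the disk geometry and that each following line is `number start end type attribute`. The sample numbers are invented, and the device path is the same placeholder used above.

```python
def parse_parted_get(output: str) -> list:
    """Parse 'partedUtil get' output into a list of partition dicts.

    Assumption: line 1 is disk geometry; later lines are
    'number start_sector end_sector type attribute'.
    """
    rows = [line.split() for line in output.strip().splitlines() if line.strip()]
    partitions = []
    for fields in rows[1:]:  # skip the geometry line
        if len(fields) < 5:
            continue  # ignore lines that do not look like a partition entry
        number, start, end, ptype, attr = (int(value) for value in fields[:5])
        partitions.append(
            {"number": number, "start": start, "end": end, "type": ptype, "attr": attr}
        )
    return partitions


def delete_commands(device_path: str, partitions: list) -> list:
    """Build one 'partedUtil delete' command per partition."""
    return [f"partedUtil delete {device_path} {p['number']}" for p in partitions]


if __name__ == "__main__":
    # Hypothetical output; real sector counts will differ.
    sample = "1000 255 63 16065000\n1 2048 8000000 0 0\n2 8000001 16064999 0 0\n"
    for command in delete_commands("/vmfs/xxx/xxx", parse_parted_get(sample)):
        print(command)
```

Review the generated commands before pasting them into an ESXi shell, since deleting a partition is not reversible.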

MTU setting

On Dell switches, an MTU of 9000 is actually (9 * 1024) = 9216. Ugh, so now I have set ...

BAM! The VSAN datastore is rocking and I can now write to it. However, this scenario prompted some reflection on how MTU actually impacts VSAN. I did some further testing and came up with the following conclusions:

In conclusion, would I advise configuring Jumbo Frames with VSAN? No. Unless you’re the type who prefers all risk and no reward…

http://flcloudlabs.com/vsan-and-mtu/

https://communities.vmware.com/message/2455828

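Follow these guidelines

Virtual SAN requires a dedicated 1 Gb network. As a best practice, use a 10 Gb network.

On each host, dedicate at least one physical 1 Gb Ethernet NIC to Virtual SAN. You can also provision an additional physical NIC as a failover NIC.

You can use a vSphere standard switch on each host, or you can configure the environment to use a vSphere Distributed Switch.

For each network that you use for Virtual SAN, configure a VMkernel port group with the Virtual SAN port property activated.

Use the same Virtual SAN network label for each port group, and make sure the labels are consistent across all hosts.

Use jumbo frames for best performance.

Virtual SAN supports IP-hash load balancing, but a performance improvement cannot be guaranteed for all configurations. You can benefit from IP hash when Virtual SAN is one of many IP-hash consumers; in that case, IP hash performs the load balancing. However, if Virtual SAN is the only consumer, you might not see any change. This applies especially to 1G environments. For example, if you use four 1G physical adapters with IP hash for Virtual SAN, you might not be able to use more than 1G. The same is true for all NIC teaming policies that are currently supported. For details on NIC teaming, see the "Networking Policies" section of the vSphere Networking guide.

Virtual SAN does not support multiple VMkernel adapters on the same subnet for load balancing. Multiple VMkernel adapters on different networks, such as VLANs or separate physical fabrics, are supported.

You should connect all hosts participating in Virtual SAN to a single L2 network that has multicast (IGMP snooping) enabled. If the hosts participating in Virtual SAN span multiple switches or even L3 boundaries, you must make sure the network is configured correctly to enable multicast connectivity. You can change the default multicast addresses if your network environment requires it, or if you run multiple Virtual SAN clusters on the same L2 network.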

http://www.tomsitpro.com/articles/essential-virtual-san-vsan-book-excerpt,2-888.html

http://pubs.vmware.com/vsphere-60/index.jsp?topic=%2Fcom.vmware.vsphere.virtualsan.doc%2FGUID-D52F00FF-CA2C-4DDD-B76B-B8BF211BB0E8.html

Network Misconfiguration Status in a Virtual SAN Cluster

After you enable Virtual SAN on a cluster, the datastore is not assembled correctly because of a detected network misconfiguration.

Problem

After you enable Virtual SAN on a cluster, on the Summary tab for the cluster the Network Status for Virtual SAN appears as Misconfiguration detected.

Cause

One or more members of the cluster cannot communicate because of either of the following reasons:

How do you resolve it? Well, a number of our VSAN beta customers discussed some options on the community, and these were the recommendations:

http://cormachogan.com/2014/01/21/vsan-part-15-multicast-requirement-for-networking-misconfiguration-detected/

Changing the multicast address used for a VMware Virtual SAN Cluster (2075451)

Purpose

This article provides steps to change the multicast address for each VMware Virtual SAN cluster. If there are multiple Virtual SAN clusters on the same Layer 2 network, each host receives all multicast messages. In order to reduce the amount of multicast traffic for each VSAN cluster, it is necessary to change the multicast address for each VMware Virtual SAN cluster.

Warning: If you change the multicast address on an active Virtual SAN cluster, it can lead to network partitioning until all of the ESXi hosts in the cluster are on the same multicast network. It is recommended to organize downtime before making this change.

Resolution

In order to change the multicast address for VMware Virtual SAN, perform these steps on each ESXi host within the Virtual SAN Cluster.

To change the multicast address on an ESXi 5.5/6.0 host configured for Virtual SAN:

Using tcpdump-uw to collect packet traces to troubleshoot network issues

Usage: tcpdump-uw
-i = interface
-n = no IP or Port name resolution
-s0 = Collect entire packet
-t = no timestamp
-c = number of frames to capture
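To keep the flags above straight when capturing on several hosts, a small helper can assemble the command line. This is a Python sketch that only builds the argument list from the five flags listed here; it does not run the capture. The interface name `vmk2` is a hypothetical VSAN VMkernel port.

```python
import shlex


def build_tcpdump_uw(interface, count=None, resolve_names=False,
                     full_packet=True, timestamps=False):
    """Build a tcpdump-uw argument list from the flags listed in these notes."""
    argv = ["tcpdump-uw", "-i", interface]
    if not resolve_names:
        argv.append("-n")   # no IP or port name resolution
    if full_packet:
        argv.append("-s0")  # collect the entire packet
    if not timestamps:
        argv.append("-t")   # no timestamp
    if count is not None:
        argv += ["-c", str(count)]  # number of frames to capture
    return argv


if __name__ == "__main__":
    # vmk2 is a placeholder; use the VMkernel port tagged for VSAN traffic.
    print(shlex.join(build_tcpdump_uw("vmk2", count=20)))
```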

Generate Multicast traffic

nc -uz <destination-ip> <destination-port>

== Monitor VSAN VMKernel Port network traffic ==

esxcli network ip connection list

Troubleshooting

http://www.virten.net/2014/01/manage-vsan-with-rvc-part-4-troubleshooting/

No space left on device

http://www.m80arm.co.uk/2013/12/ha-issues-with-vsan-beta-refresh.html

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1007638

 stat -f /vsanDatastore

* The "no space left on device" error confused me, so to rectify the issue I tried the following:

  1. Increased the size of the VMDK used for the ESXi installation from 2GB to 5GB
  2. Inflated the ESXi installation VMDK so it was Thick Eager Zeroed, in case this was causing any strange issues
  3. Rebuilt the nested environment manually (not using the .ova supplied by William Lam)

=== Docker Container for the Ruby vSphere Console (RVC) ===

docker pull lamw/rvc
docker run --rm -it lamw/rvc
docker run --rm -it -p 80:8010 lamw/rvc

http://www.virtuallyghetto.com/2015/11/docker-container-for-the-ruby-vsphere-console-rvc.html

System logs are stored on non-persistent storage

To verify the location:

Browse to the host in the vSphere Web Client navigator.

Note: You must reboot the host for the changes to take effect.

Note: To log to a datastore, the Syslog.global.logDir entry should be in the format of [Datastorename]/foldername. To log to the scratch partition set in the ScratchConfig.CurrentScratchLocation, the format is blank or []/foldername.
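The two formats in the note above are easy to mix up, so here is a minimal Python sketch that classifies a `Syslog.global.logDir` value. It handles only the forms described in this note (`[Datastorename]/foldername`, blank, and `[]/foldername`) and rejects anything else. The datastore and folder names in the example are hypothetical.

```python
import re

# Matches "[name]/folder" and "[]/folder"; the datastore name may be empty.
_LOGDIR_RE = re.compile(r"^\[(?P<datastore>[^\]]*)\]\s*/?(?P<folder>.*)$")


def parse_log_dir(value: str) -> dict:
    """Classify a Syslog.global.logDir value as datastore or scratch logging."""
    value = value.strip()
    if value == "":
        # Blank means: log to the scratch location.
        return {"target": "scratch", "datastore": None, "folder": ""}
    match = _LOGDIR_RE.match(value)
    if match is None:
        raise ValueError(f"unrecognised logDir format: {value!r}")
    datastore = match.group("datastore").strip()
    folder = match.group("folder").strip("/")
    if datastore == "":
        return {"target": "scratch", "datastore": None, "folder": folder}
    return {"target": "datastore", "datastore": datastore, "folder": folder}


if __name__ == "__main__":
    for candidate in ["[datastore1]/esxi-logs", "[]/esxi-logs", ""]:
        print(repr(candidate), "->", parse_log_dir(candidate))
```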

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2032823

How to Delete VSAN Datastore

http://www.vladan.fr/how-to-delete-vsan-datastore/

VSAN upload problem

2015-11-19T19:05:08.31Z DEBUG vsan-health[Thread-7] [VsanHealthServer::do_GET] In do_GET: ('127.0.0.1', 36677)
2015-11-19T19:05:08.31Z WARNING vsan-health[Thread-7] [VsanHealthServer::do_GET] do_GET: isStringResponse = True
2015-11-19T19:05:08.32Z INFO vsan-health[Thread-7] [VsanHealthServer::log_message] ('127.0.0.1', 36677) - - "GET /vsanHealth/health HTTP/1.1" 200 -
2015-11-19T19:05:08.32Z DEBUG vsan-health[Thread-7] [VsanHealthServer::do_GET] Done do_Get: ('127.0.0.1', 36677) (took 0.0)
2015-11-19T19:05:35.507Z WARNING vsan-health[Thread-1] [VsanPyVmomiProfiler::InvokeMethod] Invoke: mo=ServiceInstance, info=CurrentTime
2015-11-19T19:05:38.475Z DEBUG vsan-health[Thread-7] [VsanHealthServer::do_GET] In do_GET: ('127.0.0.1', 36677)
2015-11-19T19:05:38.475Z WARNING vsan-health[Thread-7] [VsanHealthServer::do_GET] do_GET: isStringResponse = True
2015-11-19T19:05:38.475Z INFO vsan-health[Thread-7] [VsanHealthServer::log_message] ('127.0.0.1', 36677) - - "GET /vsanHealth/health HTTP/1.1" 200 -
2015-11-19T19:05:38.476Z DEBUG vsan-health[Thread-7] [VsanHealthServer::do_GET] Done do_Get: ('127.0.0.1', 36677) (took 0.0)
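To sift through a longer vsan-health log, the lines can be split into fields and counted by level. The Python sketch below infers the layout from the excerpt above (timestamp, level, `service[thread]`, `[source]`, message); it is not an official log format description, and the function names are made up.

```python
import re
from collections import Counter

# Layout inferred from the excerpt: "<ts> <LEVEL> <service>[<thread>] [<source>] <message>"
_LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<service>[\w-]+)\[(?P<thread>[^\]]+)\]"
    r"\s+\[(?P<source>[^\]]+)\]\s+(?P<message>.*)$"
)


def parse_line(line: str):
    """Return a dict of fields, or None if the line does not match the layout."""
    match = _LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def count_levels(lines) -> Counter:
    """Count parsed lines per log level, skipping lines that do not match."""
    parsed = (parse_line(line) for line in lines)
    return Counter(entry["level"] for entry in parsed if entry is not None)


if __name__ == "__main__":
    sample = [
        "2015-11-19T19:05:08.31Z DEBUG vsan-health[Thread-7] [VsanHealthServer::do_GET] In do_GET: ('127.0.0.1', 36677)",
        "2015-11-19T19:05:08.31Z WARNING vsan-health[Thread-7] [VsanHealthServer::do_GET] do_GET: isStringResponse = True",
        "2015-11-19T19:05:35.507Z WARNING vsan-health[Thread-1] [VsanPyVmomiProfiler::InvokeMethod] Invoke: mo=ServiceInstance, info=CurrentTime",
    ]
    print(count_levels(sample))
```

In the excerpt above, every health request returns HTTP 200, so filtering for levels other than DEBUG and INFO is a reasonable first pass when looking for the cause of the upload problem.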

Remove a disk group from a host

Entering Maintenance Mode is done by selecting the correct ESXi host and then clicking on the maintenance mode icon in the Disk Management section on Virtual SAN in the vSphere web client (third icon from the left):

Shutdown VSAN cluster

To recap, if shutting down the whole of the VSAN cluster, use maintenance mode for the hosts, and do not move VMs or migrate any data.