Server Fault Asked on December 9, 2021
We have a 4-host ESXi 6.5 cluster with DRS fully automated. When checking the history, we see one specific (big) VM (6 CPUs, 64 GB memory) being vMotioned by DRS roughly 10 times per day. Someone on the team claims we should make DRS less aggressive and exclude this big machine from DRS.
But I’m wondering, what’s the point of that? Can’t we just let DRS do its job since a vMotion should have no impact on guest and cluster performance? I’d like to have some arguments to tell him not to make things too complex by applying exclusions and so on.
First and foremost, the logic behind why DRS moves something is very complicated, so trying to figure out why it does something is usually the path to madness.
That being said, lowering the aggression setting is what's usually done when DRS is a bit too trigger-happy, unless there's some other obvious underlying issue, such as a VM being too close to the maximum configuration of a host (VMware isn't a very happy camper if you assign 90% of host resources to a single VM). The aggression setting doesn't really matter that much: DRS will still kick in if any host becomes too congested, it'll just be less aggressive about it. As I stated above, because DRS considers so many factors, the aggression setting isn't really comparable between environments. Usually 3 is a good starting point, but some environments need it dropped down a notch or two.
Exclusions are a bit of a different beast; they are best reserved for VMs that don't take too kindly to being moved. An example is hot-standby software that checks very frequently whether its peer is online. I've seen applications that start to fail over if the hot peer is unresponsive for more than a millisecond. Another use for exclusions is VMs that you want to stay put. A good example is a stretched cluster spanning multiple datacenters: there it makes sense to exclude your domain controllers from DRS and manually place them on certain hosts in certain datacenters, so that DRS doesn't get too clever and place them all in the same datacenter.
Answered by Stuggi on December 9, 2021
You're moving tens of GB of RAM over the network from one host to another, so you DO have an impact. I would strongly recommend lowering the aggression of DRS. You gain nothing by moving VMs 10 times a day; DRS will help you get to an overall balanced load in the cluster and then more or less maintain it when you create new VMs (you will get a recommended target host). It will also rebalance the cluster when there are larger discrepancies between the hosts.
Answered by Fatman on December 9, 2021
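To put a rough number on the point above about moving tens of GB of RAM over the network, here is a back-of-the-envelope sketch. The 64 GB and 10 migrations per day come from the question; the link speeds are assumptions, since the question doesn't say how fast the vMotion network is. It also assumes the full 64 GB is copied exactly once at line rate, which ignores re-copying of pages that change during the migration as well as any protocol overhead, so treat the result as an order-of-magnitude figure only. The function name is made up for this illustration.

```python
def vmotion_estimate(mem_gb, link_gbit_per_s, migrations_per_day=1):
    """Rough estimate for copying a VM's memory once over the network.

    Assumes the whole configured memory is sent exactly once at full
    line rate (no re-sent dirty pages, no overhead), with 1 GB = 8 Gbit.
    Returns (seconds per migration, GB transferred per day).
    """
    seconds_per_move = mem_gb * 8 / link_gbit_per_s
    gb_per_day = mem_gb * migrations_per_day
    return seconds_per_move, gb_per_day


if __name__ == "__main__":
    # 64 GB VM, 10 vMotions per day (from the question);
    # 1, 10 and 25 Gbit/s are assumed link speeds.
    for link in (1, 10, 25):
        secs, daily = vmotion_estimate(64, link, 10)
        print(f"{link:>2} Gbit/s: about {secs:.0f} s per vMotion, "
              f"{daily} GB moved per day")
```

On an assumed 10 Gbit/s link this works out to roughly 50 seconds of saturated link per migration and 640 GB per day for this one VM, which is small but not nothing.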
vMotions do have a small impact on the cluster: they eat up a bit of hypervisor time and obviously use network bandwidth too. Generally speaking, leaving DRS on makes sense, but if you want to lower the aggression, that's fine too. Given the VM's resource requirements, I wonder whether its moving around this much means you need more CPU and/or memory. Also, why have you not moved to 6.7 yet?
Answered by Chopper3 on December 9, 2021