If you’ve ever taken a snapshot in Microsoft Hyper-V, Nutanix, or VMware, you have probably noticed that the differencing disk it creates grows much faster than a normal disk would. One of our clients was struggling to understand why, so we provided them with this explanation.

Here are some quick Snapshot / Checkpoint background facts that explain the process.

Why a Snapshot is NOT a Backup:

If you don’t understand why a Snapshot is not a Backup, read our simple explanation HERE. Put simply, when a snapshot (what Microsoft Hyper-V calls a Checkpoint) is taken, the original disk is set to READ ONLY and a new temporary disk is created to hold any WRITES. The snapshot depends on the original disk, so it cannot restore anything if that disk is lost.

Differencing Disk File Names:

  • Microsoft Hyper-V: Hyper-V Checkpoints use differencing disks with the file extension .avhdx.
  • Nutanix: Nutanix snapshots do not create a separate differencing disk file, so there is no specific file extension like AVHDX. New writes are redirected to new blocks and tracked in the storage layer’s metadata.
  • VMware: VMware snapshots use delta (child) disks with the file extension .vmdk. When a snapshot is taken, a -delta.vmdk file (or a -sesparse.vmdk file on newer datastores) is created to store the changes.

Why Snapshots / Checkpoints Grow So Fast:

Nutanix, VMware, and Hyper-V differencing disks (.AVHDX files in Hyper-V) grow in size MUCH faster than an original source disk (.VHDX) because:

  1. OVERWRITE EXISTING SPACE
    If a user opens a file, makes even a tiny change, and saves it, and that “write” is allowed to go to the original disk, it simply overwrites the existing “block” of data and consumes no notable additional space. Contrast that with a differencing disk: the write must consume new disk space because the file in question (or more accurately, the “block” that changed) does not exist on the differencing disk yet.
  2. BLOCK ALLOCATION TABLE FOR TRACKING
    Metadata about the changes is stored in a Block Allocation Table/Bitmap (BAT), which keeps track of which blocks in the parent disk have been modified and are now stored in the differencing disk. You can think of this as a log, but it really isn’t: it is a lookup table of where each block currently lives, not a sequential record of actions.
    • NOTE: While the BAT on a differencing disk could be several times the size of the BAT on a parent disk, it will still be quite small and is unlikely to exceed 1GB for every 1TB of data. We mention the BAT here mostly to explain how differencing disks work.
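
To make those two points concrete, here is a minimal toy model in Python. It is only a sketch: the class names (ParentDisk, DifferencingDisk) and the dictionary used as a BAT are illustrative assumptions, not how any hypervisor actually lays out its files.

```python
class ParentDisk:
    """Toy read-only base disk: a fixed list of blocks (hypothetical model)."""

    def __init__(self, blocks):
        self._blocks = list(blocks)

    def read(self, index):
        return self._blocks[index]


class DifferencingDisk:
    """Toy differencing disk: stores only blocks written after the snapshot."""

    def __init__(self, parent):
        self.parent = parent
        self.bat = {}     # parent block index -> slot in self.blocks
        self.blocks = []  # changed blocks held by the differencing disk

    def write(self, index, data):
        if index in self.bat:
            # Block already lives here: overwrite in place, no growth.
            self.blocks[self.bat[index]] = data
        else:
            # First write to this block since the snapshot: allocate new space.
            self.bat[index] = len(self.blocks)
            self.blocks.append(data)

    def read(self, index):
        # The BAT decides whether to read from this disk or from the parent.
        if index in self.bat:
            return self.blocks[self.bat[index]]
        return self.parent.read(index)


parent = ParentDisk([b"A", b"B", b"C"])
diff = DifferencingDisk(parent)
diff.write(1, b"b")  # first change to block 1 consumes a new block
diff.write(1, b"x")  # second change reuses the same block
```

After these two writes the differencing disk holds exactly one block, the parent still holds its original data, and reads of untouched blocks fall through to the parent.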

While those are the two big reasons, there are some lesser things to consider:

  • Frequent Changes: If the virtual machine experiences frequent changes, the differencing disk will grow quickly as it needs to store all the modified blocks.
  • Large Files: Modifying large files can cause significant growth in the differencing disk, even if only small portions of the file are changed.
  • Application Behavior: Some applications may write data in a way that causes more blocks to be modified, leading to faster growth of the differencing disk.
  • Disk Fragmentation: Over time, fragmentation can cause the differencing disk to grow as it needs to allocate additional space for fragmented data.
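
The effect of write patterns can be estimated with some simple arithmetic. The sketch below assumes the differencing disk allocates whole blocks and uses a 2 MiB block size purely for illustration; real block or grain sizes vary by hypervisor and disk format, and the function name is our own.

```python
def delta_growth_bytes(writes, block_size):
    """Estimate differencing disk growth for a list of guest writes.

    writes: iterable of (offset, length) byte ranges written by the guest.
    block_size: assumed allocation unit of the differencing disk, in bytes.
    """
    touched = set()
    for offset, length in writes:
        if length <= 0:
            continue
        first = offset // block_size
        last = (offset + length - 1) // block_size
        touched.update(range(first, last + 1))
    # Every block touched at least once is allocated in full.
    return len(touched) * block_size


BLOCK = 2 * 1024 * 1024  # assumed 2 MiB allocation unit

# 100 small 4 KiB writes scattered across a large file
scattered = [(i * 10 * BLOCK, 4096) for i in range(100)]
# the same amount of data written back to back
sequential = [(i * 4096, 4096) for i in range(100)]

scattered_growth = delta_growth_bytes(scattered, BLOCK)    # 100 blocks
sequential_growth = delta_growth_bytes(sequential, BLOCK)  # 1 block
```

Both workloads write the same 400 KiB of data, but under these assumptions the scattered pattern allocates 100 blocks (200 MiB) while the sequential pattern allocates a single block (2 MiB).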

What Happens When a Snapshot / Checkpoint is Deleted?

When a Hyper-V checkpoint or a Nutanix / VMware Snapshot is deleted, the system merges the differencing disk (AVHDX file in the case of Hyper-V) back into the parent disk (VHDX file in the case of Hyper-V) by copying only the changed blocks. Here’s how it works:

  • Block-Level Copy: During the merge process, the hypervisor (Hyper-V in Microsoft’s case) identifies the blocks that have changed since the checkpoint was created. It then copies these changed blocks from the differencing disk to the parent disk. This ensures that only the modified portions are merged, rather than the entire file.
  • Sequential Merge: If there are multiple checkpoints, the merge process occurs sequentially. Each differencing disk is merged into its parent disk one by one until all changes are consolidated into the original VHDX file.
  • Efficient Merge: The merge process is efficient because it operates at the block level, minimizing the amount of data that needs to be copied. This helps reduce the time and resources required for the merge.
  • No Log Replay: Hyper-V does not replay a log of actions. Instead, it directly copies the delta changes (modified blocks) from the differencing disk to the parent disk.
  • System Performance: During the merge process, the system may experience increased disk I/O activity, which can impact performance. It’s important to ensure that the system remains stable and that no other intensive operations are running during the merge.

This approach ensures that the virtual machine’s state is accurately preserved while optimizing the merge process.
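
The block-level, sequential merge described above can be sketched in a few lines of Python. This is a simplified model under our own assumptions: each differencing disk is represented as a dictionary of changed blocks, the function names are hypothetical, and the chain is applied oldest to newest, which produces the same final disk contents but does not claim to match the order any hypervisor uses internally.

```python
def merge_into_parent(parent_blocks, delta):
    """Copy only the changed blocks from one differencing disk into its parent.

    parent_blocks: list of blocks, modified in place.
    delta: dict mapping block index -> new data (toy differencing disk).
    Returns the number of blocks copied.
    """
    for index, data in delta.items():
        parent_blocks[index] = data
    return len(delta)


def merge_chain(base_blocks, deltas):
    """Merge a chain of differencing disks, ordered oldest to newest."""
    copied = 0
    for delta in deltas:
        copied += merge_into_parent(base_blocks, delta)
    return copied


base = ["A", "B", "C", "D"]
checkpoints = [
    {1: "b1"},           # first checkpoint changed block 1
    {1: "b2", 3: "d2"},  # second checkpoint changed blocks 1 and 3
]
blocks_copied = merge_chain(base, checkpoints)
```

No log is replayed here: only three blocks are copied, the newest version of block 1 wins, and blocks 0 and 2 are never touched.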

How Long Can You Keep a Differencing Disk?

This question comes up all the time when we talk to clients and new techs.

It is not at all uncommon for a customer to say they want to keep their snapshots for a week or two. However, if you think about it, there is just no way a company is going to roll back a week’s worth of changes, so keeping a snapshot for more than one day generally doesn’t make sense.

Here are our real-world experience notes:

  1. 98% of the time we delete (aka, merge) snapshots the day after patching, at about noon. That gives the customer a few hours in the morning to validate all is well.
  2. VMware’s official stance is “do not use a single snapshot for more than 72 hours”
  3. Performance degrades after three snapshots on the same disk, so don’t keep multiple snapshots of the same drive for very long
  4. You cannot expand a disk (in Hyper-V, VMware, or Nutanix) if it has a snapshot on it, so you want to get rid of snapshots quickly
  5. Domain Controllers, SQL, Exchange Servers will never (well, almost never) be rolled back using a snapshot, so most techs will tell you not to even bother taking a snapshot of these. However, we do snap those types of machines in case something goes wrong and we want to create a NEW virtual machine from the pre-snap hard drive so we can grab a file or two from it.
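
If you want to enforce a retention rule like the 72-hour guidance above, the check itself is simple. The Python sketch below assumes you already have an inventory of snapshot names and creation times; gathering that inventory from your hypervisor’s tooling is outside this example, and the function name and sample data are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def stale_snapshots(snapshots, now, max_age=timedelta(hours=72)):
    """Return the names of snapshots older than max_age.

    snapshots: iterable of (name, created_at) pairs with timezone-aware datetimes.
    now: the current time, passed in so the check is easy to test.
    """
    return [name for name, created_at in snapshots if now - created_at > max_age]


# Hypothetical inventory for illustration only
now = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
inventory = [
    ("web01-pre-patch", datetime(2024, 1, 9, 20, 0, tzinfo=timezone.utc)),
    ("sql01-pre-patch", datetime(2024, 1, 5, 20, 0, tzinfo=timezone.utc)),
]
overdue = stale_snapshots(inventory, now)
```

With this sample data only sql01-pre-patch is flagged. To match our own practice of merging the day after patching, pass a shorter max_age such as timedelta(hours=24).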

