Matt

Reputation: 27001

VM does not boot - "Content ID mismatch" in machine log. How can I fix this?

What I discovered recently is that it can be dangerous to mount a virtual disk (a *.VMDK file) in VMware from outside the VM it belongs to.

Description of the scenario: I created a new virtual machine (let's call it VM B) and booted it from an ISO image containing an imaging tool (such as Parted Magic). The tool read from a virtual disk belonging to a different virtual machine (named VM A) in order to write its content to an external physical disk. That virtual disk file was only mounted; I did not write any changes to its structure.


Issue: After I shut down VM B and tried to boot VM A, to my surprise it no longer booted. Looking into the machine's log file (vmware.log), I found a "Content ID mismatch" error in VM A's log.

What happened? How can I fix this?

Upvotes: 0

Views: 3419

Answers (1)

Matt

Reputation: 27001

Here's some background (the information provided is valid for VMware Workstation 9; if you're using a different version, it is not guaranteed to be fully applicable):

VMware creates a new VMDK file for each snapshot. Inside each of those files there is a structure called the disk descriptor, which looks as follows:

# Disk DescriptorFile
version=1
CID=acd1cf11
parentCID=ffffffff
createType="vmfs"

When such a VMDK file is accessed, VMware creates a new unique CID and stores it inside the file. Every VMDK file also has a parentCID. The first file in the chain is marked with parentCID=ffffffff, which tells VMware that there is no further parent. Every subsequent parentCID refers to the CID of an earlier created parent, so the files build up a linked list this way.
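
To make the two fields concrete, here is a minimal Python sketch that pulls CID and parentCID out of a descriptor. The helper name read_cids is made up for illustration, and the sketch assumes the descriptor is available as plain bytes in exactly the layout shown above.

```python
import re

# Matches the "CID=..." and "parentCID=..." lines of a disk descriptor.
# The ^ anchor keeps "CID=" from matching inside "parentCID=".
_FIELD = re.compile(rb"^(CID|parentCID)=([0-9a-fA-F]{8})\s*$", re.MULTILINE)

ROOT = "ffffffff"  # parentCID value of the first disk in the chain


def read_cids(data: bytes) -> dict:
    """Return the first CID and parentCID found in the descriptor bytes."""
    found = {}
    for name, value in _FIELD.findall(data):
        found.setdefault(name.decode("ascii"), value.decode("ascii").lower())
    return found


descriptor = (
    b"# Disk DescriptorFile\n"
    b"version=1\n"
    b"CID=acd1cf11\n"
    b"parentCID=ffffffff\n"
    b'createType="vmfs"\n'
)

fields = read_cids(descriptor)
is_root = fields["parentCID"] == ROOT  # True for the base disk
```

For a real VMDK file you would pass in the bytes read from the file; where exactly the descriptor sits inside a binary VMDK depends on the disk type, so treat that part as something to verify on your own files.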

The problem I encountered: if you add a VMDK file that belongs to one virtual machine to another virtual machine and it is mounted there, a new CID is created and the linked list can break.

In that case, the virtual machine the VMDK file belongs to no longer boots.

Example:

---------- C:\DATA\VMWARE\VMWINDOWS7X64.VMDK
CID=acd1cf11
!OK! parentCID=ffffffff

---------- C:\DATA\VMWARE\VMWINDOWS7X64-000003.VMDK
CID=e54a0beb
!OK! parentCID=627c6ec2

---------- C:\DATA\VMWARE\VMWINDOWS7X64-000002.VMDK
CID=627c6ec2
?NOTFOUND? parentCID=b43e1a6f

---------- C:\DATA\VMWARE\VMWINDOWS7X64-000006.VMDK
CID=0454e2f0
!OK! parentCID=8f139197

---------- C:\DATA\VMWARE\VMWINDOWS7X64-000001.VMDK
CID=92718e8b
?NOTFOUND? parentCID=ad44b86e

[...]

Where I've added !OK! on the parentCID line, a parent VMDK with a matching CID exists. In the complete, real case there are two parents in total that cannot be found (marked as ?NOTFOUND? in this example). VMware has an article describing this issue ("The parent virtual disk has been modified since the child was created" error (1007969)). I tried to mount earlier versions of the VMDK files but didn't succeed, because the first file in the chain, VMWINDOWS7X64.VMDK, is no longer referenced by any of the other files.
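
The !OK! / ?NOTFOUND? marking can be reproduced with a small Python sketch. The function name annotate and the dictionary layout are assumptions made for illustration; the CIDs are the ones from the listing above. VMWINDOWS7X64-000006.VMDK is left out because its parent is in the elided part of the listing.

```python
ROOT = "ffffffff"  # parentCID of the base disk, which has no parent


def annotate(disks):
    """Map each file name to '!OK!' or '?NOTFOUND?'.

    disks: {file name: (CID, parentCID)}
    A parentCID is OK if it is the root marker or equals the CID of
    some other disk in the set.
    """
    known = {cid for cid, _ in disks.values()}
    return {
        name: "!OK!" if parent == ROOT or parent in known else "?NOTFOUND?"
        for name, (_, parent) in disks.items()
    }


broken = {
    "VMWINDOWS7X64.VMDK": ("acd1cf11", "ffffffff"),
    "VMWINDOWS7X64-000003.VMDK": ("e54a0beb", "627c6ec2"),
    "VMWINDOWS7X64-000002.VMDK": ("627c6ec2", "b43e1a6f"),
    "VMWINDOWS7X64-000001.VMDK": ("92718e8b", "ad44b86e"),
}

for name, status in annotate(broken).items():
    print(status, name)
```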

Editing those files with a text editor, as suggested by the VMware article mentioned above, is not an option, because

  1. I would not touch a binary file with a text editor
  2. I haven't found an editor which can handle such large files in Windows.

I have seen some questions on Stack Overflow asking for find-and-replace code in C#, but found no satisfying answer yet. Then I found a hex editor for Windows which can handle large binary files; its name is HxD. It is free for commercial and non-commercial use.

With this editor you can locate the disk descriptor structure shown above, patch the CIDs and repair the broken chain.

[Image: VMDKExample]

Actions:

  • Back up the entire virtual machine. Work on a copy.
  • Use the list of CIDs and parent CIDs created earlier (like the one shown in the example above).
  • Update it so each parentCID points to the appropriate CID. The numbering of the snapshot files might help here, but in my case the snapshot numbers were not always ascending. Use the parentFileNameHint contained in the VMDK file first; if the order is still not clear, sort the VMDK files descending by date and time in Explorer - the oldest is the root file (parentCID=ffffffff). It turned out that this was almost the order you get when you follow the linked list of CIDs.
  • Patch each VMDK file using the editor, but patch only those CIDs/parentCIDs which don't already match.
  • Try to boot the VM. If it succeeds, you are lucky and have repaired it. If it doesn't boot, remove the virtual disk and try another one in the chain. In my case, after some tries it accepted the file VMWINDOWS7X64-000003.VMDK and booted correctly. That means that not all snapshots were recovered, but the machine was working again. See also the notes below; I was finally able to fully repair it.
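
If you would rather script the patch step than do it by hand in a hex editor, the same-length overwrite could look roughly like the following Python sketch. It is not what I used (I used HxD). The function name patch_field and the head_size parameter are made up, and the sketch assumes the descriptor text sits within the first 64 KiB of the file, which you should verify for your disk type. Run it only on a copy.

```python
import re

_HEX8 = re.compile(r"[0-9a-fA-F]{8}")


def patch_field(path, field, old, new, head_size=65536):
    """Overwrite field=old with field=new in place, keeping the file size.

    field must be "CID" or "parentCID"; old and new are 8 hex digits.
    Returns True if a matching line was found and patched.
    Assumption: the descriptor lies within the first head_size bytes.
    """
    if field not in ("CID", "parentCID"):
        raise ValueError("field must be 'CID' or 'parentCID'")
    if not (_HEX8.fullmatch(old) and _HEX8.fullmatch(new)):
        raise ValueError("old and new must be 8 hex digits")

    # Anchor at line start so "CID=" never matches inside "parentCID=".
    pattern = re.compile(
        rb"^" + field.encode("ascii") + rb"=" + old.encode("ascii") + rb"\s*$",
        re.MULTILINE | re.IGNORECASE,
    )
    with open(path, "r+b") as f:
        match = pattern.search(f.read(head_size))
        if match is None:
            return False
        # Skip "field=" and overwrite exactly the 8 value bytes.
        f.seek(match.start() + len(field) + 1)
        f.write(new.lower().encode("ascii"))
    return True
```

A call such as patch_field(r"C:\COPY\VMWINDOWS7X64-000002.VMDK", "parentCID", "b43e1a6f", "92718e8b") would then correspond to one manual edit in the hex editor; the path here is only a placeholder for your working copy.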

The fix that finally worked was (changes marked with !FIX!):

---------- C:\DATA\VMWARE\VMWINDOWS7X64-000003.VMDK
CID=e54a0beb
!OK! parentCID=627c6ec2

---------- C:\DATA\VMWARE\VMWINDOWS7X64-000002.VMDK
CID=627c6ec2
!FIX! parentCID=92718e8b

---------- C:\DATA\VMWARE\VMWINDOWS7X64-000001.VMDK
CID=92718e8b
!FIX! parentCID=acd1cf11

---------- C:\DATA\VMWARE\VMWINDOWS7X64.VMDK
CID=acd1cf11
!OK! parentCID=ffffffff
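
A candidate fix like this one can be checked on paper before touching any file by following the parentCID links from the newest snapshot down to the root. The Python sketch below does that for the fixed values above; walk_chain and the dictionary layout are hypothetical names chosen for illustration.

```python
ROOT = "ffffffff"  # parentCID of the base disk


def walk_chain(disks, leaf):
    """Follow parentCID links from leaf to the root disk.

    disks: {file name: (CID, parentCID)}
    Returns the file names in order, leaf first, root last.
    Raises LookupError if a parentCID matches no CID, and
    ValueError if the links loop back on themselves.
    """
    by_cid = {cid: name for name, (cid, _) in disks.items()}
    chain = [leaf]
    parent = disks[leaf][1]
    while parent != ROOT:
        name = by_cid.get(parent)
        if name is None:
            raise LookupError("no disk with CID " + parent)
        if name in chain:
            raise ValueError("cycle at " + name)
        chain.append(name)
        parent = disks[name][1]
    return chain


fixed = {
    "VMWINDOWS7X64-000003.VMDK": ("e54a0beb", "627c6ec2"),
    "VMWINDOWS7X64-000002.VMDK": ("627c6ec2", "92718e8b"),
    "VMWINDOWS7X64-000001.VMDK": ("92718e8b", "acd1cf11"),
    "VMWINDOWS7X64.VMDK": ("acd1cf11", "ffffffff"),
}

chain = walk_chain(fixed, "VMWINDOWS7X64-000003.VMDK")
```

If several orderings are possible, each candidate can be written down as such a dictionary and checked this way first. A chain that walks cleanly is only a necessary condition; whether the VM actually boots still has to be tried.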

Notes:

  • There is no guarantee that patching the CIDs will bring back the VM to a working state.
  • The scenario assumes that you haven't written to the VMDK files the snapshots rely on.
  • You may find several ways to restore the linked list of CIDs - write them all down and then try them (remember: you have a backup!). If you are lucky, you find the right one and are able to fully recover everything. In my case there were two possible ways to re-create the linked list. One allowed me to partly recover; the other one finally restored everything as it was before the issue occurred.
  • I recommend that after a successful fix you should create a full clone from the current state of the virtual machine in a fresh directory. This provides you with clean, consistent files.

Upvotes: 1
