Hi
Have you ever used tools like TestDisk, PhotoRec, scalpel or commercial recovery tools against a large VMFS datastore?
If yes - you probably noticed that these tools are slow.
When I started using these tools about 10 years ago we were dealing with VMFS 3 and VMFS 5, and a typical datastore had a size of several hundred GB up to 2 TB.
Even back then, scanning a datastore could take half a day or more.
Scanning a datastore that was in production was something that you would only do in very important cases.
Nowadays the size of a virtual disk is often in the TB range, and raw scans with tools like scalpel are so time-consuming that you will only consider them in an emergency.
If you have a 60 TB datastore and search for a lost vmdk the scan can easily take a week - and so you think twice before you would even attempt such a scan.
Anyway - I often must run scans of very large datastores when doing VMFS recovery work, so I had to look for new procedures.
I can't do much about the reading speed - that would require in-depth knowledge of ESXi or Linux internals, and I don't know enough about those to hope for significant performance boosts.
So the approach I used was to radically simplify and reduce the items that my scans must detect.
When a tool like TestDisk reads every KB of the datastore and checks that location against a database of hundreds of different signatures, it is obvious that it will be very slow ... and will need days to scan a 60 TB datastore.
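As a rough illustration of that idea only (a sketch under stated assumptions, not the actual procedure, whose details are still to come): instead of testing every KB against hundreds of signatures, a scan can test a single signature at a single fixed position per block. The helper name scan_aligned and the assumption that candidates start on 1 MB boundaries are made up for this example, and it assumes 512-byte sectors.

```shell
#!/bin/sh
# Hypothetical helper: look for the GPT header signature "EFI PART"
# in sector 1 of every 1 MB block of an image or device.
# Assumptions (not from the original post): candidates start on 1 MB
# boundaries and use 512-byte sectors.
# Usage: scan_aligned IMAGE_OR_DEVICE SIZE_IN_MB
scan_aligned() {
    img=$1
    total_mb=$2
    mb=0
    while [ "$mb" -lt "$total_mb" ]; do
        # 1 MB = 2048 sectors of 512 bytes; +1 selects the second sector
        sig=$(dd if="$img" bs=512 skip=$((mb * 2048 + 1)) count=1 2>/dev/null \
              | od -An -tx1 -N8 | tr -d ' \n')
        # 45 46 49 20 50 41 52 54 is "EFI PART" in hex
        if [ "$sig" = "4546492050415254" ]; then
            echo "GPT header candidate at MB offset $mb"
        fi
        mb=$((mb + 1))
    done
}
```

Called as scan_aligned /path/to/image 8192, this would read one sector per MB instead of the whole MB, which is where the saving comes from. Spawning dd once per block is still far from fast, so treat it as a demonstration of the principle and nothing more.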
To spare you the boring details let me cut it short and give you the summary:
Task: find the first MB of a regular flat.vmdk on an 8 TB datastore that may still be in production use.
Known tools would do that in a day on fast hardware; in non-ideal conditions that time could easily go up to half a week.
Last week I tested my new approach using a nested ESXi 6.7 VM with an 8 TB datastore.
Result: 9 of the 12 known existing objects were located in a 2-step procedure that required about 10 minutes to scan the first 3-4 GB.
Once I have those results, a second scan reads the 8 TB in just 2 minutes!!!
That is very, very fast!
Right now the procedure finds flat.vmdks that are partitioned with MBR or GPT - the second is more reliable as GPT boot sectors are standardized.
At the moment this procedure looks very promising and I will explain details soon.
Now I have a couple of questions for you guys ....
1. I will need to do real-life tests using production ESXi hosts soon and am looking for test objects. If you have very large VMFS 6 datastores and are interested, please contact me.
2. Now I only search for flat.vmdks - and I would like to have some input on other large objects that are stored inside production VMFS 6 datastores.
Such other large objects could be used by vSAN, vVols or any other new feature or product of VMware that I have not seen yet.
By large objects I mean objects that are typically at least 1 GB in size ...
If you have any suggestions please let me know.
3. Detecting flat.vmdks ...
If you want to know if I would detect your large flat.vmdk here is how you can check:
dd if=flat.vmdk of=/tmp/test.bin bs=1M count=1
hexdump -C /tmp/test.bin | grep " 00 00 55 aa"
hexdump -C /tmp/test.bin | grep " 45 46 49 20 50 41 52 54"
If the first grep command matches - I list the location as a promising MBR-formatted candidate.
If both grep commands match - I list the location as a promising GPT-formatted candidate.
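The same check can also be scripted so that it looks at fixed byte offsets instead of grepping the hexdump text. This is only a sketch and not identical to the grep commands above: it assumes 512-byte sectors, it tests the boot signature 55 aa exactly at offset 510 (without requiring the two preceding zero bytes), and it tests "EFI PART" exactly at offset 512. The function names read_hex and classify_vmdk are made up for this example.

```shell
#!/bin/sh
# Sketch: classify the start of a flat.vmdk by fixed-offset signatures.
# Assumes 512-byte sectors; read_hex and classify_vmdk are hypothetical names.

# Print COUNT bytes at byte OFFSET of FILE as hex, without separators.
# Usage: read_hex FILE OFFSET COUNT
read_hex() {
    dd if="$1" bs=1 skip="$2" count="$3" 2>/dev/null | od -An -tx1 | tr -d ' \n'
}

# Usage: classify_vmdk FILE
classify_vmdk() {
    boot_sig=$(read_hex "$1" 510 2)   # boot signature, expected 55aa
    gpt_sig=$(read_hex "$1" 512 8)    # GPT header signature "EFI PART"
    if [ "$boot_sig" != "55aa" ]; then
        echo "no-match"
    elif [ "$gpt_sig" = "4546492050415254" ]; then
        echo "gpt-candidate"
    else
        echo "mbr-candidate"
    fi
}
```

Running classify_vmdk flat.vmdk prints one of no-match, mbr-candidate or gpt-candidate. It follows the rule above: a boot signature alone counts as MBR, a boot signature plus GPT header counts as GPT.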
If you use valuable flat.vmdks that do not get detected and don't use very exotic stuff, please let me know.
That's all for now - looking for interesting feature requests and large objects that are widely used in production environments.
Ulli
Did you ever make any progress on this? I'm having the same situation right now. I lost a flat.vmdk file while migrating a drive from one VM to another. I still have the .vmdk descriptor file and I'm trying to scan the VMFS 6 datastore... Such a long time! If you had any advice I'd really appreciate it. Thanks in advance.
I'm using a Dell R540 Server with 12x10TB drives. ESXi 6.7 as the host. I have a 40TB NAS attached for the recovery if I can get the files back. I'm looking for a 10TB drive that has all photos on it.