If you are looking for forensic test images, you have a choice. You can choose from any of the 2TB of test images linked at https://www.dfir.training/resources/test-images-and-challenges/test-images-and-challenges/all or you can make your own.
Here is my opinion on forensic test images. Your mileage may vary.
First choice: Make it myself.
If I am teaching a forensic class, I almost always create the images myself. Actually, I only use images that I create myself. It takes time and effort, but I feel strongly enough about it to do it myself. The most important reason is that I know what is in my image. I know how it got there. I know what is not there. And everything about the image is exactly what I wanted.
I have used a publicly available image for a course before, and the result was that students were finding things that I didn’t want them to find. Things that didn’t make sense for the class: there was suspicious activity in the image that was worthy of investigation but not relevant to the course. This is all good for practice, but it takes away from a training course where you have objectives to achieve for student learning. Plus, with a public image, without knowing how the system was created, you really don’t know what happened on it. Sure, you can see what happened and make assumptions, but you never really know what happened.
The same goes for practice/testing that I personally do. I make the images myself. The reason is that when I am testing a tool, I want to know without a doubt what the image contains. Otherwise, I am relying on an unvalidated image to validate a tool. By unvalidated image, I mean an image that comes with no documentation about the evidence on the image or how it got there.
A validated image is one whose creator supplies the “answers”: they planted the evidence on the system, documented what they did, and then created the image for use by others.
I create images with a documented process so that I know what I am getting when I am finished:
- Write the goal of the image (e.g., recover files from the recycle bin)
- Create a small virtual machine
- Start the machine and create the evidence (do the activity)
- Document each thing I do by date/time/activity
- Image the virtual machine
Now I have an image for which I know the exact user activity, because I did it and wrote it down by date/time/activity. When I run a tool over that image to find the user activity and the evidence I planted, the results should match the spreadsheet I created.
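To make the date/time/activity log and the "results should match" check concrete, here is a minimal sketch in Python. Everything in it is my own illustration, not a standard: the names `Activity`, `log_activity`, `write_log_csv`, and `compare` are hypothetical, the file names are made up, and I am assuming you can export the list of artifacts your tool found into a plain list of strings.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Activity:
    """One documented action performed on the test VM (hypothetical schema)."""
    timestamp: str  # UTC, ISO 8601
    action: str     # what was done, e.g. "delete to recycle bin"
    artifact: str   # the file or item the action touched


def log_activity(log, action, artifact, when=None):
    """Append one entry to the ground-truth log; defaults to the current UTC time."""
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    entry = Activity(when.strftime("%Y-%m-%dT%H:%M:%SZ"), action, artifact)
    log.append(entry)
    return entry


def write_log_csv(log, handle):
    """Write the log as a spreadsheet-friendly CSV (date/time, activity, artifact)."""
    writer = csv.writer(handle)
    writer.writerow(["timestamp", "action", "artifact"])
    for entry in log:
        writer.writerow([entry.timestamp, entry.action, entry.artifact])


def compare(log, tool_artifacts):
    """Compare what was planted against what the tool reported."""
    expected = {entry.artifact for entry in log}
    found = set(tool_artifacts)
    return {
        "matched": sorted(expected & found),     # planted and recovered
        "missed": sorted(expected - found),      # planted but not recovered
        "unexpected": sorted(found - expected),  # reported but never documented
    }


if __name__ == "__main__":
    log = []
    log_activity(log, "create file", "plan.docx")
    log_activity(log, "delete to recycle bin", "notes.txt")

    buffer = io.StringIO()
    write_log_csv(log, buffer)
    print(buffer.getvalue())

    # Assumed tool output, typed in by hand for the example.
    print(compare(log, ["notes.txt", "pagefile.sys"]))
```

Anything under "missed" is something the tool should have found and didn’t. Anything under "unexpected" is either normal system noise or something I forgot to write down, which is exactly why the log has to be kept as the activity happens.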
That is a simple example, but you get the idea. I have made complex images using multiple virtual machines that emailed each other, shared files via P2P/Dropbox/file share sites, VPN use, Tor, and so forth. I have some virtual machines that occasionally I will fire up months apart just to add some evidence on it. Other virtual machines contained only evidence of a deleted folder.
Second choice: Download someone else’s image.
Curiosity gets me all the time. I just have to know. Whatever it is, I just have to know. That goes for publicly available forensic images. I will use them for testing if a cheat sheet comes with them, but not if all I have is an image with no background. Anti-forensic tools can wreck an OS, which will wreck your tool testing if you are unaware of what happened on that OS. Either your tool works but you think it doesn’t, or the tool doesn’t work when you think it does.
For these types of images, it’s mostly for fun to see what I can dig up. I don’t use them for training or testing, but I do use them for practice. Practice is not validation. Practice is just practice. Perhaps I run different tools that do the same thing but produce different output. I may not be interested so much in the data a tool pulls from one of these images as I am in learning how different tools work.
I strongly recommend creating your own images. But it takes time and detailed documentation.
If you work in a large shop, you might have the best opportunity to have a team create a set of images that can be used for years. With virtual machines, you can snapshot them at different stages and go off on different tangents. You can hack them, flood them with malware, email between them, create various scenarios, use different operating systems and different versions of each operating system, and eventually you will have a library of test images available at your fingertips.
For students, this is probably the best way to go. When you are learning this field, going through a terabyte image and fishing around isn’t going to be that helpful of a learning experience. You won’t know what you are looking for, or when you find it, or if you were right in your assumptions of what you found. Unless you have the supporting background of the image.
I am hoping that I am not the only person with a RAID that contains more virtual machines and images of each virtual machine than I can remember creating….
Now….if you are not sure if you can create a forensic image that is perfect for your forensic testing, trust that you can. Build a VM. Dirty it up. Write it up. Image it. Find the dirt. Yes, you know where the evidence is, but is your tool capable of pulling it out and displaying the results accurately? You need to know what is behind Door #3 when testing tools and learning.