An interesting Twitter thread popped up on forensic imaging. Good points were made on whether or not to create full disk images, sparse images, or even to image at all.
There are so many factors to consider in such a decision that I believe it is unreasonable to expect a simple catch-all solution. Civil vs criminal case. Legal authority. Resources and time available. Type of case. Type of system. Amount of data. Number of systems. And other unforeseen situations.
But I believe that quite simply, if legal authority exists, and resources are available, why not create full disk images? To clarify, "available resources" means that you have the time, the tools, the staff, the storage, the funding, and the capability to do it. You can obviously choose not to image or create full disk images even though you have available resources, but if you can, why not? It's better to grab as much as you can (ideally all of it) and filter out the garbage later than it is to hope you got what you needed from a sparse collection.
I understand that some cases may involve hundreds or even thousands of machines. I understand that some DF/IR organizations (public or private) may not have the physical resources to do such massive data collections. I also understand that some cases have priority over other cases.
However, I believe that my take on imaging is solid as it rests upon available time and resources. If a DF unit has the capability to image a thousand computer drives and do the analysis, then why wouldn’t they if they have legal cause and authority to do so? This is an extreme example, but the math works out. If you have the time and resources to do a complete job, then there is no reason to not do a complete job.
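As a rough illustration of what "the math works out" could look like, here is a minimal Python sketch that estimates the wall-clock time to fully image a set of drives. The function name `imaging_hours` and every number used with it are hypothetical, chosen only for illustration. It assumes every drive is the same size, every imaging station sustains the same throughput, and it ignores setup, hash verification, and drive failures, so treat the result as a floor rather than a plan.

```python
import math


def imaging_hours(num_drives, drive_size_gb, throughput_mb_s, stations):
    """Estimate wall-clock hours to create full images of every drive.

    All inputs are assumptions supplied by the caller:
      num_drives      - number of drives to image
      drive_size_gb   - size of each drive in decimal gigabytes
      throughput_mb_s - sustained imaging speed per station, in MB/s
      stations        - number of drives that can be imaged in parallel
    """
    if num_drives < 1 or stations < 1:
        raise ValueError("num_drives and stations must be at least 1")
    if drive_size_gb <= 0 or throughput_mb_s <= 0:
        raise ValueError("drive_size_gb and throughput_mb_s must be positive")

    # Time to read one whole drive end to end (1 GB = 1000 MB here).
    hours_per_drive = drive_size_gb * 1000 / throughput_mb_s / 3600

    # Drives are imaged in batches, one drive per station per batch.
    batches = math.ceil(num_drives / stations)
    return batches * hours_per_drive


if __name__ == "__main__":
    # Hypothetical scenario: 1,000 drives of 500 GB, 150 MB/s, 10 stations.
    hours = imaging_hours(1000, 500, 150, 10)
    print(f"Estimated imaging time: {hours:.1f} hours")
```

With those made-up inputs the estimate comes to about 92.6 hours of continuous imaging. Whether that counts as "available time" depends entirely on the case, which is the point: plug in your own numbers before deciding that full images are out of reach.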
On to triage.
Triage is great for two big reasons:
#1 – is there evidence that I can find right now that justifies seizing this system, and/or
#2 – how important is this system in the grand scheme of priority examinations.
Triage is not a replacement for imaging. Its purpose is to guide the priority of analysis and to help decide whether you need to collect a system at all. Even then, triage only informs the decision, since it may not reach into an area of evidence that you would need for a fully informed decision. It’s a best guess, but it works pretty well for prioritizing cases.
In my experience, I have examined images from years prior to find evidence that was unknown at the time of seizure, and was overlooked by prior examinations. Not due to skill level, but due to new information coming to light later in the cases. I think this holds true across the board. You don’t know what you don’t know, so if you can image it all, why not? Creating full disk images does not mean you must do a complete forensic exam, but you have the option if you need it. Incomplete images mean that you will never be able to do a complete forensic exam to find either inculpatory or exculpatory evidence. I have not heard of this defense yet, but I will not be surprised to hear about a case where the defendant swears that there was exculpatory evidence on the drive that was not completely imaged, and the original system no longer exists.
I would dread being asked in court, "You said you had the opportunity to create a full disk image. You had the resources. You had the time. Yet you didn't. Why didn't you image the complete drive?"
Back to the time and resources
Given that you have the time, the resources, and the legal authority to image everything, I personally believe that you should. You can still triage during the imaging process or even triage afterward. You don't need to do a complete exam on everything, but you can if you need to.
I also understand the reasons not to image, or to take only sparse data, when full imaging is simply not the investigative model for the task at hand, such as when you have 20,000 nodes and 100,000 virtual machines, or in the “I have no idea how many physical machines we have” and “I just need PST files” sort of scenarios.
But I am talking about exploitation, missing persons, homicide, and some civil case matters. If you can seize it all, why not? I would compare this to serving a search warrant on a large house. You could “triage the house” by walking through it quickly while looking for evidence on the kitchen table and in the living room, but not by looking in the dresser drawers or under the bed. Or, you can go through everything. Or do both. Certainly, if the search warrant is for a serious case, simply walking through the house isn’t going to cut it. You need to throw on some gloves and start digging through everything.
I am also aware of great research being done in the area of "sifting data", "sparse collections", "targeted collections" and so forth. Each of these is an "incomplete" collection no matter how you look at it, but each surely has its place. One paper that I read states:
"In general, only a small portion of the data on a disk has any relevance or impact on forensic analysis. The vast majority of sectors and files contain data irrelevant to most investigations; in fact, many sectors are either blank or contain data that is found verbatim on numerous other systems (e.g., operating system and application components). Fig. 1 depicts various categories of data present on a typical disk. For some investigations, executable files may be of interest. For others, browser artifacts are of primary interest. Blank space is virtually never of use. The rest of the data, beyond what is deemed relevant to a case, and which constitutes the vast majority of the collection, could actually be replaced by random noise without affecting the forensic analysis." - Rapid forensic imaging of large disks with sifting collectors
The problem with this theory of capturing the high value data is that you don't know which is the high value data. I've never returned to a house after a search warrant and asked, "May I search your home again? I neglected to check your basement the first time."
So, if you have the time, the resources, and the legal authority to create complete images, why not?
One more point (thanks, PM!): if you can seize the entire media, you can always go back to it later if needed without creating an image at all (put it in evidence, pull it out when ready to image/examine). In civil litigation cases, you don’t usually have that luxury. Most times it is (1) arrive onsite, (2) collect data, and (3) leave without the original media. By the time there is a reason to go back to the original media, it either no longer exists or has been modified to the point of being irrelevant. Ironically, civil matters often allow only targeted data collection.
My point is that “if” you can, you “should”, with the “if” being time and resources available and legal authority.