Last updated Apr 1, 2004.
Autopsy / Sleuthkit
It is not possible to talk about forensics and Linux without discussing Autopsy and The Sleuthkit. These two programs are by far the most intuitive and user-friendly set of tools available for the Linux platform. In fact, this dynamic duo outperforms many of the forensics tools available on Windows. Oh, and did I mention that these programs are also free? In all honesty, I could write a book about Autopsy and The Sleuthkit and how they can help dig deep inside a data image to locate and extract relevant information, but that is way beyond the scope of this section. However, I will provide a general overview of these programs and show how I used them to dissect a floppy purchased on Ebay.
The Sleuthkit (TSK) "...is a collection of UNIX-based command line file system and media management forensic analysis tools." It was previously known as TASK, and "...uses some code and design from The Coroner's Toolkit (TCT)", which is a command line forensics suite used to study computers after a break-in. While TSK alone is valuable enough to warrant a discussion, command line tools are often esoteric and difficult to use, especially for non-Unix types. Fortunately, Autopsy takes this text-based suite and provides the much needed graphical interface through which a user can quickly and easily interact with TSK. Autopsy also adds functionality of its own by automating some of the management and searching tasks that any investigation requires.
The first step is to download, unpack, and compile the code. To do this, download Autopsy and TSK from http://www.thesleuthkit.org and unpack them into their respective directories. You will need to compile and set up TSK before attempting to install Autopsy, so start with TSK and execute "make" in the sleuthkit directory. Once the files are compiled, jump to the Autopsy folder and "configure" then "make" the program. You will need to point Autopsy to the newly compiled TSK during the configuration procedure.
Once installed, you need to execute the program. This is generally accomplished via the command line, which will return a URL to plug into your browser (see figure Autopsy-1). The URL loads a web application (figure Autopsy-2) that allows you to control and manage the case data. In this example, I am working on a series of floppies with unknown data, thus I have called my case "Floppys" as illustrated in Autopsy-3.
Autopsy-1: Starting up Autopsy
Autopsy-2: Creating a new case
Autopsy-3: Case Gallery
In a typical situation, a case would contain a host with several drives that can be searched (e.g. c-drive, d-drive, cd roms). In our case I am treating each floppy like its own host, and will add one image per host that contains a symlink to the actual floppy. Figure Autopsy-4 illustrates the settings I used to define my host. Note the Timezone entry ("-300"). This is the value for GMT -5, because five hours behind GMT equates to -300 minutes. You will have to adjust your own timezone entry according to your location. You can also define a database of known and unknown hash entries that can help speed up file extraction time, especially if you are working with a standard operating system that has thousands of dll files, each with its own MD5 signature hash. Because identical files always produce the same MD5 hash, it is easy to create an MD5 hash for each file and filter it according to the database entries.
Autopsy-4: Adding a new host.
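To make the hash database idea concrete, here is a minimal Python sketch of filtering files against a set of known MD5 hashes. It only illustrates the principle and is not how Autopsy implements it; the function names, the sample file names, and the in-memory set standing in for the database are all assumptions made for the example.

```python
import hashlib

def md5_of_bytes(data: bytes) -> str:
    """Return the MD5 digest of a byte string as lowercase hex."""
    return hashlib.md5(data).hexdigest()

def filter_unknown(files: dict, known_hashes: set) -> dict:
    """Keep only the files whose MD5 is NOT in the known-hash set.

    `files` maps a file name to its content. Both names are hypothetical.
    """
    return {name: data for name, data in files.items()
            if md5_of_bytes(data) not in known_hashes}

# Hypothetical sample data standing in for files pulled from an image.
sample_files = {
    "system.dll": b"standard operating system file",
    "notes.txt": b"something the investigator has not seen before",
}

# Pretend the hash database already lists the stock system file.
known = {md5_of_bytes(b"standard operating system file")}

# Only the file that is missing from the database is left to examine.
print(sorted(filter_unknown(sample_files, known)))
```

With thousands of stock system files in the database, the same lookup would leave only the handful of files that actually deserve a closer look.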
Once the host is added, you need to add the image/disk/media that is going to be analyzed. This is when you start to get into options, and is why I chose to demonstrate Autopsy/TSK with a floppy. As figure Autopsy-5 illustrates, there are several settings that need to be defined when adding an image. You first need to define the physical location as determined by the OS you are working on. In my case, I am viewing a floppy, which is typically at "/dev/fd0". Next you select how to examine the data, which can either be linked to Autopsy via a symlink (ln command) or copied/moved to the local file system. I chose to symlink this file because I don't want hundreds of floppy images on my computer.
Autopsy-5: Adding a new image
Next you need to select the file system of the image. If you are dealing with an unknown file system, you can select 'raw', which treats the image as pure data and disregards formatting options. You then enter the mount point of where the image can be mounted, and select your Image Integrity Options. If this is a serious case, you will want to generate an MD5 hash to ensure that file integrity is maintained during the analysis. Once complete, hit the 'Add Image' button. If everything is entered correctly, you will be rewarded with a complete screen. At this point it is time to start investigating using the menu options as illustrated on figure Autopsy-6.
Autopsy-6: Image Gallery options
At the bottom of the page there are five menu options. Each of these is a separate program that performs a distinct function not directly related to analyzing the data in the file. The options and their purpose are as follows:
File Activity Time Lines: Useful for generating a timeline of 'live' files, which can help track a hacker's path through the computer system and can provide hints that are useful for the rest of the investigation.
Image Integrity: Recalculates an MD5 hash of the original image file and compares it to the one created during the setup.
Hash Databases: Allows maintenance for hash databases that hold MD5 hashes of known and unknown files.
View Notes: A simple method for keeping notes within Autopsy.
Event Sequencer: A method to track events within Autopsy.
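The Image Integrity option in the list above boils down to hashing the image again and comparing the result with the hash recorded at setup. The following Python sketch shows that idea under stated assumptions: the helper names are mine, and a small temporary file stands in for a real floppy image. It is not Autopsy's own code.

```python
import hashlib
import os
import tempfile

def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so a large image need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def image_unchanged(path: str, recorded_md5: str) -> bool:
    """True if the file still hashes to the value recorded earlier."""
    return md5_of_file(path) == recorded_md5.lower()

# Demo: a temporary file plays the role of the image (hypothetical data).
fd, image_path = tempfile.mkstemp(suffix=".img")
with os.fdopen(fd, "wb") as fh:
    fh.write(b"\x00" * 2048)

recorded = md5_of_file(image_path)           # hash taken "during setup"
intact_before = image_unchanged(image_path, recorded)

with open(image_path, "r+b") as fh:          # simulate accidental tampering
    fh.seek(100)
    fh.write(b"X")
intact_after = image_unchanged(image_path, recorded)

os.remove(image_path)
print(intact_before, intact_after)
```

Changing a single byte is enough to make the second check fail, which is exactly why the hash is worth generating before the analysis starts.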
To start researching the image, click the OK button. This will take you to a new page with a list of options across the top menu bar that allow you to drill down to specific data, files, or sectors on the hard drive.
In our case, most of the menu options are not available because the disk is not in any recognizable format. In other words, this disk has a damaged or wiped boot sector. Windows would not recognize it and Linux would see it as blank. However, this doesn't mean it is clean of all data. As figure Autopsy-7 shows, there are two options available. The first is "Keyword Search" and the second is "Data Unit". Therefore, our first step is to test the floppy for known or probable strings that would indicate valuable data. To do this, we simply need to extract the strings using the "Extract String" option. While we could jump right in and start searching the disk for possible string values, this approach will save time because it creates an index of the text strings located on the image that can be quickly searched using the Search option.
Autopsy-7: Keyword Search
In the case of our floppy, we were able to locate a rather extensive list of strings. At first glance, it would appear that our floppy contained a database file of some sort. To locate specific values and their location on the disk, you simply have to enter a value and hit search. Autopsy keeps track of searches and the corresponding results, each with a sector number that can be viewed with a click. In addition to basic searches, you can use more complex grep-based searches to analyze string data. Autopsy also includes two predefined complex searches, "IP" and "Date", to help speed up your research.
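The extract-then-search approach described above can be sketched in Python as follows. This is an illustration of the technique only, not Autopsy's implementation: the function names, the IPv4 pattern, and the two-sector sample image are assumptions made for the example.

```python
import re

SECTOR_SIZE = 512  # bytes per sector on the floppy discussed here

def extract_strings(data: bytes, min_len: int = 4):
    """Yield (byte_offset, text) for each run of printable ASCII."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")

# A loose IPv4 pattern (assumption: it does not validate octet ranges).
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def search(data: bytes, regex):
    """Return (sector_number, matched_text) for every hit in the strings."""
    hits = []
    for offset, text in extract_strings(data):
        for m in regex.finditer(text):
            hits.append(((offset + m.start()) // SECTOR_SIZE, m.group()))
    return hits

# Hypothetical two-sector image: binary filler with text planted in sector 1.
planted = b"contact host 192.168.0.10"
image = bytearray(2 * SECTOR_SIZE)
image[600:600 + len(planted)] = planted

print(search(bytes(image), IPV4))
print(search(bytes(image), re.compile("host")))
```

Each hit comes back with the sector it was found in, which is the same piece of information you need in order to jump to that location in the Data Unit view.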
In the next section we will look at the data available to us on the floppy and how to extract it to recreate a file.
Data Unit
As per the Help section in Autopsy, "the Data Analysis Mode allows an investigator to view the contents of an individual data unit." Depending on the file system, the unit could be a fragment, cluster, or in the case of FAT, a sector. Regardless, this is the part of the program that lets you get into the raw data on the media.
In our example, a unit is a 512 byte chunk of data, which is the same size as a sector on the floppy. Since our floppy has two sides, each with 1440 sectors (18 sectors x 80 tracks per side), we know that our floppy has 2880 data units. Knowing this, we can easily hop around inside our floppy to locate a specific chunk of data by simply entering the desired unit number in the appropriate box (see figure Autopsy-8). We can also view more than one sector by entering a value in the Number of Units field. Finally, the Lazarus address is useful if you are resurrecting files from deletion (thus the name Lazarus, a biblical figure who was raised from the dead). However, in our case the floppy is simply raw data with no file system from which to extract a file, so the Lazarus option is useless.
To get a quick overview of the data on the floppy, we enter a value of 0 in the Unit Number field and 2880 in the Number of Units field. This will extract the entire data image and display it on the screen in ASCII format. As you can see from figure Autopsy-8, the disk is not empty and contains a list of addresses and phone numbers and some bonus HTML data at the end!
Figure Autopsy-8: Viewing the floppy via data unit p-program
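The unit arithmetic above is easy to check in code. The Python sketch below computes the number of units from the floppy geometry and reads units by number from an in-memory image. The helper names and the blank sample image are assumptions for illustration; with a real floppy you would read the bytes from the image file instead.

```python
UNIT_SIZE = 512                     # one data unit = one sector
SIDES = 2
TRACKS_PER_SIDE = 80
SECTORS_PER_TRACK = 18
TOTAL_UNITS = SIDES * TRACKS_PER_SIDE * SECTORS_PER_TRACK

def read_units(image: bytes, first_unit: int, count: int = 1) -> bytes:
    """Return `count` units starting at `first_unit` (numbered from 0)."""
    if first_unit < 0 or count < 1:
        raise ValueError("unit number must be >= 0 and count >= 1")
    if first_unit + count > len(image) // UNIT_SIZE:
        raise ValueError("requested units run past the end of the image")
    start = first_unit * UNIT_SIZE
    return image[start:start + count * UNIT_SIZE]

def as_ascii(data: bytes) -> str:
    """Render bytes as text, showing non-printable bytes as dots."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)

# Hypothetical blank floppy image with a short string planted in unit 5.
planted = b"Hello, disk"
image = bytearray(TOTAL_UNITS * UNIT_SIZE)
image[5 * UNIT_SIZE:5 * UNIT_SIZE + len(planted)] = planted

unit = read_units(bytes(image), 5)
print(TOTAL_UNITS, len(image))      # units and bytes on the floppy
print(as_ascii(unit[:16]))          # first 16 bytes of unit 5

# Unit Number 0 with Number of Units 2880 covers the whole disk.
whole_disk = read_units(bytes(image), 0, TOTAL_UNITS)
```

Asking for unit 0 with a count of 2880 returns all 1,474,560 bytes, which matches what the Data Unit screen displays when those two values are entered.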
Extracting and recreation
This part of the process is more challenging because there is no simple one-size-fits-all approach to recreating a file. In fact, you may not be able to create a file depending on the status of the media, in which case the raw data will be your evidence, or at least assist in providing clues to help with other parts of the investigation. In our case, we are dealing with a fairly straightforward listing of names and addresses (which is another reason I selected this floppy). To export the data, we can click on the [Export Contents] button at the top of the window and create a raw data dump of the image on the screen.
Once this data is extracted, it can be loaded in your favorite hex editor and massaged to remove the extra irrelevant information. I used khexedit to quickly analyze and remove all extra information from the raw dump that did not have anything to do with the apparent list. Once this was complete, I opened up our OpenOffice Spreadsheet program and set up a data source link to the newly created text file. Much to my surprise, the sample.txt file was accepted as a valid source of data. From figure Autopsy-9, you can see that my list is nothing more than a complete listing of high tech companies, including names and phone numbers of company executives, and other information that could be used by marketing firms to locate the 'right' person. Once I realized what this information was, I went back to the image file and found a reference to "Rich's", which eventually led me to http://www.norcalcompanies.com/. According to their web site, Rich's has "...complete information on over 9,000 high technology firms in Silicon Valley with over 40,000 contact names at those firms", all for the low cost of $249.50 a year. I got my copy for about 5 cents.
Figure Autopsy-9: Extracted Rich's Database
As you can see, even on a small floppy it is possible to find valuable information. Given that this one floppy was included in a package of 50, of which there are several more similar to the one just dissected, there is a good chance that I have a complete database in my hands.
To illustrate the usefulness of some of the other parts of Autopsy, we are also going to take a look at another disk that proved to be an interesting find. Again, this is a real floppy that I purchased on Ebay for the sole purpose of this article.
Round 2
Since we already have a case open, we only needed to create a host to contain the new floppy's information, using the same general procedure as previously described. Once the host was added, we added the floppy image, except this time when we attempted to define the file system as 'fat' it was accepted as a valid selection.
After the image was added, we were then able to take a closer look at the data using the menu options not available while viewing a raw image. The following takes a look at these 'programs' and their functions.
File Analysis: Gives us a look at the image from the file/folder level. While it is much like the standard interface used to view file information, File Analysis can also be used to look at the raw data in addition to the file as a whole. As figure Autopsy-10 illustrates, this view can be error prone, but it still provides a valuable overview of the data available.
Figure Autopsy-10: File Analysis of Floppy5 image
File Type: One of the more valuable tools in this program, it allows the user to quickly extract and sort detected files from the image. This helps the investigation process by filtering out unnecessary files and allows the investigator to focus on files that are more interesting. In our case, the File Type sorter quickly extracted several archives and an AVI file. Image Autopsy-11 shows the selection screen that a user can use to help control the output.
Figure Autopsy-11: File Type options screen
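One common way to detect a file's type regardless of its name is to look at the signature bytes at the start of its content. The Python sketch below sorts files that way for the two types found on this floppy. It is a simplified illustration and makes no claim about how the File Type sorter works internally; the function names and sample data are hypothetical, while the ZIP and AVI signatures themselves are the standard ones.

```python
def guess_type(data: bytes) -> str:
    """Guess a file type from its leading signature bytes."""
    # ZIP local file headers begin with "PK" followed by 0x03 0x04.
    if data.startswith(b"PK\x03\x04"):
        return "zip archive"
    # AVI is a RIFF container: "RIFF", a 4-byte size, then "AVI ".
    if data[:4] == b"RIFF" and data[8:12] == b"AVI ":
        return "avi video"
    return "unknown"

def sort_by_type(files: dict) -> dict:
    """Group file names by detected type, ignoring their extensions."""
    groups = {}
    for name, data in files.items():
        groups.setdefault(guess_type(data), []).append(name)
    return groups

# Hypothetical recovered files; the names deliberately hide the real types.
recovered = {
    "game.dat": b"PK\x03\x04" + b"\x00" * 26,
    "clip.bin": b"RIFF\x00\x00\x00\x00AVI LIST",
    "readme": b"plain text with no signature",
}

print(sort_by_type(recovered))
```

Because the check ignores file names, renaming an archive to something innocent does not hide it from this kind of sorting.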
Image Detail: This program provides an abstract but detailed view of the image as a whole. With it you can learn about file system type, disk properties, and more, as illustrated in figure Autopsy-12.
Figure Autopsy-12: Image Details for Floppy5
Meta Data: Another method of learning information about a file. The Meta Data contains the file's creation time, size, modification time, and similar information that can be useful in putting together a timeline of activity and in learning whether the file was altered in any way to hide valuable information. Using the Allocation List button, you can quickly obtain an index of files, from which you can select a number to view more information about the file associated with that entry. As figure Autopsy-13 illustrates, it is easy to learn quite a bit of detail about a file.
Figure Autopsy-13: Meta Data view Floppy5 image.
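On a FAT floppy, much of this meta data lives in 32-byte directory entries. The Python sketch below decodes one such entry by hand to show where the name, size, and timestamps come from. It follows the standard FAT short-entry layout, but the function names and the sample entry are made up for the example, and it is in no way a substitute for the Meta Data view.

```python
import struct
from datetime import datetime

def decode_fat_datetime(date_word: int, time_word: int) -> datetime:
    """Decode FAT's packed date and time words (2-second resolution).

    Assumes the fields are populated; a zeroed date word is not handled.
    """
    return datetime(
        1980 + (date_word >> 9),       # years since 1980
        (date_word >> 5) & 0x0F,       # month
        date_word & 0x1F,              # day
        time_word >> 11,               # hours
        (time_word >> 5) & 0x3F,       # minutes
        (time_word & 0x1F) * 2,        # seconds are stored halved
    )

def parse_dir_entry(entry: bytes) -> dict:
    """Decode one 32-byte FAT short directory entry."""
    (name, ext, attr, _nt, _tenths, ctime, cdate, _adate,
     _clus_hi, wtime, wdate, clus_lo, size) = struct.unpack(
        "<8s3sBBB7HI", entry)
    base = name.decode("ascii").rstrip()
    suffix = ext.decode("ascii").rstrip()
    return {
        "name": base + ("." + suffix if suffix else ""),
        "attributes": attr,
        "created": decode_fat_datetime(cdate, ctime),
        "modified": decode_fat_datetime(wdate, wtime),
        "first_cluster": clus_lo,
        "size": size,
    }

# Hypothetical entry: GAME.ZIP, 1234 bytes, stamped 2004-04-01 13:30:10.
date_word = ((2004 - 1980) << 9) | (4 << 5) | 1
time_word = (13 << 11) | (30 << 5) | (10 // 2)
sample_entry = struct.pack("<8s3sBBB7HI", b"GAME    ", b"ZIP", 0x20, 0, 0,
                           time_word, date_word, date_word, 0,
                           time_word, date_word, 2, 1234)

info = parse_dir_entry(sample_entry)
print(info["name"], info["size"], info["modified"])
```

Comparing the created and modified values across many entries is the raw material for the kind of timeline mentioned above.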
As you can see, there is a lot of information available from just a simple floppy. In fact, on this second sample, I was able to extract ten files, nine of which were zip files that contained everything from simple games to warez'd Southpark screen hacks. Given that this is a simple 1.44MB floppy, you can imagine the files available on a used 10+ Gig hard drive!
Autopsy and TSK are an excellent example of what open source can provide for the average person. It is quite possible to use these tools to investigate and extract valuable information from almost any type of file system, all for the low price of your time. While the Windows-based tools might be easier to use in some cases, you will find TSK/Autopsy invaluable both as an investigation tool and as a way to understand the more intricate details of data storage.