Ever wondered where your Windows Domain Controller (DC) stores the managed authentication data for users assigned within your network?

Well, since Windows 2000, this has been the job of Active Directory (AD), which maintains a central database of information within the NT Directory Services (NTDS). The file ntds.dit is an Extensible Storage Engine (ESE, formerly JET Blue) database, an indexed sequential access method (ISAM) technology developed by Microsoft. Applications use it to store data and to access that content quickly through the application programming interface (API).

The database maintains a backup of content and also uses transactional methods, which allows for the smooth retrieval of data. This file is often a first port of call for an analyst wanting to gain an understanding of the environment. The directory information tree (DIT) can often be accessed via ESE viewer programs using the API. On the face of it, this seems fine: it allows an analyst to quickly access the content and retrieve information.

Understanding the structure of ESE databases

However, it’s important to understand how ESE databases work in the background to ensure that no stone is left unturned. Otherwise you may miss unexplored data within the transactional log files, or deleted content within the live ESE database that is purged as a result of API access. Let’s briefly touch on the structure.

In simple terms, an ESE database contains a set of tables, rows and columns. The schema is maintained within the MSysObjects catalogue table, present in any ESE database file. If you’re examining the structure of the file as raw data in a hex editor, then you need to think in pages. Every page within an ESE database has the same fixed size, which is recorded within the database header. Pages are arranged into B+ trees, each built up from a root page down to the leaf pages, which is where notable data resides.
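To make the idea of pages concrete, here is a minimal Python sketch that reads the page size from a raw database header and works out where a numbered page starts in the file. The field offsets (signature at byte 4, page size at byte 236) and the rule that the header and its shadow copy occupy the first two page-sized slots are assumptions taken from publicly documented reverse engineering of the format, not from Microsoft documentation, and the function names are purely illustrative.

```python
import struct

ESE_SIGNATURE = 0x89ABCDEF   # assumed magic value, stored little-endian at offset 4
PAGE_SIZE_OFFSET = 236       # assumed offset of the 32-bit page size field


def read_page_size(header: bytes) -> int:
    """Return the page size recorded in a raw ESE database header."""
    if len(header) < PAGE_SIZE_OFFSET + 4:
        raise ValueError("header too short")
    (signature,) = struct.unpack_from("<I", header, 4)
    if signature != ESE_SIGNATURE:
        raise ValueError("not an ESE database header")
    (page_size,) = struct.unpack_from("<I", header, PAGE_SIZE_OFFSET)
    return page_size


def page_offset(page_number: int, page_size: int) -> int:
    """File offset of a numbered page (numbering starts at 1).

    Assumes the header and its shadow copy fill the first two
    page-sized slots, so page 1 starts at 2 * page_size.
    """
    return (page_number + 1) * page_size
```

On a working copy of the database you could pass the first 240 bytes of ntds.dit to read_page_size, then use page_offset to jump to a given page in a hex editor.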

Mainstream forensic software tools handle ESE databases differently. Some will try to access the ESE database alone, without worrying about the current state. Others will try to repair the file, potentially deleting corrupt data. Some will replay the required log file(s) and others will not touch them.

This is a potential pitfall with automated parsing tools, as an analyst may misunderstand what the tool has done, leaving uncertainty around the examination. It’s therefore vital to know how your tool of choice handles the file. When a new operation is initiated, data is first written to log files and cached in memory. After this, modified pages are tracked and some time later written into the main ESE database.
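The write path described above can be illustrated with a toy model. This is a deliberately simplified sketch of write-ahead logging in general, not ESE's real implementation; the class and method names are hypothetical.

```python
class ToyEngine:
    """Toy write-ahead-logging model (illustrative only, not ESE itself)."""

    def __init__(self):
        self.log = []        # transaction log on disk (survives a crash)
        self.cache = {}      # modified pages held in memory (lost on a crash)
        self.database = {}   # main database file on disk

    def write(self, key, value):
        self.log.append((key, value))   # 1. record the change in the log first
        self.cache[key] = value         # 2. keep the modified page in memory

    def flush(self):
        self.database.update(self.cache)  # 3. later, write modified pages to the database
        self.cache.clear()

    def crash(self):
        self.cache.clear()   # uncontrolled shutdown: memory contents are gone

    def replay(self):
        for key, value in self.log:   # recovery: re-apply logged changes in order
            self.database[key] = value
```

If write is followed by crash before flush, the database file never sees the change, yet the log still holds it. That is the gap a tool either closes by replaying the logs or silently ignores.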

This has many forensic implications, most notably the need to identify the state of the database following a controlled or uncontrolled shutdown. It’s also clear that the transaction log files may contain historic data and that live memory may contain fragments of AD data. This is yet another example of why capturing RAM prior to shutting down the Windows system is so important, but more on these points later.

By navigating the file system to %SystemRoot%\NTDS\, you’ll locate the files associated with AD. Following the extraction of the ESE database from this directory, an analyst should explore the Microsoft esentutl utility, which ships with Windows Server and is run from the command prompt. Issuing the header command esentutl /mh ntds.dit dumps the file header and identifies the current database state.

Cleaning up the database

The header dump will identify the database as either dirty or clean. If the database is dirty, the system did not terminate correctly, so the ESE database is not up to date: data is still residing within the log files, yet to be committed to the active ESE database. However, if the database is clean, the ESE database is in a consistent, up-to-date state.
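If esentutl isn’t to hand, for example on a non-Windows analysis workstation, the same state flag can be read straight from the raw header. The sketch below assumes the state is a 32-bit little-endian value at byte offset 52 and that 2 and 3 mean dirty and clean shutdown respectively. These values come from publicly documented reverse engineering of the format, so treat them as assumptions and verify against esentutl output where possible.

```python
import struct

DB_STATE_OFFSET = 52  # assumed offset of the 32-bit database state field

# Assumed meanings of the state values
DB_STATES = {
    1: "just created",
    2: "dirty shutdown",
    3: "clean shutdown",
    4: "being converted",
    5: "force detach",
}


def database_state(header: bytes) -> str:
    """Return a readable database state from a raw ESE header."""
    if len(header) < DB_STATE_OFFSET + 4:
        raise ValueError("header too short")
    (state,) = struct.unpack_from("<I", header, DB_STATE_OFFSET)
    return DB_STATES.get(state, f"unknown ({state})")


def needs_log_replay(header: bytes) -> bool:
    """True when the header reports a dirty shutdown."""
    return database_state(header) == "dirty shutdown"
```

Reading the flag this way never modifies the file, which makes it a safe first check on a working copy.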

Should a dirty database be found, we can again use esentutl to commit the pending transactions from the required log file, resulting in a clean ESE database, simply by issuing the recovery command esentutl /r . An analyst would then be able to view and examine the file ntds.dit in its most up to date state.

  Esentutl database recovery

This is great. We now have a file containing up-to-date AD information. This allows us to use scripts or programs capable of parsing notable data relating to the active users, groups, computer information and security credentials. If you need to know when a user last logged in, this is where to find that information. It should also be noted that the NT and LM hashes for users' passwords are stored here, which is why the file is so interesting to threat actors and frequently attacked, notably with the infamous Mimikatz.
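Last-logon values pulled from the database are usually raw integers rather than readable dates. The sketch below assumes the value is a Windows FILETIME (a count of 100-nanosecond intervals since 1 January 1601 UTC), which is how the AD lastLogon and lastLogonTimestamp attributes are defined. The function name is illustrative only.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# FILETIME counts 100-nanosecond intervals from this point in time
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime: int) -> Optional[datetime]:
    """Convert a FILETIME integer to a UTC datetime.

    A value of 0 is treated as 'never logged on' and returns None.
    """
    if filetime <= 0:
        return None
    # 10 intervals of 100 ns make one microsecond
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)
```

Keeping the result in UTC avoids mixing time zones when correlating logons with other artefacts.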

When it comes to the parsing and subsequent analysis of AD data, there’s been a lot of work done over the years. Some prominent open source toolkits, which are frequently used for the interaction, deconstruction and analysis of ESE databases and AD data include:

  • libesedb - C library with Python bindings (pyesedb).
  • DSInternals - PowerShell module.
  • NTDSXtract - Python scripts.

PowerShell can be very effective for quickly examining the AD. In this instance, the DSInternals PowerShell module has been used, firstly to retrieve the BootKey from the SYSTEM hive, allowing decryption of the AD data, and secondly to query the AD for account-specific information associated with the SamAccountName '912594'. This includes, but isn’t limited to, general user details, last logon, and the NT and LM hashes for the user's password.

Identifying what’s running on a live system

Memory is often the first port of call when identifying what is running live on a system. Much modern malware is fileless, so capturing memory is vital in order to identify malicious running processes, but memory isn’t only relevant to malware. ESE databases use memory extensively to store data.

Let’s look at an example, away from AD for a moment, but the same principles apply. Suppose an endpoint is running and the user is browsing with Internet Explorer, which stores internet artefacts in WebCacheV01.dat. Should this computer be subject to an unexpected shutdown, the ESE database may not have had time to commit transactions whose data still resided in memory. This means the most recent internet history may not be recorded. Pretty damning and, from a forensic perspective, evidence is lost!

Let’s play devil’s advocate for a moment and say a threat actor gained access to your Windows server and deleted some users from the AD. While an analyst may look to other artefacts to confirm this, we could still do some more digging to find deleted content. This is where the initial strategy needs to be determined, before any of this work is carried out.

By returning an ESE database to a clean state, we’re potentially writing data onto pages where historic and deleted data used to reside. Instead, we could first try to carve from these pages, before carrying out a recovery using esentutl. Better yet, let’s also carve from all of the log files which may still reside within the directory path, as these could contain historic and deleted data. 
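As a very rough first pass at carving, you can search the raw bytes of the database and log files for a known value before any recovery is run. The sketch below assumes the text of interest is stored as UTF-16LE, which is common for Unicode text columns in ESE but not guaranteed, and that you are working on copies of the files. The function names and file patterns are illustrative, and the search term reuses the SamAccountName from the earlier example.

```python
from pathlib import Path
from typing import Iterator, List, Tuple


def find_utf16_hits(data: bytes, needle: str) -> List[int]:
    """Return every byte offset where needle occurs as UTF-16LE text."""
    pattern = needle.encode("utf-16-le")
    hits: List[int] = []
    start = 0
    while (pos := data.find(pattern, start)) != -1:
        hits.append(pos)
        start = pos + 1
    return hits


def scan_files(directory: str, needle: str,
               patterns: Tuple[str, ...] = ("*.dit", "*.log")) -> Iterator[Tuple[Path, int]]:
    """Yield (file, offset) for each hit in matching files.

    Reads each file whole for simplicity; very large files would be
    better handled in chunks.
    """
    for pattern in patterns:
        for path in sorted(Path(directory).glob(pattern)):
            for offset in find_utf16_hits(path.read_bytes(), needle):
                yield path, offset
```

Dividing a hit's offset by the page size tells you which page to open in a hex editor. The same find_utf16_hits function can be pointed at the bytes of a memory image.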

Why not go one further and dive into the memory data and see what we can find? The golden nugget may well be there waiting to be parsed out, yet to be written into the main ESE database. 

While this blog post has focused on AD, Microsoft has implemented ESE in Exchange Server, Windows Search, Windows Phone and numerous other applications, so the same principles for investigating AD data can be applied in other disciplines.