Abhijit Kulkarni and Prakash Jagdale discuss why most real-time anti-virus scanners are ineffective at detecting malware written using the TxF facility and propose a working solution for the problem.
Copyright © 2010 Virus Bulletin
Transactional NTFS (TxF) integrates transactions into the NTFS file system so that file operations enjoy the ACID properties (Atomicity, Consistency, Isolation and Durability) of transactions. TxF improves application reliability and keeps data consistent even in the event of system failure. Given the benefits of TxF and its presence in all the latest versions of the Windows operating system (starting from Windows Vista), its usage is only likely to increase in coming years.
This article discusses why most real-time anti-virus scanners are ineffective at detecting malware written using the TxF facility and proposes a working solution for the problem.
Concepts associated with TxF – such as the Kernel Transaction Manager (KTM), Distributed Transaction Coordinator (DTC), Transaction Manager (TM), Resource Manager (RM), Secondary Resource Manager, deployment scenarios, performance considerations and internals of TxF – are beyond the scope of this article and hence not discussed. Moreover, only relevant TxF APIs that are available in user mode and kernel mode are discussed.
TxF is a component of Windows Vista and later operating systems and allows files and directories to be modified, created, renamed and deleted atomically. TxF is implemented on top of a kernel component called the Kernel Transaction Manager (KTM), which provides transaction support for objects in the kernel. TxF allows file operations on an NTFS file system volume to be performed in a transaction.
TxF improves error recovery and reliability. It can simplify an application’s error-handling code and improve performance. Consider an example where an application is saving a file. If the application/machine were to crash while writing the file, then only part of the file could be written, possibly resulting in a corrupted file. This would be a very significant problem if a previous version of the file was being overwritten, as data would likely be lost. In a traditional (non-TxF) application a lot of error-handling code would have to be written to handle all the failure cases, thereby increasing the application’s complexity. Let’s look at more scenarios where we can use TxF.
The updating of a file is a common and typically simple operation. However, if the system or application fails while an application is updating information on a disk, the result can be catastrophic, because the user data can be corrupted by a file update operation that is partially completed. Robust applications often perform complex sequences of file copies and file renames to ensure that data is not corrupted if a system fails. TxF makes it simple for an application to protect file update operations from system or application failure. To update a file safely, the application opens it in transacted mode, makes the necessary updates, and then commits the transaction. If the system or application fails during the file update, then TxF automatically restores the file to the state it was in before the file update began, thus avoiding file corruption. TxF is even more important when a single logical operation affects multiple files. For example, if one wants to use a tool to rename one of the HTML or ASP pages on a website, a well-designed tool also fixes all links throughout the site to use the new file name. However, a failure during this operation would leave the website in an inconsistent state, with some of the links still referring to the old file name. By making the file-rename operation and the link-fixing operation a single transaction, TxF ensures that the two actions succeed or fail as a single operation.
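The safe-update behaviour described above can be illustrated with a toy in-memory model. This is only a sketch: `ToyTransactedFile` and `update` are hypothetical names, nothing here calls the Windows API, and the `try`/`except` merely stands in for the recovery that TxF performs on its own after a real crash.

```python
class ToyTransactedFile:
    """Toy in-memory stand-in for one file under TxF (not a Windows API)."""

    def __init__(self, committed: bytes):
        self.committed = committed  # contents visible outside any transaction
        self.pending = None         # uncommitted contents while a transaction is open

    def begin(self):
        self.pending = self.committed

    def write(self, data: bytes):
        self.pending = data

    def commit(self):
        self.committed, self.pending = self.pending, None

    def rollback(self):
        self.pending = None


def update(file, new_data, fail_midway=False):
    """Replace the contents atomically; a failure leaves the old version intact."""
    file.begin()
    try:
        file.write(new_data[: len(new_data) // 2])  # partially completed update
        if fail_midway:
            raise RuntimeError("simulated crash during update")
        file.write(new_data)
        file.commit()
    except RuntimeError:
        # Assumption: stands in for TxF restoring the pre-update state.
        file.rollback()


doc = ToyTransactedFile(b"version 1")
update(doc, b"version 2 (complete)", fail_midway=True)
after_crash = doc.committed      # the half-written data never became visible
update(doc, b"version 2 (complete)")
after_success = doc.committed
```

The point of the model is that the partially written data only ever lives in the pending state, so a failure before the commit cannot corrupt the committed version.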
TxF isolates concurrent transactions. If an application opens a file for a transactional read while another application has the same file open for a transactional update, TxF isolates the effects of the two transactions from one another. In other words, the transactional reader always views a single, consistent version of the file, even while that file is in the process of being updated by another transaction. An application can use this functionality to allow customers to view files while other customers make updates. For example, a transactional web server can provide a single, consistent view of files while another tool updates those files.
TxF helps improve application and platform stability and increase innovation. Let’s see how.
TxF improves application stability by reducing or eliminating the amount of error-handling code that needs to be written and maintained for a given application. This ultimately reduces application complexity and makes the application easier to test. Say, for instance, you are developing a document management system where an SQL data source needs to be kept consistent with a file store on disk. Ensuring this consistency can be tricky in a non-transactional system. Without transactional file operations, it would be nearly impossible to account for every possible failure scenario, up to and including the operating system crashing at any imaginable point during the process. One of the ways in which this was handled in the past was by storing the new version of the file with a temporary file name, writing the new data to the SQL database, and then renaming the temporary file to the real file name when committing the SQL transaction. But consider what happens if the application crashes or if there is a power outage right after committing to the SQL database but before the file is renamed. Not only would you have an inconsistent data set, but you would also have an artefact on the file system that you would have to clean up at some point. As usual, the difficulty lies in the details of how many different ways the process can fail.
Rather than having to be implemented by each developer, TxF incorporates transaction management capabilities directly into the platform. And since TxF is embedded in the system itself, it is capable of providing a level of integration that would not otherwise be possible for applications.
Using the same example, how can you ensure consistency within a document management application with TxF? This is where the DTC is helpful. To absolutely ensure consistency between your SQL database and your file store, you can start a transaction, perform your SQL statements and file operations within that same transaction, and then commit or roll back the complete transaction based on the outcome. If your SQL call fails, the file will never be written. If your file system call fails, your SQL is rolled back. Everything remains consistent, and all of this is handled automatically by the platform since the operations are enlisted within a transaction. The result of this is less code, which makes the application more robust.
As for platform stability, Microsoft has used TxF in its own technologies, such as Windows Update and System Restore. These write files to the file system within the scope of a transaction so that the changes can be committed or rolled back cleanly if something goes wrong, such as a system reboot caused by a loss of power. By adopting TxF internally, Microsoft has helped make its own operating system more stable.
Finally, TxF drives innovation by providing a framework for using transactions outside of SQL calls. Ultimately, TxF can fundamentally change the way developers write applications, allowing them to build more robust code. By incorporating transactions into your design, you can write code without having to account for every single possible failure that can occur. The operating system will take care of those mundane details.
Let’s look at the fundamentals of TxF that will help us to understand the issue explored in the rest of this article.
A ‘transacted writer’ refers to a transacted file handle opened with any permission that is not part of generic read access but is part of generic write access. A transacted writer views the most recent version of a file that includes all of the changes by the same transaction. There can be only one transacted writer per file. Non-transacted writers are always blocked by a transacted writer, even if the file is opened with shared-write permissions.
A ‘transacted reader’ refers to a transacted file handle opened with any permission that is a part of generic read access but is not part of generic write access. A transacted reader views a committed version of the file that existed at the time the file handle was opened. The transacted reader is isolated from the effects of transacted writers. This provides a consistent view of the file only for the life of the file handle and blocks non-transacted writers.
Note that when a handle has been opened for modification with the CreateFileTransacted function, all subsequent opens of the file within that transaction – whether read-only or not – are converted by the system to be a transacted writer for the purposes of isolation and other transactional semantics. This means that subsequently, when a handle is opened for read-only access, the handle does not receive a view of the file prior to the start of the transaction; it receives the active transaction view of the file.
A non-transacted file handle does not see any of the changes made within a transaction until the transaction is committed. The non-transacted file handle receives an isolated view that is similar to a transacted reader, but unlike a transacted reader, it receives the file update when a transacted writer commits the transaction.
TxF provides read-committed isolation. This means that file updates are not seen outside the transaction. In addition, if a file is opened more than once while reading files within the transaction, you may see different results with each subsequent opening. Files that were available the first time you accessed them may not later be available (because they have been deleted), or vice versa.
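The three kinds of view described above (transacted writer, transacted reader and non-transacted handle) can be contrasted in a small toy model. All class names below are hypothetical and the model is a deliberate simplification: it tracks a single file and a single transacted writer, and does not model locking.

```python
class ToyFile:
    def __init__(self, data: bytes):
        self.committed = data    # last committed contents
        self.uncommitted = None  # contents being built by the transacted writer


class TransactedWriter:
    """Sees the most recent version, including its own uncommitted changes."""

    def __init__(self, f):
        self.f = f
        f.uncommitted = f.committed

    def write(self, data):
        self.f.uncommitted = data

    def read(self):
        return self.f.uncommitted

    def commit(self):
        self.f.committed, self.f.uncommitted = self.f.uncommitted, None


class TransactedReader:
    """Sees the committed version that existed when the handle was opened."""

    def __init__(self, f):
        self.snapshot = f.committed

    def read(self):
        return self.snapshot


class NonTransactedReader:
    """Sees only committed data, but picks up updates once they are committed."""

    def __init__(self, f):
        self.f = f

    def read(self):
        return self.f.committed


f = ToyFile(b"old")
reader = TransactedReader(f)
plain = NonTransactedReader(f)
writer = TransactedWriter(f)
writer.write(b"new")

views_before_commit = (writer.read(), reader.read(), plain.read())
writer.commit()
views_after_commit = (reader.read(), plain.read())
```

Before the commit only the writer sees the new contents. After the commit the non-transacted reader sees them too, while the transacted reader keeps the view it had when its handle was opened.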
Creating a transacted writer on a file locks the file transactionally. After a file is locked by a transaction, other file system operations external to the locking transaction that try to modify the locked file will fail with either ERROR_SHARING_VIOLATION or ERROR_TRANSACTIONAL_CONFLICT. Table 1 summarizes transactional locking.
|File currently opened by|File open attempted by|
1 Fails with ERROR_TRANSACTIONAL_CONFLICT
2 Fails with ERROR_SHARING_VIOLATION
Table 1. Transactional locking.
The following series of steps represents the most basic use of TxF. More complex scenarios are also supported, at the discretion of the application designer.
1. Create a transaction by calling the KTM function CreateTransaction or by using the IKernelTransaction interface of the Distributed Transaction Coordinator (DTC).
2. Get transacted file handle(s) by calling CreateFileTransacted.
3. Modify the file(s) as necessary using the transacted file handle(s).
4. Close all transacted file handles associated with the transaction created in step 1.
5. Commit or abort the transaction by calling the corresponding KTM or DTC function.
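The order of these five steps can be walked through with a toy model. This is a sketch under stated assumptions: `ToyTransaction` and `ToyHandle` are hypothetical stand-ins for a KTM transaction and a transacted file handle, the 'disk' is a plain dictionary, and no Windows function is called.

```python
class ToyTransaction:
    """Hypothetical stand-in for a KTM transaction holding staged file contents."""

    def __init__(self, disk):
        self.disk = disk       # path -> committed bytes
        self.staged = {}       # path -> bytes written inside the transaction
        self.open_handles = 0

    def open_file(self, path):
        self.open_handles += 1
        return ToyHandle(self, path)

    def commit(self):
        self.disk.update(self.staged)  # all staged files become visible together
        self.staged = {}

    def abort(self):
        self.staged = {}               # nothing staged ever reaches the disk


class ToyHandle:
    def __init__(self, txn, path):
        self.txn, self.path = txn, path

    def write(self, data):
        self.txn.staged[self.path] = data

    def close(self):
        self.txn.open_handles -= 1


disk = {"index.html": b"v1", "about.html": b"v1"}

txn = ToyTransaction(disk)            # step 1: create the transaction
index = txn.open_file("index.html")   # step 2: get transacted handles
about = txn.open_file("about.html")
index.write(b"v2")                    # step 3: modify through the handles
about.write(b"v2")
mid_transaction = dict(disk)          # what an observer outside the transaction sees
index.close()                         # step 4: close all transacted handles
about.close()
txn.commit()                          # step 5: commit (or txn.abort())
```

Both files change together at the commit; until then an observer outside the transaction still sees the old contents of each.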
The TxF programming model has the following key points to consider when you develop a TxF application:
It is highly recommended that an application close all transacted file handles before committing or rolling back a transaction. The system invalidates all transacted handles when a transaction ends. Any operation (except close) performed on a transacted handle after the transaction ends returns the following error: ERROR_HANDLE_NO_LONGER_VALID.
A file is viewed as a unit of storage. Partial updates and complete file overwrites are supported. Multiple transactions cannot modify the same file concurrently.
Memory mapped I/O is transparent and consistent with the regular file I/O. An application must flush and close an opened section before committing a transaction. Failure to do this can result in partial changes to the mapped file within the transaction. A rollback will fail if this is not done.
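The first of these points, handle invalidation, can be sketched as follows. The classes are hypothetical, and the exception is only a toy counterpart of the ERROR_HANDLE_NO_LONGER_VALID error code.

```python
class HandleNoLongerValid(Exception):
    """Toy counterpart of ERROR_HANDLE_NO_LONGER_VALID."""


class ToyTxn:
    def __init__(self):
        self.active = True

    def commit(self):
        self.active = False  # the transaction has ended


class ToyTransactedHandle:
    def __init__(self, txn):
        self.txn = txn
        self.closed = False

    def write(self, data: bytes) -> int:
        if not self.txn.active:
            raise HandleNoLongerValid("ERROR_HANDLE_NO_LONGER_VALID")
        return len(data)

    def close(self):
        self.closed = True   # close is the one operation still allowed


txn = ToyTxn()
handle = ToyTransactedHandle(txn)
written = handle.write(b"data")   # fine while the transaction is active

txn.commit()                      # committed with the handle still open
try:
    handle.write(b"more")
    late_write_failed = False
except HandleNoLongerValid:
    late_write_failed = True

handle.close()                    # still permitted after the transaction ends
```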
A real-time anti-virus scanner on Windows typically has a kernel-mode component which is a file system filter driver. If it is based on the old file system filter model, the filter is called a legacy filter, and if it is based on Microsoft’s new Filter Manager model it is known as a minifilter. Though both models are supported on Windows Vista, the legacy filter style is not supported on Windows 7. Hence this article will only discuss minifilters.
There is no generic design for implementing the real-time scanner but a typical implementation scans the file when it is closed. The following is the typical way in which a real-time scanner is implemented:
1. File c:\target.exe is closed.
2. Minifilter is called at its file close callback for the file c:\target.exe.
3. Minifilter scans the file c:\target.exe (with the help of a user-mode application).
With this knowledge and the TxF concepts that we have already discussed, let’s see how malware can exploit this. Figure 1 illustrates the problem.
Let’s describe the steps shown in Figure 1:
1. Malware.exe creates a transaction by calling CreateTransaction.
2. Malware.exe opens a transacted file handle for the file c:\target.exe by calling CreateFileTransacted.
3. Malware.exe writes malicious code in the file c:\target.exe by calling WriteFile.
4. The malware closes the transacted file handle to the file c:\target.exe by calling CloseHandle.
5. The real-time scanner is called at its file close callback for the file c:\target.exe.
6. The real-time scanner scans the file c:\target.exe.
7. The malware commits the transaction using CommitTransaction.
8. The malware closes the transaction using CloseHandle (on the transaction handle obtained in Step 1).
Consider Step 6. Here the real-time scanner is a non-transacted reader, so it gets an isolated view of the file c:\target.exe and reads the contents that were present before the transaction started. As a result, the real-time scanner fails to detect the malware written to c:\target.exe.
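The sequence can be replayed in a toy model to show why the scan at Step 6 comes back clean. Everything here is an assumption made for illustration: `ToyVolume` is a hypothetical in-memory volume, `SIGNATURE` is a made-up byte pattern, and the scanner is reduced to a substring check on the committed (non-transacted) view.

```python
SIGNATURE = b"EVIL"  # hypothetical pattern the toy scanner looks for


class ToyVolume:
    def __init__(self):
        self.committed = {}  # path -> bytes visible to non-transacted readers
        self.staged = {}     # path -> bytes written inside the open transaction
        self.scan_log = []   # (event, path, detected)

    def scan(self, event, path):
        # The scanner is a non-transacted reader: it sees committed data only.
        data = self.committed.get(path, b"")
        self.scan_log.append((event, path, SIGNATURE in data))

    def transacted_write(self, path, data):
        self.staged[path] = data

    def close_handle(self, path):
        self.scan("close", path)  # scan-on-close, as in Steps 5 and 6

    def commit(self):
        self.committed.update(self.staged)
        self.staged = {}


target = r"c:\target.exe"
vol = ToyVolume()
vol.committed[target] = b"clean program"

vol.transacted_write(target, b"EVIL payload")  # Steps 1 to 3
vol.close_handle(target)                       # Steps 4 to 6: the scan sees old data
vol.commit()                                   # Step 7: malicious data becomes visible

infected_after_commit = SIGNATURE in vol.committed[target]
```

The only scan recorded is the one at close time, which reports a clean file, yet the committed file contains the malicious bytes once the transaction has ended.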
In the above case the real-time scanner could not see the changes made within the transaction since the scanner was outside the transaction. Neither did it scan the file when the transaction was committed. To solve this issue the file must be scanned when the transaction is committed.
One way to achieve this is for the real-time scanner to be notified when the transaction is committed. This can be achieved by using the FltEnlistInTransaction API provided by the Filter Manager. The FltEnlistInTransaction routine enlists a minifilter driver in a given transaction. The minifilter must call this API with an appropriate mask, usually FLT_MAX_TRANSACTION_NOTIFICATIONS. A minifilter driver that is enlisted in a transaction will receive a TRANSACTION_NOTIFY_COMMIT_FINALIZE notification when the transaction is fully committed. A minifilter that performs scans outside of transactions can use this notification value to determine when to begin scanning files.
To send the TRANSACTION_NOTIFY_COMMIT_FINALIZE notification to the minifilter driver, the Filter Manager calls the minifilter driver’s TransactionNotificationCallback routine. This routine is registered through the FLT_REGISTRATION structure passed to FltRegisterFilter during minifilter initialization.
In the TransactionNotificationCallback routine the minifilter driver acknowledges this notification in one of two ways:
The minifilter driver’s TransactionNotificationCallback routine performs any necessary processing (typically scanning the files) and returns STATUS_SUCCESS. (In this case, the minifilter driver does not call FltCommitFinalizeComplete.)
The minifilter driver’s TransactionNotificationCallback routine posts any necessary processing to a worker thread and returns STATUS_PENDING. After performing the processing asynchronously, the minifilter driver’s worker thread routine must call FltCommitFinalizeComplete to indicate that it has finished this processing. If the minifilter driver’s worker thread routine does not call FltCommitFinalizeComplete, certain system resources will be leaked.
So at least three places in the minifilter need changes: the DriverEntry routine (where the notification callback is registered), the Post-Create routine (where the minifilter can enlist in the transaction) and the TransactionNotificationCallback routine (where the scan is triggered).
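The control flow of the proposed fix can be modelled in a few lines. This is a toy sketch, not driver code: `ToyMinifilter` and its methods are hypothetical, `post_create` stands in for enlisting via FltEnlistInTransaction, and `on_commit_finalize` stands in for handling TRANSACTION_NOTIFY_COMMIT_FINALIZE in the TransactionNotificationCallback routine.

```python
SIGNATURE = b"EVIL"  # hypothetical pattern the toy scanner looks for


class ToyMinifilter:
    def __init__(self, committed):
        self.committed = committed  # path -> bytes visible outside transactions
        self.enlisted = {}          # transaction id -> paths opened within it
        self.detections = []        # (event, path)

    def post_create(self, txn_id, path):
        # Remember which files were opened inside which transaction.
        self.enlisted.setdefault(txn_id, set()).add(path)

    def on_close(self, path):
        if SIGNATURE in self.committed.get(path, b""):
            self.detections.append(("close", path))

    def on_commit_finalize(self, txn_id):
        # The transaction is fully committed, so the new contents are visible.
        for path in sorted(self.enlisted.pop(txn_id, set())):
            if SIGNATURE in self.committed.get(path, b""):
                self.detections.append(("commit", path))


target = r"c:\target.exe"
committed = {target: b"clean program"}
staged = {}
scanner = ToyMinifilter(committed)

scanner.post_create(1, target)      # transacted open of the target
staged[target] = b"EVIL payload"    # transacted write, invisible outside
scanner.on_close(target)            # the scan at close still sees clean data
detections_before_commit = list(scanner.detections)

committed.update(staged)            # the transaction commits
scanner.on_commit_finalize(1)       # the commit notification triggers a rescan
```

In this model the scan at close time finds nothing, and the detection is made only when the commit notification arrives, which mirrors the reasoning behind scanning at TRANSACTION_NOTIFY_COMMIT_FINALIZE.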
We have discussed the basic concepts of TxF and its advantages. We have also outlined a loophole in which malware written using TxF can go undetected by real-time anti-malware scanners, and we have proposed changes that can be made to the kernel-mode component (minifilter) of a real-time scanner so as to close that loophole.
 Microsoft Developer Network. Transactional NTFS (TxF). http://msdn.microsoft.com/en-us/library/bb968806(VS.85).aspx.
 Windows Driver Kit. FltXxx (Minifilter Driver) Routines. http://msdn.microsoft.com/en-us/library/aa488414.aspx.
 Olson, J. Enhance Your Apps With File System Transactions. MSDN Magazine, July 2007. http://msdn.microsoft.com/en-us/magazine/cc163388.aspx.