Saturday, May 19, 2012

NIDS need an Interface for Object Analysis

The information security community needs NIDS projects to provide an interface for analysis of objects transferred through the network. Preferably, this interface would be standardized across NIDS. Here I will provide my rationale for this assertion and some opinions on what that should look like.

Why

Security practitioners have benefited from project/vendor-agnostic interfaces such as Milter, ICAP, etc. in their respective realms. These interfaces provide the operator with compatibility and, therefore, flexibility. In this paradigm, the operator can choose, for example, a single mail server and then independently choose one or more mail filters. Decoupling the two allows the operator to select each product based on how well it does its primary job: mail transfer and mail filtering in our example. This is a classic example of the Unix philosophy of doing one thing well.
The NIDS world needs the same thing. We need a standardized interface that allows NIDS to focus on analyzing network traffic and allows a quasi-independent client object analysis system to focus on analyzing objects transferred through the network.
In the past it has been hard for me to understand why NIDS haven’t followed the huge shift up the stack from exploits that occur at the network layer (ex. Sasser worm or wu-ftpd exploits) to exploits that occur in client objects (ex. PDF exploits or the Mac Flashback Trojan). I’ve heard many people say NIDS are for detecting network exploits and if you want to detect client exploits, use something on the client. I would note that this mentality has been an excuse for some NIDS to not do full protocol decoding (ex. base64 decoding of MIME, gzip decoding of HTTP, etc). This has left network defenders with the motive (detect badness as soon as possible, preferably before it gets to end hosts), the opportunity (NIDS see the packets bearing badness into the network), but without the means (NIDS are blind to many forms of client attacks).
To be fair to the various NIDS projects, every relevant NIDS is taking steps to support some amount of client payload analysis. Actually, it seems like it is a relatively high priority for many of them. I know the Bro guys are diligently working on a file analysis framework for Bro 2.1. Furthermore, there are some very valid reasons why it’s in the security community’s best interest to abstract NIDS from directly detecting client object malfeasance. First, back to the Unix philosophy mentioned above, we want our NIDS to be good at network protocol analysis. I can think of plenty of room for improvement in NIDS while remaining focused on network protocol analysis. Second, and central to the argument made here, we don’t want to fragment our client payload analysis just because of differences in where files are found (ex. network vs. host). To illustrate this last point, think about AV. Imagine if you had to have a separate AV engine, signature language, etc. just because of where the object came from, even though you are looking at the exact same objects. This, again, is one of the main reasons Milter, ICAP, etc. exist. I cite VRT’s Razorback as a great example of this. From the beginning it supported inputs from various sources so the analysis performed could be uniform, regardless of the original location of the objects. Lastly, one could argue that network protocol analysis and client object analysis are different enough that they require frameworks that look different because they tackle different problems. I’ve built both NIDS and file analysis systems. While they do similar things, their goals, and therefore engineering priorities, are largely different. Anecdotally, I see NIDS focusing on reassembly and object analyzers focusing on decoding, NIDS caring about timing of events while file analyzers largely consider an object the same independent of time, etc.
What I’m arguing is that NIDS should be separated or abstracted from client file analysis frameworks. This certainly doesn’t mean a NIDS vendor/project can’t also have a complementary file analysis framework. Again, note Razorback and its association with Snort. What I would like to see, however, is some sort of standard interface used between the NIDS and the file analysis framework. Why can’t I use Razorback with Bro or Suricata? Sure, I could with a fair amount of coding, but it could and should be easier than that. Instead of hacking the NIDS itself, I should be able to write some glue code that would be roughly equivalent to the glue required to create a milter or ICAP service for the file analysis service. Furthermore, this interface needs to move beyond the status quo (ex. dump all files in a directory) to enable key features such as scan results/object metadata being provided back to the NIDS engine.
Lastly, the existence of a standard interface would make it easier for users to demand client object analysis functionality for things flowing through the network while making it easier for NIDS developers to deliver this capability.
The security community needs a cross-NIDS standard for providing client objects to an external analysis framework.

What

The following are my ideas on what this interface between a NIDS and a file analysis framework should look like.

Division of Labor

It is probably best to start by defining more clearly what should go through this interface for embedded object analysis. When I say client objects I mean basically anything transferred through the network that makes sense as a file when divorced from the network. My opinion is that all things that are predominately transferred through the network are candidates for analysis external to the NIDS. Canonical examples of what I mean include PDF documents, HTML files, Zip files, etc. Anything that is predominately considered overhead used to transfer data through the network should be analyzed by the NIDS. Canonical examples include packet headers, HTTP protocol data, etc.
Unfortunately, the line gets a little blurry when you consider things like emails, certificates, and long text blobs like directory listings. I don’t see this ambiguity as a major issue. In an ideal world, the NIDS framework would support external analysis of objects, even if it can also do some analysis internally.
I will discuss this in greater detail later, but the NIDS also has the responsibility for any network layer defragmentation, decoding, etc.

An API, not a Protocol

Fundamentally, I see this as an API backed by a library, not a network protocol (like ICAP, Milter, etc). In the vast majority of cases, I see this library providing communication between a NIDS and a client object analyzer on a single system with mechanisms that support low latency and high performance. In many cases, I expect the file analysis framework may well go and do other things with the object, such as transfer it over the network or write it to disk. The type of object analysis frameworks I’m envisioning would likely have a small stub that would talk to the NIDS interface API on one side and to the object analysis framework’s submission system on the other. Using Razorback’s parlance, a relatively small “Collector” would use this API to get objects from the NIDS and send them into the framework for analysis.
I see the various components looking roughly as follows:

+------+  objects and metadata   +------+  /------>
|      | ---> +-----------+ ---> | File | /  Object
| NIDS |      | Interface |      | Anal-| -------->
|      | <--- +-----------+ <--- | ysis | \ Routing
+------+   response (optional)   +------+  \------>
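
To make the diagram a little more concrete, here is a minimal Python sketch of the middle box. Everything in it is an assumption made for illustration: the names ObjectInterface, register, submit, and collector are hypothetical and do not come from any existing NIDS or from Razorback. It only shows the shape of the idea: the NIDS submits an object plus metadata, and each registered analyzer may or may not answer.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical types: an analyzer takes the object bytes and its metadata and
# may return a response dict, or None when it offers no feedback.
Metadata = Dict[str, str]
Response = Optional[Dict[str, str]]
Analyzer = Callable[[bytes, Metadata], Response]


class ObjectInterface:
    """The box in the middle of the diagram: the NIDS submits, analyzers receive."""

    def __init__(self) -> None:
        self._analyzers: List[Analyzer] = []

    def register(self, analyzer: Analyzer) -> None:
        """Called by the analysis framework's stub (a "Collector")."""
        self._analyzers.append(analyzer)

    def submit(self, data: bytes, metadata: Metadata) -> List[Dict[str, str]]:
        """Called by the NIDS; returns whatever optional responses came back."""
        responses = []
        for analyzer in self._analyzers:
            response = analyzer(data, metadata)
            if response is not None:
                responses.append(response)
        return responses


def collector(data: bytes, metadata: Metadata) -> Response:
    # A stand-in stub: a real one would hand the object to its framework.
    return {"size": str(len(data)), "source": metadata.get("src_ip", "unknown")}


interface = ObjectInterface()
interface.register(collector)
responses = interface.submit(b"%PDF-1.4 ...", {"src_ip": "192.0.2.10"})
```

A real library would be written against the NIDS' own language and memory model; the point here is only that the NIDS side and the framework side each see a very small surface.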

From the NIDS to the Object Framework

The primary thing the NIDS has to do is give an object to the external framework. This is fundamentally going to be a stream of bytes. It is most easily thought of as a file. The biggest question to me is who owns the memory and whether a buffer copy is involved. My opinion is that to make performance reasonable you have to be able to do some sort of analysis in the framework without a buffer copy. You need to be able to run the interface, and at least some simple object analyzers, on a buffer owned by the NIDS, probably running in the NIDS’ process space. Obviously, if you need to do extensive analysis, you’ll probably want it to be truly external. That’s fine: the framework can let you have the buffer and you can make a copy of it, etc.
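
As a rough illustration of the ownership question, the following Python sketch uses a read-only memoryview as a stand-in for "a buffer owned by the NIDS". The function looks_like_pdf and the variable names are hypothetical. A simple analyzer inspects the buffer in place, while an analyzer that wants to keep the object takes its own copy before the NIDS recycles the memory.

```python
def looks_like_pdf(view: memoryview) -> bool:
    """A trivial in-process analyzer that inspects the NIDS buffer in place."""
    return bytes(view[:5]) == b"%PDF-"  # copies only the 5 bytes it compares


nids_buffer = bytearray(b"%PDF-1.7\n...object body...")  # owned by the NIDS

# No copy is made here, and the analyzer cannot write through this view.
view = memoryview(nids_buffer).toreadonly()
in_place_verdict = looks_like_pdf(view)

# A truly external analyzer takes a private copy while the buffer is still valid.
private_copy = bytes(view)

# The NIDS then reuses its buffer; the view sees the change, the copy does not.
nids_buffer[:5] = b"XXXXX"
view_after_reuse = bytes(view[:5])
```

In C the same split would be a pointer and length that are only valid for the duration of the callback, with the analyzer responsible for copying anything it wants to keep.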
The other thing that the NIDS needs to provide to an external analyzer is the metadata associated with the object. At a minimum this needs to include enough to trace the object back to the network transaction (ex. IPs, ports, timestamp). Ideally this would include much more: things that are important for understanding the context of the object. I can imagine many of the HTTP headers, both request and response, being useful or even necessary for analysis. For example, the following are metadata items I’d like to see from HTTP for external analysis: Method, Resource, Response Code, Referer, Content-Type, Last-Modified, User-Agent, and Server.
Delving into details, I would think it prudent for the interface between the NIDS and the external framework to define the format for data to be transferred without specifying exactly what that data is. The interface should say something like use HTTP/SMTP style headers, YAML, or XML of this general format, and define some general standards without trying to define this exhaustively or make this too complicated. In most cases, it seems that identifiers can be a direct reflection of the underlying protocol.
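
One way to picture "define the format, not the content" is a header-style encoding. The sketch below is only an assumption about what such a format could look like: "Name: value" lines terminated by a blank line, with hypothetical helpers encode_metadata and decode_metadata. The field names in the example are illustrative; the interface itself never interprets them.

```python
from typing import Dict


def encode_metadata(fields: Dict[str, str]) -> bytes:
    """Serialize metadata as "Name: value" lines ended by a blank line."""
    lines = []
    for name, value in fields.items():
        if any(c in name for c in ":\r\n") or any(c in value for c in "\r\n"):
            raise ValueError(f"unsafe metadata field: {name!r}")
        lines.append(f"{name}: {value}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("utf-8")


def decode_metadata(blob: bytes) -> Dict[str, str]:
    """Parse the format above; the names are passed through uninterpreted."""
    fields = {}
    for line in blob.decode("utf-8").split("\r\n"):
        if not line:
            break  # a blank line terminates the header block
        name, _, value = line.partition(": ")
        fields[name] = value
    return fields


example = {
    "Src-IP": "192.0.2.10",
    "Dst-Port": "80",
    "Method": "GET",
    "Content-Type": "application/pdf",
    "Referer": "http://example.com/",
}
wire = encode_metadata(example)
```

Rejecting embedded line breaks matters because these values come straight from attacker-controlled traffic, and a smuggled newline would otherwise let one header forge another.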

From the Object Framework back to the NIDS

One important capability that this interface should enable is feedback from the external analysis framework back into the NIDS. I’d really like to see NIDS not only pass objects out for analysis but also accept data back, and I think this is crucial if they want to avoid marginalization. I see two important things coming back: a scan result and object metadata. In its simplest form the scan result is something like “bad” or “good” and could probably be represented with 1 bit. In reality, I can see a little more expressiveness being desired, like why the object is considered bad (signature name, etc). In addition to descriptions of badness or goodness, I see great value in the external framework being able to respond with metadata from the object (not necessarily an indication of good or bad). This could be all sorts of relevant data including the document author, PE compile time, video resolution, etc. This data can be audited immediately through policies in your NIDS (or SIMS) and can be saved for network forensics.
Of course, this feedback from the external object analysis system needs to be optional. I can think of perfectly good reasons to pass objects out of the NIDS without expecting any feedback. Basically all NIDS to external analysis connections are a one way street today. However, as this paradigm advances, it will be important to have this bi-directional capability. I can also think of plenty of things that a NIDS could do if it received feedback from the external analyzer. Having this capability, even if it is optional, is critical to developing this paradigm.
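
The two kinds of feedback described above, a verdict with an optional reason and free-form object metadata, could be carried in a record like the one sketched here. ScanResult and nids_events are hypothetical names, and the "watched authors" policy is just an invented example of a NIDS-side rule acting on returned metadata. A value of None stands for the one-way case where no feedback comes back at all.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ScanResult:
    """Hypothetical feedback record: a verdict plus optional detail."""
    malicious: bool
    signatures: List[str] = field(default_factory=list)     # why it is bad
    metadata: Dict[str, str] = field(default_factory=dict)  # facts about the object


def nids_events(result: Optional[ScanResult], watched_authors: List[str]) -> List[str]:
    """Turn feedback into NIDS events; None means the analyzer sent nothing back."""
    if result is None:
        return []
    events = []
    if result.malicious:
        reason = ", ".join(result.signatures) or "unnamed detection"
        events.append(f"alert: {reason}")
    author = result.metadata.get("author")
    if author in watched_authors:
        events.append(f"audit: document author {author}")
    return events
```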

Timing and Synchronization

An important consideration is the level of synchronization between the NIDS and the external object analyzer. I believe it would be possible for an interface to support a wide range, from basically no synchronization to low latency synchronization suitable for IPS. Obviously, the fire and forget paradigm is easy, and to support it nicely you could make feedback from the analyzer optional and make whatever scanning occurs asynchronous. However, an external analysis interface that expects immediate feedback can easily be made asynchronous by simply copying the inputs, returning a generic response, and then proceeding to do analysis. On the other hand, morphing an asynchronous interface into a synchronous one can be difficult. For that reason, it would be good for synchronous feedback to be built in from the beginning, even if it is optional.
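
The "copy the inputs, return a generic response, then analyze" trick can be sketched in a few lines. This is an illustration under assumptions, not a proposed implementation: DeferredAnalyzer, GENERIC_RESPONSE, and the dict-shaped responses are all hypothetical, and a single worker thread stands in for whatever the analysis framework really does.

```python
import queue
import threading
from typing import Callable, Dict, List

GENERIC_RESPONSE = {"verdict": "unknown"}  # hypothetical placeholder answer


class DeferredAnalyzer:
    """Wraps a slow analyzer behind a call that returns immediately."""

    def __init__(self, slow_analyze: Callable[[bytes, Dict[str, str]], Dict[str, str]]):
        self._slow_analyze = slow_analyze
        self._pending: "queue.Queue" = queue.Queue()
        self.completed: List[Dict[str, str]] = []
        threading.Thread(target=self._run, daemon=True).start()

    def analyze(self, data, metadata) -> Dict[str, str]:
        # Copy the inputs so the NIDS may reuse its buffers right away.
        self._pending.put((bytes(data), dict(metadata)))
        return dict(GENERIC_RESPONSE)

    def _run(self) -> None:
        while True:
            data, metadata = self._pending.get()
            try:
                self.completed.append(self._slow_analyze(data, metadata))
            except Exception as exc:  # keep the worker alive on analyzer errors
                self.completed.append({"error": str(exc)})
            finally:
                self._pending.task_done()

    def wait(self) -> None:
        """Block until queued work is done (useful for tests and shutdown)."""
        self._pending.join()
```

From the point of view of the NIDS the call is still synchronous; it simply always gets the generic answer, and the real result surfaces later on the framework side.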
Can this sort of interface be practical for IPS/inline use, or is it destined to be a technology used only passively? My answer is an emphatic yes, it can be practical inline. I challenge NIDS to find a way to allow external analysis whose latency is low enough to be reasonable for use inline. I’ll share my thoughts on payload fragmentation below, but even if objects can’t be analyzed until the NIDS has seen the whole object, the NIDS still has recourse if it gets an answer back from the external analyzer quickly enough. It would also be nice if NIDS started to support actions based on relatively high latency responses from an external analysis framework even if the payload in question has already been transferred through the network. Options for actions include things like passive alerting, adding items to black lists (IP, URL, etc) that can be applied in real time, and addition of signatures for specific payloads or hashes of specific payloads. It seems inevitable that the interface between the NIDS and the external analyzer will include some sort of configurable timeout capability to protect the NIDS from blocking too long.
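
A configurable timeout that still makes use of late answers might look roughly like this. The names analyze_with_timeout and late_results are hypothetical, and a thread pool is used purely as a convenient stand-in for an external analyzer. If the deadline passes, the NIDS gets None and moves on, while the eventual result is recorded so it can still drive passive alerting or blacklist updates.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

late_results = []  # answers that arrived after the NIDS stopped waiting
_pool = ThreadPoolExecutor(max_workers=4)


def _record_late(future) -> None:
    if future.exception() is None:
        late_results.append(future.result())


def analyze_with_timeout(analyze, data, metadata, timeout_s):
    """Return the analyzer's answer, or None if it misses the deadline."""
    future = _pool.submit(analyze, bytes(data), dict(metadata))
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # Too slow for inline use; keep the answer for high latency actions.
        future.add_done_callback(_record_late)
        return None


def slow_analyzer(data, metadata):
    time.sleep(0.2)  # simulate an analyzer that is too slow for inline use
    return {"verdict": "bad", "sha_hint": "late"}


fast = analyze_with_timeout(lambda d, m: {"verdict": "good"}, b"x", {}, 1.0)
slow = analyze_with_timeout(slow_analyzer, b"x", {}, 0.05)
```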

De-fragmentation and Decoding:

I envision a situation where the NIDS is responsible for network layer normalization and the file analysis framework does anything necessary for specific file formats. The NIDS is responsible for all network protocol decoding such as gzip and chunked encoding of HTTP. However, things like MIME decoding can be done internally to the NIDS or externally, depending on whether it is desired to pass out complete emails or individual attachments.
The NIDS should be responsible for any network layer reassembly/de-fragmentation required (ex. IPfrag, TCP, etc). I think some exceptions to this are reasonable. First of all, it makes a lot of sense for the interface to allow payloads to be passed externally in sequential pieces, with the NIDS responsible for all the re-ordering, segment overlap deconfliction, etc. necessary. This would support important optimizations on both sides of the interface. It should also be considered acceptable for the NIDS to punt in situations where the application, not the network stack, fragments the payload but that fragmentation is visible in the network application layer protocol. For example, it would be desirable for the NIDS to reassemble HTTP 206 fragments, if possible, but I can see the argument that this reassembly can be pushed onto the external analyzer, especially considering that there are situations where completely reassembling the payload object simply isn’t possible.
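
Passing an object in sequential pieces could be as simple as a feed/finish pair. In this sketch the class StreamingObject and its methods are hypothetical; it assumes the NIDS has already put the pieces in order and removed overlaps. The analyzer hashes and sizes the object incrementally and only commits to a (very crude) type guess once the NIDS signals the end of the object.

```python
import hashlib


class StreamingObject:
    """Receives one object as in-order pieces already reassembled by the NIDS."""

    def __init__(self, metadata: dict) -> None:
        self.metadata = metadata
        self._digest = hashlib.sha256()
        self._head = b""  # first few bytes, kept for magic number checks
        self.size = 0

    def feed(self, piece: bytes) -> None:
        if len(self._head) < 8:
            self._head += bytes(piece[: 8 - len(self._head)])
        self._digest.update(piece)
        self.size += len(piece)

    def finish(self) -> dict:
        kind = "zip" if self._head.startswith(b"PK\x03\x04") else "unknown"
        return {"sha256": self._digest.hexdigest(), "size": str(self.size), "type": kind}


obj = StreamingObject({"src_ip": "192.0.2.10"})
for piece in (b"PK", b"\x03\x04", b"rest"):  # arbitrary segment boundaries
    obj.feed(piece)
summary = obj.finish()
```

Neither side ever needs to hold the whole object in memory for this kind of analysis, which is one of the optimizations that piecewise delivery makes possible.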
It should be clear that any sort of defragmentation required for analyzing files, such as javascript includes or multi-file archives, is the responsibility of the object analysis framework.

Filtering

To be effective, any system of this sort should support filtering of objects before passing them out for external analysis. While there is always room for improvement, I think most NIDS are already addressing this in some manner. The external file analysis framework probably needs to support some filtering internally, in order to route files to the appropriate scanning modules. Razorback is a good example here.
It would be possible for the interface between the two to support some sort of sampling of payloads whereby the external analyzer is given the first segment of a payload object and given the option to analyze or ignore the rest of it. I consider this an unnecessary complication. Some would want the interface to support feedback every time the NIDS provides a payload segment to the external analyzer (assuming payloads are provided in fragments), and that could be a natural fit. I personally don’t see this as a top priority because unless you are doing the most superficial analysis or are operating on extremely simple objects, the external analysis framework will not be able to provide reliable feedback until the full object has been seen. While both the NIDS and the external analysis framework will likely want to implement object filtering, there is no strong driver for this filtering to be built directly into the interface, and doing so is probably just extra baggage.

Use Cases

To make sure what I’m describing is clear, I offer up four use cases that cover most of the critical requirements. These use cases should be easy to implement in practice and would likely be useful to the community.
Use Case 1: External Signature Matching
Use something like yara to do external signature matching. Yara provides greater expressiveness in signatures than most NIDS natively support and many organizations have a large corpus of signatures for client objects like PDF documents, PE executables, etc. If not yara, then another signature matching solution, like one of the many AV libraries (say clam) would be an acceptable test. The flags/signature names should be returned to the NIDS. This should be fast enough that this can be done synchronously.
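
To show the shape of this use case without depending on yara itself, the sketch below uses plain byte-string matching as a deliberately simplistic stand-in. The SIGNATURES table and match_signatures function are hypothetical and far less expressive than real yara rules; what matters is the contract: object bytes in, a list of signature names back to the NIDS, fast enough to call synchronously.

```python
from typing import Dict, List

# Hypothetical signatures: a name mapped to byte strings that must ALL appear.
SIGNATURES: Dict[str, List[bytes]] = {
    "pdf_with_javascript": [b"%PDF-", b"/JavaScript"],
    "pe_executable": [b"MZ", b"This program cannot be run in DOS mode"],
}


def match_signatures(data: bytes) -> List[str]:
    """Return the sorted names of every signature whose strings all occur."""
    return sorted(
        name
        for name, needles in SIGNATURES.items()
        if all(needle in data for needle in needles)
    )


sample = b"%PDF-1.5\n1 0 obj << /S /JavaScript /JS (app.alert(1)) >> endobj"
hits = match_signatures(sample)
```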
Use Case 2: External Object Analysis Framework
Use something like Razorback to do flexible and iterative file decoding and analysis. Since Razorback is purely passive and intentionally is not bound by tight real time constraints, no feedback is sent back into the NIDS and the scanning is done asynchronously. All alerting is done by Razorback, independent of the NIDS. For extra credit, splice in your interface between Snort as a Collector (used to collect objects to send into Razorback) and Razorback.
Use Case 3: External File Metadata Extraction
Use something to do client file metadata extraction. Feed this metadata back into the NIDS to be incorporated back into the signature/alerting/auditing/logging framework. Unfortunately, I don’t have any really good recommendations of specific metadata extractors to use. One could use libmagic or something similar, but some NIDS already incorporate this or something like it to do filtering of objects, etc. Magic number inspection doesn’t quite get to the level of file metadata I’m looking for anyway. There are a few file metadata extraction libraries out there, but I can’t personally recommend any of them for this application. As such, I’ve written a simple C function/program, docauth.c, that does simple document author extraction from a few of the most commonly used (and pwnd) document formats. This should be easy enough to integrate with the type of interface I envision. In this case, you would be able to provide file metadata back into the NIDS. The NIDS could incorporate this metadata into its detection engine. This should be fast enough to be done synchronously.
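
For a sense of what author extraction involves, here is a small Python sketch for one format only. It is not docauth.c and makes no claim about how that program works. It relies on the fact that Office Open XML documents (docx, xlsx, pptx) are Zip archives that keep the author in the dc:creator element of docProps/core.xml; the function name ooxml_author is hypothetical.

```python
import io
import xml.etree.ElementTree as ET
import zipfile
from typing import Optional

# Dublin Core namespace used by the dc:creator element in docProps/core.xml.
DC_CREATOR = "{http://purl.org/dc/elements/1.1/}creator"


def ooxml_author(data: bytes) -> Optional[str]:
    """Return the dc:creator of an OOXML document, or None if unavailable."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            core = archive.read("docProps/core.xml")
        element = ET.fromstring(core).find(DC_CREATOR)
    except (zipfile.BadZipFile, KeyError, ET.ParseError):
        return None  # not a Zip, no core properties part, or malformed XML
    return element.text if element is not None else None
```

Anything running against objects pulled off the wire should also cap decompressed sizes and use a hardened XML parser, since both Zip and XML have well-known resource exhaustion tricks; the sketch omits that for brevity.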
Use Case 4: File Dumper
This is a lame use case, but I include it to show that the most common way to support external file analysis in NIDS, dumping files to a directory, can be achieved easily using the proposed interface. This relieves the NIDS of dealing with the files. The NIDS provides the file to be scanned and metadata about the file to the interface. The external analyzer simply writes this to disk. There is no feedback.
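
A file dumper on top of such an interface is only a handful of lines. In this sketch dump_object is a hypothetical analyzer; naming files by their SHA-256 and writing the metadata to a JSON sidecar are choices made for the example, not requirements of the interface.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def dump_object(data: bytes, metadata: dict, directory: Path) -> Path:
    """Write the object and a metadata sidecar to disk; nothing goes back to the NIDS."""
    directory.mkdir(parents=True, exist_ok=True)
    name = hashlib.sha256(data).hexdigest()  # identical objects share one file
    path = directory / name
    path.write_bytes(data)
    (directory / f"{name}.json").write_text(json.dumps(metadata, sort_keys=True))
    return path  # returned for the caller's own logging only


with tempfile.TemporaryDirectory() as tmp:
    saved = dump_object(b"%PDF-1.4", {"src_ip": "192.0.2.10"}, Path(tmp) / "objects")
    saved_ok = saved.read_bytes() == b"%PDF-1.4"
    sidecar = json.loads((saved.parent / f"{saved.name}.json").read_text())
```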

Requirements:

I’ll also provide what I think are reasonable requirements (intentionally lacking formality) for a standard interface used between a NIDS and an external client object analysis system.

The interface will provide the NIDS the capability to:
  • Pass files to an external system 
    • Files may be passed in ordered fragments
  • Pass metadata associated with the files to the external system
    • The metadata is provided in a standardized format, but the actual meaning of the data passed is not defined by the interface, rather it must be agreed upon by the NIDS and the external analyzer.
  • Optionally receive feedback from the external analyzer asynchronously
  • Optionally time out external analysis if it exceeds a configured threshold
The NIDS performs the following functionality:
  • Normalization (decoding, decompression, reassembly, etc) of all network protocols to extract files as would be seen by clients
  • Provide relevant metadata in conjunction with extracted files for external analysis
  • Optionally, provide mechanisms to filter objects being passed to external analyzer
  • Optionally, incorporate feedback from external analysis into the NIDS detection engine
The Object analysis framework performs the following:
  • Receive objects and metadata from the NIDS
  • Perform analysis, which may include object decoding
  • Optionally, provide feedback to the NIDS

Conclusion

I’ve provided my opinion that it is in the best interest of the security community to have a standardized interface between NIDS and client object analysis. This would provide flexibility for users. It would help NIDS remain relevant while allowing them to stay focused. I envision the interface as a library that supports providing objects and metadata out to an object analyzer and receiving feedback in return.
I’m confident I’m not the only one who sees the need for abstraction between NIDS and object scanning, and I hope this article helps advance this cause. NIDS developers are already moving towards greater support for object analysis. I’d be gratified if the ideas presented here help shape, or at least help confirm demand for, this functionality. It’s important that users request more mature support of object analysis in NIDS and push for standardization across the various projects/vendors.