Saturday, April 24, 2010

Vortex Howto Series: Near Real-Time IDS

This installment of the vortex howto series will build upon previous installments to demonstrate additional features of vortex relevant to implementing a near-real time IDS.

Most mainstream IDSs are extremely packet focused. There are many reasons for this, but at least one is to support IPS, where the “P” is for prevention. The rationale is that to block attacks, one must be able to decide whether to block or pass a packet in a very short period of time. Conventional IDSs therefore focus heavily on efficiency, usually having a very strict C API for analysis modules.

Vortex supports a very different philosophy. Vortex takes a stream-centric approach. The focus is on supporting analysis of the data traveling through the network, not the mechanism for transporting the data (packets). Vortex doesn’t even try to support preventing attacks but focuses on facilitating deep analysis of network payload data, especially processor-intensive or high-latency analysis. Vortex has a very flexible API, one which anyone familiar with Linux/Unix will appreciate. I think of it as a find command for network payload data.

For this installment we’re going to improve upon the example provided in the readme. We’re going to use ssdeep to do fuzzy hash comparisons against known attack signatures. We’ll call our IDS ssdeep-n. We’re using ssdeep because it’s relatively computationally expensive. Actually, it’s extremely slow. While ssdeep has a very easy to use API, we’re intentionally not going to use it because we want to demonstrate the ability to use vortex to take any Unix command line tool and use it for network analysis.

So without further ado, here is our analyzer:
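#!/bin/bash
# Simple script to run ssdeep on network streams (or any list of files).
# Reads one file name per line on STDIN; output should be piped to a log
# file or logging system (logger).

while IFS= read -r file
do
    # -m matches the file against the known signatures; -b prints bare file names
    result=$(ssdeep -m /etc/ssdeep-n.sigs -b "$file")
    if ! echo "$result" | grep -q matches
    then
        # No match: purge the stream file
        rm -- "$file"
    else
        # Match: archive the stream and emit the alert (one line per matching
        # signature), stripping the signature file path from the text
        mv -- "$file" /var/lib/ssdeep-n/hits/
        echo "$result" | sed 's/ \/etc\/ssdeep-n\.sigs:/ /g'
    fi
done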
You can download it here.

While contrived and not the most efficient solution, this is sufficiently generalized to be representative of what could be done with basically any Unix command, including those that don’t support multiple files per invocation or situations where you need to capture/parse the output of the command. We execute ssdeep on the stream file provided by vortex and capture the output. We check the captured output for what we find interesting. If we don’t detect a match, we purge the stream. If we do detect a match, we archive the stream file to /var/lib/ssdeep-n/hits/ and output an alert, massaging the alert text a small amount.
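Since the same loop works for other tools, the pattern can be factored out. The following is a minimal sketch, not part of vortex or of the script above: the function name analyze_streams and its argument order are made up for illustration. It reads stream file names on STDIN, runs an arbitrary command on each one, archives the file and prints the command's output when that output contains a pattern, and purges the file otherwise.

```shell
#!/bin/bash
# Hypothetical generalization of ssdeep-n.sh (names are illustrative only).
# Usage: analyze_streams HITS_DIR PATTERN COMMAND [ARGS...]
analyze_streams() {
    local hits_dir=$1 pattern=$2
    shift 2
    local file result
    while IFS= read -r file; do
        # Run the tool on one stream file; keep it away from our STDIN,
        # which carries the list of file names
        result=$("$@" "$file" </dev/null 2>/dev/null)
        if printf '%s\n' "$result" | grep -q -- "$pattern"; then
            # Interesting output: archive the stream and emit the alert text
            mv -- "$file" "$hits_dir"/
            printf '%s\n' "$result"
        else
            # Nothing of interest: purge the stream
            rm -f -- "$file"
        fi
    done
}
```

With this, the ssdeep analyzer would amount to `analyze_streams /var/lib/ssdeep-n/hits matches ssdeep -m /etc/ssdeep-n.sigs -b`, minus the sed cleanup of the alert text.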

For a data set, the defcon17 CTF packet captures will be used. I downloaded the packet captures and used mergecap to combine them back into one pcap with the following properties:
$ capinfos ctf_dc17.pcap
File name: ctf_dc17.pcap
File type: Wireshark/tcpdump/... - libpcap
File encapsulation: Ethernet
Number of packets: 38994342
File size: 7780760337 bytes
Data size: 7156850841 bytes
Capture duration: 185602.101865 seconds
Start time: Fri Jul 31 13:26:38 2009
End time: Sun Aug 2 17:00:00 2009
Data rate: 38560.18 bytes/s
Data rate: 308481.46 bits/s
Average packet size: 183.54 bytes
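As a quick sanity check on the merged file, the derived figures can be recomputed from the raw counters in the capinfos output. This is only a sketch: ratio is a hypothetical helper name, not a Wireshark tool, and the numbers are copied from the output above.

```shell
#!/bin/bash
# ratio NUMERATOR DENOMINATOR [SCALE]
# Prints NUMERATOR * SCALE / DENOMINATOR with two decimals (SCALE defaults to 1).
ratio() {
    awk -v n="$1" -v d="$2" -v s="${3:-1}" 'BEGIN { printf "%.2f\n", n * s / d }'
}

ratio 7156850841 185602.101865      # data size / duration     = bytes per second
ratio 7156850841 185602.101865 8    # same, times 8            = bits per second
ratio 7156850841 38994342           # data size / packet count = average packet size
```

The three results should agree with the data rate and average packet size lines that capinfos reported.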
Closely related to the data set is the signature set we’ll be using. You can download it from here. The signature file contains ssdeep hashes for an assortment of attack data, some of which will match against the defcon 17 data set. Fearing to depart too much from the standards set by the security industry, the signature names are painfully useless :)

Now we’re ready to actually get our near real time IDS to run. Based on the knowledge from some of the previous articles in this series, the following is a good starting point:
$ vortex -r ctf_dc17.pcap -e -t /dev/shm/ssdeep-n \
-S 1000000000 -C 1000000000 |./ssdeep-n.sh
One of the most important vortex options, at least for those of us who care about security, is the -u option. Live captures usually require root privileges to open the capture device, but we’d like to not run as root any longer than necessary. The -u option tells vortex to drop privileges to a non-root user after opening the capture device/file. Changing the command so it is executed as root but quickly drops to the user nobody, which has limited permissions, yields the following:
# vortex -r ctf_dc17.pcap -u nobody -e -t /dev/shm/ssdeep-n \
-S 1000000000 -C 1000000000 | su nobody -c './ssdeep-n.sh'
While we aren’t reading from a live interface, we very easily could be. We’re using su so the analyzer runs with the non-root account also.
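If you want the analyzer itself to enforce this, a small guard at the top of the script can refuse to run as root. This is a hedged sketch rather than anything vortex provides: require_non_root is a hypothetical name, and the optional argument exists only so the check can be exercised without actually being root.

```shell
#!/bin/bash
# Hypothetical guard: fail if the (effective) UID is 0.
# By default the UID comes from `id -u`; an explicit UID may be passed instead.
require_non_root() {
    local uid=${1:-$(id -u)}
    if [ "$uid" -eq 0 ]; then
        echo "refusing to run as root; start the analyzer via su" >&2
        return 1
    fi
}
```

Calling `require_non_root || exit 1` before the read loop would stop the analyzer if someone forgets the su in the pipeline.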

Libnids, on which vortex is built, has some statically sized hash tables. In general we want these hash tables to be large enough that they are never filled, but not too much larger than necessary as they consume a fair amount of memory. One of these hash tables is the main connection hash table. Each active connection which vortex is capturing requires an entry in this hash table. When this hash table fills up, vortex ignores additional connections until active connections are closed. The default value of 1M is pretty good, but for demonstrative purposes, we’re going to set this to 2M by using -s 2097152. You will know you need to increase this if you ever have errors of the category “TCP_LIMIT”. Similarly, libnids has a static hash table for IP Frag which can be set with -H. We’ll leave this at the default, but if you have a network where IP frag is actually used routinely, you may want to increase this.
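One way to watch for this is to scan the logged error statistics for a non-zero TCP_LIMIT counter. The sketch below assumes the VORTEX_ERRORS lines look like the sample output later in this post ("NAME: value" pairs separated by spaces); max_tcp_limit is a hypothetical helper name.

```shell
#!/bin/bash
# Print the largest TCP_LIMIT value seen in the log text on STDIN (0 if none).
# Assumes counters are logged as "TCP_LIMIT: <number>".
max_tcp_limit() {
    awk 'BEGIN { max = 0 }
         {
             for (i = 1; i < NF; i++)
                 if ($i == "TCP_LIMIT:" && $(i + 1) + 0 > max)
                     max = $(i + 1) + 0
         }
         END { print max }'
}
```

Feeding the relevant syslog file through `max_tcp_limit` and getting anything other than 0 would suggest the connection hash table (-s) is too small.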

Vortex doesn’t provide the data to the external analyzer until all the requested data from the stream has been gathered or until the connection has successfully closed. For various reasons, vortex can’t always detect when communication has terminated. To prevent connections from being followed indefinitely, even after the connections have been abandoned by one or both ends, the -K option provides a timeout. Note, however, that this timeout is only reset when data is transferred through the connection, not when other possibly valid TCP traffic, such as keepalives, ACKs, etc., is observed. Vortex has an especially hard time detecting the end of many of the connections in the defcon data set we are using, so we definitely need to set this option. In practice, the -K option also helps guard against benign or malicious resource exhaustion. Common settings range from 1s to 3600s. We’ll set this to 600s with -K 600.

Adding the hash table size options and timeout yields:
# vortex -r ctf_dc17.pcap -u nobody -s 2097152 -K 600 \
-e -t /dev/shm/ssdeep-n -S 1000000000 -C 1000000000 \
| su nobody -c './ssdeep-n.sh'
Another important aspect of running vortex for long periods of time, as you would do with a near real-time IDS, is logging of health/status. By default vortex dumps error and performance stats at program termination, but vortex can be configured to dump this data periodically. The -E and -T options set the reporting interval for error and performance statistics, which are output to syslog and STDERR. We’ll use 3600 for each so we get stats back every hour. The -L option sets the syslog tag so that different instances of vortex can be differentiated from each other. We’ll use -L ssdeep-n.

One subtle item of note here is that while basically all aspects of vortex timings are based on the time loaded from the packet captures, either live or dead, the periods for error and performance stats logging are implemented in system time (not pcap time). In this example, we’ll see the multi-day packet capture processed in a couple of hours. The times from the packet captures, including the -K idle timeout, will be based on pcap time, while the error and stats messages will be based on local system time.

Adding logging yields the following:
# vortex -r ctf_dc17.pcap -u nobody -s 2097152 -K 600 \
-e -t /dev/shm/ssdeep-n -E 3600 -T 3600 -L ssdeep-n \
-S 1000000000 -C 1000000000 | su nobody -c './ssdeep-n.sh \
| logger -s -p local0.info -t ssdeep-n'
We’re taking the output of ssdeep-n and feeding it to logger such that logs are echoed back to the terminal via STDERR (the -s option) and sent to the system log.

So now we’re ready to actually run our near real time IDS.

The results look something like the following:
Apr 24 15:23:18 localhost ssdeep-n: VORTEX_STATS PCAP_RECV: 0
PCAP_DROP: 0 VTX_BYTES: 0 VTX_EST: 0 VTX_WAIT: 0
VTX_CLOSE_TOT: 0 VTX_CLOSE: 0 VTX_LIMIT: 0 VTX_POLL: 0
VTX_TIMOUT: 0 VTX_IDLE: 0 VTX_RST: 0 VTX_EXIT: 0 VTX_BSF: 0
Apr 24 15:23:18 localhost ssdeep-n: VORTEX_ERRORS TOTAL: 0
IP_SIZE: 0 IP_FRAG: 0 IP_HDR: 0 IP_SRCRT: 0 TCP_LIMIT: 0
TCP_HDR: 0 TCP_QUE: 0 TCP_FLAGS: 0 UDP_ALL: 0 SCAN_ALL: 0
VTX_RING: 0 OTHER: 0
Apr 24 15:27:10 localhost ssdeep-n: tcp-30216-1249077951
-1249079141-i-4425-10.31.8.30:53668c10.31.6.2:1787
matches Command DGB (75)
Apr 24 15:28:20 localhost ssdeep-n: tcp-56998-1249080094
-1249080224-i-155949-10.31.8.30:56248s10.31.5.2:1787
matches Response DGB (93)
Apr 24 15:28:20 localhost ssdeep-n: tcp-56998-1249080094
-1249080224-i-155949-10.31.8.30:56248c10.31.5.2:1787
matches Response DGB (93)
Apr 24 15:28:54 localhost ssdeep-n: tcp-62766-1249080436
-1249080721-i-156483-10.31.8.30:36129s10.31.6.2:1787
matches Response DGB (66)
Apr 24 15:34:30 localhost ssdeep-n: tcp-112145-1249083434
-1249083605-i-80684-10.31.8.30:36729s10.31.1.2:1787
matches Response DGB (94)
Apr 24 15:36:25 localhost ssdeep-n: tcp-129781-1249084423
-1249084581-i-80510-10.31.8.30:41222s10.31.10.2:1787
matches Response DGB (94)
Apr 24 16:23:18 localhost ssdeep-n: VORTEX_STATS PCAP_RECV: 0
PCAP_DROP: 0 VTX_BYTES: 374632450 VTX_EST: 370486 VTX_WAIT: 9999
VTX_CLOSE_TOT: 366266 VTX_CLOSE: 0 VTX_LIMIT: 0 VTX_POLL: 0
VTX_TIMOUT: 0 VTX_IDLE: 186789 VTX_RST: 179477 VTX_EXIT: 0 VTX_BSF: 0
Apr 24 16:23:18 localhost ssdeep-n: VORTEX_ERRORS TOTAL: 484
IP_SIZE: 0 IP_FRAG: 0 IP_HDR: 0 IP_SRCRT: 2 TCP_LIMIT: 0 TCP_HDR: 12
TCP_QUE: 470 TCP_FLAGS: 0 UDP_ALL: 0 SCAN_ALL: 0 VTX_RING: 0 OTHER: 0
Apr 24 16:33:29 localhost ssdeep-n: tcp-394718-1249150608
-1249150608-r-2056-10.31.5.5:47377s10.31.3.2:4343
matches Attack ABC (97)
Apr 24 16:33:30 localhost ssdeep-n: tcp-394734-1249150609
-1249150610-r-2568-10.31.5.5:32478s10.31.4.2:4343
matches Attack ABC (97)
...
Apr 24 16:49:00 localhost ssdeep-n: tcp-431134-1249152504
-1249152504-i-2056-10.31.5.5:57596s10.31.2.2:4343
matches Attack ABC (97)
Apr 24 17:23:18 localhost ssdeep-n: VORTEX_STATS PCAP_RECV: 0
PCAP_DROP: 0 VTX_BYTES: 642622346 VTX_EST: 532289 VTX_WAIT: 9999
VTX_CLOSE_TOT: 525121 VTX_CLOSE: 0 VTX_LIMIT: 0 VTX_POLL: 0
VTX_TIMOUT: 0 VTX_IDLE: 269566 VTX_RST: 255555 VTX_EXIT: 0 VTX_BSF: 0
Apr 24 17:23:18 localhost ssdeep-n: VORTEX_ERRORS TOTAL: 713
IP_SIZE: 0 IP_FRAG: 0 IP_HDR: 0 IP_SRCRT: 2 TCP_LIMIT: 0 TCP_HDR: 30
TCP_QUE: 681 TCP_FLAGS: 0 UDP_ALL: 0 SCAN_ALL: 0 VTX_RING: 0 OTHER: 0
...
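The stream file names in these alerts pack several fields into one string. The sketch below splits a name into its parts with a bash regular expression. The field labels are my reading of the alerts above (protocol, connection serial, start and end time as epoch seconds in pcap time, close state, size, client, direction, server) and should be treated as assumptions rather than as the documented format; parse_stream_name is a hypothetical helper.

```shell
#!/bin/bash
# Split a vortex stream file name into labeled fields.
# Labels are inferred from the sample alerts, not taken from vortex documentation.
parse_stream_name() {
    local re='^([a-z]+)-([0-9]+)-([0-9]+)-([0-9]+)-([a-z])-([0-9]+)-([0-9.]+:[0-9]+)([cs])([0-9.]+:[0-9]+)$'
    local name=${1##*/}    # drop any leading directory
    [[ $name =~ $re ]] || return 1
    # BASH_REMATCH[1..9] hold the nine captured fields, in order
    printf 'proto=%s serial=%s start=%s end=%s state=%s size=%s client=%s dir=%s server=%s\n' \
        "${BASH_REMATCH[@]:1}"
}

parse_stream_name /var/lib/ssdeep-n/hits/tcp-431134-1249152504-1249152504-i-2056-10.31.5.5:57596s10.31.2.2:4343
```

The epoch values fall inside the capture window (1249152504 is 18:48:24 UTC on Aug 1, 2009), while the syslog timestamps on the same alerts are from April 24: pcap time in the file name, system time in the log.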
A few of the signatures matched, with varying degrees of similarity. Since we’ve archived the matches, we can go examine them. For example, let’s look at one of the very popular “Attack ABC” hits:
[csmutz@master ~]$ hexdump -v /var/lib/ssdeep-n/hits\
/tcp-431134-1249152504-1249152504-i-2056-10.31.5.5:57596s\
10.31.2.2:4343 | head
0000000 9090 9090 9090 9090 9090 9090 9090 9090
0000010 9090 9090 9090 9090 9090 9090 9090 9090
0000020 7dbf b830 3110 66c9 f0b9 db01 d9d9 2474
0000030 58f4 7831 8310 04c0 7803 9f0c edc5 ba99
0000040 5975 c5cc b196 c5e6 fd66 1d82 fe98 9d72
0000050 0165 5a8d d5e0 9b73 be14 1aee 86eb 0c74
0000060 f715 c888 718e d758 4eca 2858 2b2b b48a
0000070 73a1 f051 83b4 2fa5 1321 e837 0734 0939
0000080 d8c9 f5c6 1e36 1d43 5fc8 f2b3 c55e cb35
0000090 e124 2c38 99d9 bcad 9949 a0ab da68 454b
I don’t know much about what is supposed to be going on here, but I do know that starting off with a NOP sled is, in computer etiquette, not the nicest way to start a conversation. While contrived, we’ve “detected” an attack. We could look more, but I think that’s a sufficient discussion of our results.
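To put a number on that observation, one could count how many 0x90 bytes open a stream file. This is a rough sketch using od, tr, and awk; leading_nops is a hypothetical helper, and a long run of 0x90 is only a hint of a NOP sled, not proof of an attack.

```shell
#!/bin/bash
# Print the number of consecutive 0x90 bytes at the start of a file.
leading_nops() {
    # od emits one hex byte per token; tr puts each token on its own line;
    # awk counts tokens until the first one that is not 90
    od -An -v -tx1 < "$1" | tr -s ' ' '\n' |
        awk 'NF { if ($1 != "90") exit; n++ } END { print n + 0 }'
}
```

Judging by the hexdump above, running `leading_nops` on that archived hit should report 32.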

We’ve demonstrated how to use vortex to build a near real-time IDS. While ssdeep is probably not something you’d ever want to run on bare network streams, we’ve shown how easy it is to take basically any Unix command that operates on files, including computationally expensive ones, and apply the same functionality to network streams in near real time. While we used a program written in C with a straightforward API, we could just as easily have used a perl/python/ruby script, a java program, or even a VB script written for windows which runs via mono or wine. No re-implementation is required to take a valuable detection mechanism and run it on network traffic in near real time. I think one of the most valuable things vortex could be used for is the type of decoding and/or data extraction that just isn’t possible with mainstream IDS. For example, assuming the signature matching capabilities of Snort aren’t good enough for you, what about extracting MS documents from network traffic and running officecat on them? Similarly, if you like Bro-style transaction logs for network protocols, why not extract metadata from PDFs traversing the network with pdftk or one of Didier Stevens’ PDF tools?

While we’ve run our near real-time IDS from a dead capture file, it could just as easily be done from live capture. Vortex includes some example init scripts that could be used to run vortex in a daemon mode, such as you would need to do for a network sensor. Vortex facilitates the creation of agile and flexible near real-time detection mechanisms.

As we’ll show in the next installment of this series, vortex removes the real-time constraints inherent in network packet capture from our content analysis. Vortex also can be used to take detection mechanisms as we’ve implemented here and scale them across highly parallel systems.
