Saturday, March 26, 2011

Passive Network Monitoring of Strong Authentication

There’s been a fair amount of consternation and FUD concerning the effectiveness of “strong authentication” in defending against APT. For example, in their M-Trends 2011 report, Mandiant demonstrated how smart cards are being subverted. If that isn’t bad enough, RSA recently revealed that they’ve been the victim of attacks they attribute to APT, attacks which resulted in the attackers getting access to information that may weaken the effectiveness of SecurID.

Unfortunately, like most people blogging about these issues, I can’t provide any more authoritative information on the topic other than to say that based on my personal experience, targeting and subverting strong authentication mechanisms is a common practice for some targeted, persistent attackers. It’s hard to predict the impact of any of these weaknesses. Additionally, people who have found out the hard way usually aren’t particularly open about sharing their hard knocks.

Nevertheless, I’d like to make the case for passive network monitoring as a method for helping to audit authentication, especially strong authentication mechanisms. While auditing is more properly conducted using logs provided by the devices that actually perform authentication (and authorization, access control, etc. if you want to be pedantic), there are real operational and organizational issues that may well make passive network monitoring one of the most effective means of gathering the information necessary to audit strong authentication.

The vast majority of password-based authentication mechanisms bundle the username with the password and provide both to the server either in the clear or encrypted. It is possible to provide the username in the clear and the password encrypted, which would improve monitoring capabilities at the possible expense of privacy. In general, this bundling of credentials is done because confidentiality is provided through mechanisms that operate at a different layer of the stack, e.g. a username and password sent through an SSL tunnel.

On the other hand, many authentication mechanisms provide the username/user identifier in the clear. For these protocols, passive network monitoring makes it possible to collect the information necessary to perform some amount of auditing of user activity. In this post I present two quick and dirty examples of how this information could be collected. For simplicity’s and brevity’s sake, I’ll focus solely on collecting usernames. I’ve chosen two protocols that are very frequently used in conjunction with strong authentication mechanisms: RADIUS and SSL/TLS client certificate authentication.

RADIUS


RADIUS isn’t exactly the most secure authentication protocol in the world. Since it has some serious weaknesses, it’s normally not used over hostile networks (like the internet). However, it is frequently used inside organizations. In fact, it is very frequently used in conjunction with strong credentials such as RSA SecurID. One nice thing about RADIUS is that the username is passed in the clear in authentication requests. As such, it’s pretty simple to build a monitoring tool to expose this data to auditing.

In my example of monitoring RADIUS, I’ll use this packet capture taken from the testing data sets for libtrace.

In my experience tcpdump is very useful for monitoring and parsing older and simpler protocols, especially ones that usually don’t span multiple packets, like DNS or RADIUS. The following shows how tcpdump parses one RADIUS authentication request:



/usr/sbin/tcpdump -nn -r radius.pcap -s 0 -v "dst port 1812" -c 1
reading from file radius.pcap, link-type EN10MB (Ethernet)
18:42:58.228064 IP (tos 0x0, ttl 64, id 47223, offset 0, flags [DF], proto: UDP (17), length: 179) 10.1.12.20.1034 > 192.107.171.165.1812: RADIUS, length: 151
Access Request (1), id: 0x2e, Authenticator: 36ea5ffd15130961caafc039b5909d34
Username Attribute (1), length: 6, Value: test
NAS IP Address Attribute (4), length: 6, Value: 10.1.12.20
NAS Port Attribute (5), length: 6, Value: 0
Called Station Attribute (30), length: 31, Value: 00-02-6F-21-EC-52:CRCnet-test
Calling Station Attribute (31), length: 19, Value: 00-02-6F-21-EC-5F
Framed MTU Attribute (12), length: 6, Value: 1400
NAS Port Type Attribute (61), length: 6, Value: Wireless - IEEE 802.11
Connect Info Attribute (77), length: 22, Value: CONNECT 0Mbps 802.11
EAP Message Attribute (79), length: 11, Value: .
Message Authentication Attribute (80), length: 18, Value: ...eE.*.B.._..).


Note that we intentionally haven’t turned the verbosity up all the way. While there’s a lot of other good info in there, let’s say we only want to extract the UDP quad and the username and then send them to our SIMS so we can audit them. Assuming syslog is configured to send logs somewhere they can be audited appropriately, the following demonstrates how to do so:



tcpdump -nn -r radius.pcap -s 0 -v "dst port 1812" | awk '{ if ( $1 ~ "^[0-9][0-9]:" ) { print SRC" "DST" "USER; SRC=$18; DST=$20; USER="" }; if ( $0 ~ " Username Attribute" ) { USER=$NF } }' | logger -t radius_request


This example generates syslogs that appear as follows:



Mar 26 14:45:15 monitor radius_request: 10.1.12.20.1034 192.107.171.165.1812: test
Mar 26 14:45:15 monitor radius_request: 10.1.12.20.1034 192.107.171.165.1812: test
Mar 26 14:45:15 monitor radius_request: 10.1.12.20.1034 192.107.171.165.1812: test


I’ve done no significant validation to ensure that it’s complete, but this very well could be used on a large corporate network as is. Obviously, you’d need to replace the -r pcapfile with the appropriate -i interface.
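
For live use, the same pipeline can simply be pointed at an interface instead of a capture file. Here’s a minimal sketch of the live variant; the interface name eth0 is just an assumption for illustration, and the -l flag keeps tcpdump’s output line-buffered so records reach syslog promptly:


#live variant of the RADIUS username logger (eth0 is an assumed monitoring interface)
tcpdump -nn -l -i eth0 -s 0 -v "dst port 1812" | awk '{ if ( $1 ~ "^[0-9][0-9]:" ) { print SRC" "DST" "USER; SRC=$18; DST=$20; USER="" }; if ( $0 ~ " Username Attribute" ) { USER=$NF } }' | logger -t radius_request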

SSL/TLS Client Certificate


Another opportunity for simple passive monitoring is SSL/TLS when a client certificate is used. It is very common for this mechanism to be used to authenticate users to web sites with either soft or hard (i.e. smart card) certificates. This mechanism relies on PKI, which involves the use of a public and private key. While private keys should never be transferred over the network, and in many cases never leave smart cards, public keys are openly shared. In the case of SSL/TLS client certificate based authentication, the public key, along with other information such as the client user identification, is passed in the clear during authentication as the client certificate.

To have data for this example, I generated my own. I took the following steps, based on the Wireshark SSL wiki:



openssl req -new -x509 -out server.pem -nodes -keyout privkey.pem -subj /CN=localhost/O=pwned/C=US
openssl req -new -x509 -nodes -out client.pem -keyout client.key -subj /CN=Foobar/O=pwned/C=US

openssl s_server -ssl3 -cipher AES256-SHA -accept 4443 -www -CAfile client.pem -verify 1 -key privkey.pem

#start another shell
tcpdump -i lo -s 0 -w ssl_client.pcap "tcp port 4443"

#start another shell
(echo GET / HTTP/1.0; echo ; sleep 1) | openssl s_client -connect localhost:4443 -ssl3 -cert client.pem -key client.key

#kill tcpdump and server

#fix pcap by converting back to 443 and fixing checksums (offload problem)
tcprewrite --fixcsum --portmap=4443:443 --infile=ssl_client.pcap --outfile=ssl_client_443.pcap


You can download the resulting pcap here.

The client certificate appears as follows:



$ openssl x509 -in client.pem -noout -text
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            b0:cc:6b:94:b4:83:0f:78
        Signature Algorithm: sha1WithRSAEncryption
        Issuer: CN=Foobar, O=pwned, C=US
        Validity
            Not Before: Mar 26 13:13:12 2011 GMT
            Not After : Apr 25 13:13:12 2011 GMT
        Subject: CN=Foobar, O=pwned, C=US
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
            RSA Public Key: (1024 bit)
                Modulus (1024 bit):
                    00:e5:d6:78:cd:95:4e:89:0c:88:bd:78:98:26:86:
                    0b:f1:be:df:85:98:a2:93:c1:66:65:44:d2:aa:08:
                    69:2d:4c:a9:9d:50:08:79:1d:58:6e:6d:b4:2b:24:
                    ca:37:90:d6:91:9f:6d:73:5f:51:5a:10:af:f0:ce:
                    85:85:d6:e4:42:7b:ca:b0:af:0c:52:8b:60:1c:5b:
                    3f:54:10:cc:c4:35:18:a8:a6:a7:c8:ae:df:b7:ab:
                    a9:d9:20:cf:f7:5c:43:01:2e:12:cf:96:45:87:e7:
                    7e:87:f7:5e:8f:25:23:1b:ee:bd:0a:79:48:07:99:
                    ba:cc:68:16:53:43:56:e9:a1
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Subject Key Identifier:
                BD:C2:84:BF:76:17:B7:15:BC:2F:8C:7E:A6:E6:18:B1:47:60:A3:B6
            X509v3 Authority Key Identifier:
                keyid:BD:C2:84:BF:76:17:B7:15:BC:2F:8C:7E:A6:E6:18:B1:47:60:A3:B6
                DirName:/CN=Foobar/O=pwned/C=US
                serial:B0:CC:6B:94:B4:83:0F:78

            X509v3 Basic Constraints:
                CA:TRUE
    Signature Algorithm: sha1WithRSAEncryption
        4c:28:ea:47:20:38:d5:17:dd:cf:aa:f8:13:3e:d0:5f:cf:05:
        7d:c7:a1:c3:f4:3e:d7:db:56:f7:d4:d6:d6:c6:f4:5c:47:5b:
        99:f6:9c:23:2d:dc:75:ab:51:8b:96:df:26:3b:9e:59:8f:2c:
        08:d1:84:bf:4f:98:65:b4:0f:b7:32:9d:2f:eb:d9:a5:a6:69:
        b6:75:ce:03:f4:ad:3b:f2:e6:3a:a1:ff:44:ea:8a:98:40:34:
        cc:dd:e0:d8:35:0e:8b:97:20:30:e4:7b:07:52:98:63:11:32:
        5e:6e:cb:c7:f1:10:67:1c:cd:e2:03:3a:99:98:8b:2f:f8:94:
        03:6f


For auditing, we are interested in extracting the CN, which in this case is “Foobar”. As the client certificate is transferred over the network, the CN appears as follows:



000002e0 00 3f 0d 00 00 37 02 01 02 00 32 00 30 30 2e 31 |.?...7....2.00.1|
000002f0 0f 30 0d 06 03 55 04 03 13 06 46 6f 6f 62 61 72 |.0...U....Foobar|
00000300 31 0e 30 0c 06 03 55 04 0a 13 05 70 77 6e 65 64 |1.0...U....pwned|
00000310 31 0b 30 09 06 03 55 04 06 13 02 55 53 0e 00 00 |1.0...U....US...|


Immediately preceding the string “Foobar” is the following sequence (in hex):



06 03 55 04 03 13 06


The “06 03” is the ASN.1 tag for an OBJECT IDENTIFIER (0x06) followed by its length (3 bytes), so it should precede the CN OID in any certificate. The “55 04 03” is the encoded OID 2.5.4.3 (id-at-commonName), which indicates that the data that follows is a CN. The “13” specifies the data type (here PrintableString; it can vary among a few common string types) and the “06” indicates the length of the data (6 ASCII characters). Using this knowledge of SSL certificates we can create a tool to extract and log all CNs as follows:



$ mkdir /dev/shm/ssl_client_streams
$ cd /dev/shm/ssl_client_streams/
$ vortex -r ssl_client_443.pcap -S 0 -C 10240 -g "svr port 443" | xargs -t -I+ pcregrep -o -H "\x06\x03\x55\x04\x03..[A-Za-z0-9]{1,100}" + | sed -r "s/\x06\x03\x55\x04\x03../ /" | sed 's/c/ /' | logger -t client_cert



This generates logs as follows:



Mar 26 15:26:05 sr2s4 client_cert: 127.0.0.1:41143 127.0.0.1:443: localhost1
Mar 26 15:26:05 sr2s4 client_cert: 127.0.0.1:41143 127.0.0.1:443: localhost1
Mar 26 15:26:05 sr2s4 client_cert: 127.0.0.1:41143 127.0.0.1:443: Foobar1


If you are new to vortex, check out my vortex howto series. Basically, we’re snarfing the first 10k of SSL streams transferred from the client to the server as files and then analyzing them. Note that since we’re pulling all CNs out of all the certificates in the certificate chain provided by the client, we’re getting not only “Foobar” but also “localhost”, which is the CA in this case. Also note the trailing garbage that we were too lazy to remove.
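
If you want to double-check the byte pattern used in the pcregrep expression against the certificate itself, openssl can dump the cert’s ASN.1 structure; the commonName OID should show up immediately before its string value. A quick sketch using the client.pem generated earlier:


#dump the ASN.1 structure of the client cert; the commonName OID line is followed by its string value
openssl asn1parse -in client.pem | grep -A 1 -i commonname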

While this pipeline works, it’s a little too dirty even for me. The biggest problem is that the streams snarfed by vortex are never purged. Second, we’re doing a lot of inefficient work on each SSL stream, even on those that don’t include client certs.

Let’s refactor this slightly. First, we’re going to immediately weed out all the streams we don’t want to look at. In this example I’m looking for client certs in general, but you could easily change the signature to be the CA for the certificates you are interested in monitoring, e.g. “Pwned Org CA”:



$ vortex -e -r ssl_client_443.pcap -S 0 -C 10240 -g "svr port 443" | xargs pcregrep -L "\x06\x03\x55\x04\x03" | xargs rm
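
If you only care about certificates issued by a particular CA, the same weeding step can key on the issuer name instead of the generic CN OID pattern. A hedged variant using the hypothetical CA name mentioned above:


$ vortex -e -r ssl_client_443.pcap -S 0 -C 10240 -g "svr port 443" | xargs pcregrep -L "Pwned Org CA" | xargs rm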


That will leave all the streams we want to inspect in the current dir. If we do something like the following in an infinite loop or a very frequent cron job, then we’ll do the logging and purging we need:



find . -cmin +1 -type f | while read file
do
pcregrep -o -H "\x06\x03\x55\x04\x03..[A-Za-z0-9]{1,100}" "$file" | sed -r "s/\x06\x03\x55\x04\x03../ /" | sed 's/c/ /' | logger -t client_cert
rm "$file"
done
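
To run this continuously instead of from cron, a trivial wrapper loop is enough. A rough sketch, assuming the find/pcregrep loop above has been saved as a script (log_client_certs.sh is just a hypothetical name) in the stream directory; the 60 second sleep is arbitrary:


#run the logging/purging pass forever, pausing between passes
cd /dev/shm/ssl_client_streams/
while true
do
    ./log_client_certs.sh
    sleep 60
done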


This implementation is also probably suitable for use on a large network or pretty close to it.

For these examples, it’s assumed that the logs are streamed to a log storage, aggregation, or correlation tool for real-time auditing or for historical forensics. I would not be surprised if there were flaws in the examples as presented, so use them at your own risk or perform the validation and tweaking necessary for your environment. These examples are intended to be merely that: examples that show feasibility. While I’ve discussed two specific protocols/mechanisms, there are others that lend themselves to passive network monitoring, as well as many that don’t.

In this post I’ve shown how passive network monitoring could be used to help audit the use or misuse of strong authentication mechanisms. I’ve given quick and dirty examples which are probably suitable, or close to something that would be suitable, for use on enterprise networks. Notwithstanding the weaknesses in my examples, I hope they provide ideas for what can be done to “trust, but verify” strong authentication mechanisms through data collection done on passive network sensors.

Saturday, March 19, 2011

Update on Ruminate

It’s been a couple of weeks, but I wanted to say a little bit about the Feb 26 release of Ruminate.

Who should be interested in Ruminate?


This release is close to the level of refinement and capability necessary for use in an operational environment. Ruminate will be useful for people who are willing to spend extensive effort integrating their own network monitoring tools. I suspect very few people will want to use it exactly as it is out of the box, but many of the components, or even the whole framework (with custom detections running on top), may be useful to others. Ruminate as currently constituted is not for those who want a simple install process. Ruminate doesn’t do alerting or event correlation; it is up to the user to integrate Ruminate with an external log correlation and alerting system.

The Good


I think the Ruminate architecture is very promising. It makes some things that are very hard to do in conventional NIDS look very easy. The following diagram shows the layout of the Ruminate components:

[Diagram: layout of the Ruminate components]

If you are totally new to Ruminate, I still suggest reading the technical report.

The improved HTTP parser is pretty groovy. I’m really pleased with the attempts I’m making at things like HTTP 206 defrag. I think my HTTP log format, which includes compact single character flags inspired by network flow records (e.g. argus), is pretty cute.

Since I haven’t documented it anywhere else, let me do it here. The fields in the logs (with examples) are as follows:



Jan 12 01:47:39 node1 http[26350]: tcp-198786717-1294814857-1294814859-c-33510-10.101.84.70:10977c129.174.93.161:80_http-0 1.1 GET cs.gmu.edu /~tr-admin/papers/GMU-CS-TR-2010-20.pdf 0 32768 206 1292442029 application/pdf TG ALHEk http://cs.gmu.edu/~tr-admin/papers/GMU-CS-TR-2010-20.pdf - "zh-CN,zh;q=0.8" "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.224 Safari/534.10" "Apache"

Transaction ID: tcp-198786717-1294814857-1294814859-c-33510-10.101.84.70:10977c129.174.93.161:80_http-0
Request Version: 1.1
Request Method: GET
Request Host: cs.gmu.edu
Request Resource: /~tr-admin/papers/GMU-CS-TR-2010-20.pdf
Request Payload Size: 0
Response Payload Size: 32768
Response Code: 206
Response Last-Modified (unix timestamp): 1292442029
Response Content-Type: application/pdf
Response Flags: TG
Request Flags: ALHEk
Request Referer: http://cs.gmu.edu/~tr-admin/papers/GMU-CS-TR-2010-20.pdf
Request X-Forwarded-For: -
Request Accept-Language (in quotes): "zh-CN,zh;q=0.8"
Request User-Agent (in quotes): "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.224 Safari/534.10"
Response Server (in quotes): "Apache"


The Request Flags are as follows:


C => existence of "Cookie" header
Z => existence of "Authorization" header
T => existence of "Date" header
F => existence of "From" header
A => existence of "Accept" header
L => existence of "Accept-Language" header
H => existence of "Accept-Charset" header
E => existence of "Accept-Encoding" header
k => "keep-alive" value in Connection
c => "close" value in Connection
o => other value in Connection
V => existence of "Via" header


The Response Flags are as follows:


C => existence of "Set-Cookie" header
t => existence of "Transfer-Encoding" header, presumably chunked
g => gzip content encoding
d => deflate content encoding
o => other content encoding
T => existence of "Date" header
L => existence of "Location" header
V => existence of "Via" header
G => existence of "ETag" header
P => existence of "X-Powered-By" header
i => starts with inline for Content-Disposition
a => starts with attach for Content-Disposition
f => starts with form-d for Content-Disposition
c => other Content-Disposition


While not standard in any way, this log format should be very useful for my research.
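
As a rough illustration of how this format can be consumed (this is not part of Ruminate itself), a simple awk one-liner can pull fields back out of the logs. The field numbers below assume the whitespace-delimited layout in the example above and that the logs land in /var/log/messages (both assumptions); quoted fields that contain spaces, like User-Agent, would need more careful parsing:


#print host+resource and response code for every PDF response logged by the http parser
awk '$5 ~ /^http\[/ && $15 == "application/pdf" { print $9 $10, $13 }' /var/log/messages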


The Bad


Ruminate is rough. It’s nowhere near the level of refinement of the leading NIDS. This is not likely to change in the short term.
Ruminate is based on a really old version of vortex. There are lots of reasons this isn’t optimal, but the biggest issue is performance on high-speed networks. Soon I’ll release a new version that is either based on the latest version of vortex or totally separate from, but dependent on, vortex.

Yara Everywhere


I’ve added yara to basically every layer of Ruminate. This is useful for those in operational environments because many people are used to yara and have existing signatures written for it. Since Ruminate is very object-focused (not network-focused), yara makes a lot of sense. While applying signatures to raw streams is not what Ruminate is about, it was easy to do and may even be useful for environments struggling with the limitations of signature-matching NIDS. Lastly, the use of yara, with its extensive meta-signature rule definitions, helps fill a gap in Ruminate that can’t reasonably be filled by an external event correlation engine.

Ruminate or Razorback (or both)


I’ve been asked, and it’s a good question, how Ruminate and Razorback compare. Before I express my candid opinions, I want to say that I’m very pleased with what the VRT guys are doing with Razorback. While there is some overlap between what I’m doing and what they’re doing (at least in high-level goals), there’s more than enough room for multiple people innovating in the network payload object analysis space. If nothing else, the more people in the space, the more legitimate the problem of analyzing client payload objects (files) becomes. It seems unfathomable to me, but there are many who still question the value of using NIDS for detecting attacks against client applications (Adobe, IE) versus traditional server exploits (IIS, WuFTP), or detecting today’s reconnaissance (Google search) versus old-school reconnaissance (port scan).

To date, Ruminate’s unique contributions are very much focused on scalable payload object collection, decoding, and reconstruction. Notable features include dynamic and highly scalable load balancing of network streams, full protocol decoding for HTTP and SMTP/MIME, and object fragment reassembly (e.g. HTTP 206 defrag). If you want to comprehensively analyze payloads transferred through a large network, Ruminate is the best openly available tool for exposing the objects to analysis. The actual object analysis is pretty loose in Ruminate today, but it is definitely simple and scalable. Ruminate’s biggest shortcoming is its rough implementation and relatively low level of refinement. This isn’t a problem for academia and other research, but it is a barrier to widespread adoption.

Razorback is largely tackling the other end of the problem: what to do once you get the objects off the network (or host, or other source for that matter). Razorback has a robust and well-defined framework for client object analysis. While definitely in an early beta state, Razorback is a whole lot more refined and “cleaner” than Ruminate. Razorback has a centrally controlled object distribution model, which has obvious advantages and disadvantages over what Ruminate is doing. Razorback’s limitations in network payload object extraction are inherited largely from its reliance on the Snort 2.0 framework, which, to be fair, was never designed for this sort of analysis.

While I’ve never actually done it, if a brave soul wanted to combine the best of both Ruminate and Razorback, it would be possible to use Ruminate to extract objects off the network and use Razorback to analyze the objects. Using the parlance of both respectively, one could modify Ruminate’s object multiplexer (object_mux) to be a collector for Razorback. The point I'm trying to make is that the innovations found in Ruminate and Razorback may be more complementary than competing.

Take what you want (or leave it)


I’m sharing what I’ve implemented in hopes that it helps advance both academic research and the solutions used in industry. Please take Ruminate as a whole, some of its components, or simply the ideas or paradigm, and run with them. I’m always interested in hearing feedback on Ruminate or the ideas it advances. I’m also open to working with others on research using, or continued development of, Ruminate.