Saturday, January 22, 2011

Shameless plug for Colleagues' DC3 Presentations

Is it shameful to engage in cronyism, if you disclose it up front? I hope not.

While I’m not going to be attending the DoD Cyber Crime Conference this year, I’d like to draw attention to some of my colleagues who will be. Since I’m not attending, I haven’t looked at who else is speaking.

Sam Wenck, who co-presented with me last year and works side by side with me daily, is presenting on Threat Intelligence Knowledge Management for Incident Response. In essence, he’ll be speaking on how to implement the technology necessary to support intelligence-driven CND. If you are interested in improving your organization’s ability to record, maintain, and leverage threat intelligence, you should attend.

Kieth Gould will be presenting under the title “When did it happen? Are you sure about that?” I believe the original title of this preso was “How to score a date with your PC” (which Kieth routinely does). Frankly, I’m just not deep enough into host-based forensics to fully appreciate the subject matter. Kieth has a reputation for his aptitude for and thorough attention to esoteric technical detail. This presentation might break the Geek Meter scale.

Having had previews of the content, I expect both these presentations to contain an abundance of pragmatic technical content and be free from annoying marketing rhetoric.

I also believe Mike Cloppert is going to be on a panel (not sure which one), but he doesn’t need any help drawing crowds.

Thursday, January 13, 2011

Gnawing on HTTP 206 Fragmented Payloads with Ruminate

I've been madly working on getting Ruminate to a point where I can recommend it for use in industry, hopefully by the end of January 2011. I've done a huge amount of work on HTTP decoding, including a working implementation of HTTP 206 defragmentation, which I consider a "killer feature" when dealing with payloads transferred through the network. I wanted to take a break from the documentation and code packaging that Ruminate so badly needs in order to discuss the importance of this mechanism, along with some examples. This discussion should also help clarify the areas where Ruminate is seeking to innovate.

HTTP 206 Partial Content



As NIDS begin to earnestly address true layer 7 decoding and embedded object analysis (e.g. files transferred through the network), they will run into complications like HTTP 206. I haven't heard much about HTTP 206 defrag, so I assume this isn't on most people's radar.

What is HTTP 206? It's basically HTTP's method of fragmenting payload objects. 206 is the response code, just like 200 or 404. If you want to download just part of a file, you can ask the server to give you a specific set (or sets) of bytes and compliant servers will respond with only the data you asked for via a 206 response.
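
If you want to see this in action, it’s easy to issue a Range request yourself. Here’s a minimal Python sketch (the URL is hypothetical; any server that honors Range requests will do):

import urllib.request

# Hypothetical URL; any Range-capable server will do.
req = urllib.request.Request(
    "http://example.com/papers/report.pdf",
    headers={"Range": "bytes=0-32767"},  # ask for just the first 32KB
)
with urllib.request.urlopen(req) as resp:
    print(resp.status)                    # 206 if the server honored the range
    print(resp.headers["Content-Range"])  # e.g. "bytes 0-32767/555523"
    fragment = resp.read()                # only the bytes we asked for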

If you're not looking for malicious content in HTTP 206 transactions, you should be. Who really cares about HTTP 206 transactions if they represent a very small fraction of total HTTP transactions on a network? One oft-overlooked detail is that HTTP 206 is actually used to transfer a significant share (often up to 20%) of the most interesting payloads, such as PDF documents or PE executables. Even though HTTP 206 is often used naively by unwitting clients, it transfers malicious content just as well as benign content, making life harder for your NIDS in the process.

Layer 7 and Embedded Object Defrag


One of Ruminate's goals is to address layer 7 and payload object analysis with the same level of vigor that current NIDS apply to layers 3 and 4. Part of this analysis necessarily involves layer 7 and payload object defrag/reassembly, just as layer 3 and layer 4 defrag/reassembly have been big topics for current-generation NIDS. HTTP 206 is a perfect example of layer 7 fragmentation that is loosely analogous to ipfrag, etc. What is an example of client application object fragmentation? Imagine you have malicious javascript and you want to evade NIDS that are smart enough to decode basic javascript obfuscation like hex armoring. One option is to split your javascript across multiple files (which all get included at run time), possibly across multiple servers/domains.
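
To make the evasion concrete, here’s a toy Python illustration (my own example, not Ruminate code): a signature that spans a fragment boundary is invisible to per-fragment scanning but obvious after reassembly.

import re

# Hypothetical signature for hex-armored javascript.
SIG = re.compile(rb"eval\(unescape\(")

frag1 = b"var a = 'x'; eval(une"      # first file/fragment
frag2 = b"scape('%61%6c%65%72%74'))"  # second file/fragment

print(any(SIG.search(f) for f in (frag1, frag2)))  # False: match straddles the split
print(bool(SIG.search(frag1 + frag2)))             # True: visible after reassembly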

The next release of Ruminate will include thousands of lines of new and improved HTTP parsing code, including a new 206defrag service. When an individual HTTP parser node comes across an HTTP 206 response, it feeds the fragmented payload to the 206defrag service, which does the defragmentation. When the 206defrag service has all the pieces of the file, the reassembled payload is passed through the object multiplexer to the appropriate analysis service(s), e.g. PDF.
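
For intuition, here’s a minimal Python sketch of the 206defrag idea. This is an assumption-laden toy, not Ruminate’s actual code; the key fields (client IP, host, URI, total length) are borrowed from what’s visible in the 206defrag input logs shown below.

class Defrag206:
    def __init__(self):
        self.frags = {}  # (client_ip, host, uri, total_len) -> [(start, bytes)]

    def add(self, key, total_len, start, data):
        self.frags.setdefault(key, []).append((start, data))
        spans = sorted(self.frags[key])
        pos = 0
        for s, d in spans:       # is coverage contiguous from byte 0?
            if s > pos:
                return None      # gap remains; hold the fragments
            pos = max(pos, s + len(d))
        if pos < total_len:
            return None          # tail still missing; keep waiting
        buf = bytearray(total_len)
        for s, d in spans:       # on overlap, the later-sorted fragment wins here
            buf[s:s + len(d)] = d
        del self.frags[key]
        return bytes(buf)        # ready for the object multiplexer

A real implementation also needs timeouts, size limits, and a deliberate policy for conflicting overlaps, which is exactly where the interesting evasion questions live.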

I'm very pleased at the progress I've made to address HTTP 206. First of all, it actually works! In operation so far, I've been able to look at a lot of interesting payloads that I wouldn't have been able to otherwise.

I wanted to share some examples that demonstrate uses of HTTP 206 in the wild. The first example is very straightforward and is the type of thing you’ll see most often. The other two examples demonstrate characteristics that are less common, but still happen in the real world. None of the examples were contrived or fabricated--they were taken from real network traffic that I had no direct influence on. I will, however, use them to show what I believe to be useful functionality of Ruminate. I anonymized the client IP addresses, but other than that, the data is just as observed. Note that other than interesting examples of HTTP 206 in action, there is absolutely no malicious, sensitive, private, or otherwise interesting data in the pcaps. The 206_examples.zip download includes the pcaps of the examples and the relevant logs from Ruminate. For those stout of heart enough to actually tinker with Ruminate in its current state, I’ve also included the new HTTP code in the download.

Example A


Example A is a canonical example of HTTP 206 fragmentation. Let’s start with the logs:

[csmutz@master 206_examples]$ cat http_a.log
Jan 12 01:47:39 node1 http[26350]: tcp-198786717-1294814857-1294814859-c-33510-10.101.84.70:10977c129.174.93.161:80_http-0 1.1 GET cs.gmu.edu /~tr-admin/papers/GMU-CS-TR-2010-20.pdf 0 32768 206 1292442029 application/pdf TG ALHEk http://cs.gmu.edu/~tr-admin/papers/GMU-CS-TR-2010-20.pdf - "zh-CN,zh;q=0.8" "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.224 Safari/534.10" "Apache"
Jan 12 01:47:39 master 206defrag: input tcp-198786717-1294814857-1294814859-c-33510-10.101.84.70:10977c129.174.93.161:80_http-0 555523 0 32768 cs.gmu.edu /~tr-admin/papers/GMU-CS-TR-2010-20.pdf 10.101.84.70
Jan 12 01:48:17 node4 http[26947]: tcp-198787353-1294814861-1294814896-c-523548-10.101.84.70:10978c129.174.93.161:80_http-0 1.1 GET cs.gmu.edu /~tr-admin/papers/GMU-CS-TR-2010-20.pdf 0 522755 206 1292442029 application/pdf TG ALHEk http://cs.gmu.edu/~tr-admin/papers/GMU-CS-TR-2010-20.pdf - "zh-CN,zh;q=0.8" "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.224 Safari/534.10" "Apache"
Jan 12 01:48:17 master 206defrag: input tcp-198787353-1294814861-1294814896-c-523548-10.101.84.70:10978c129.174.93.161:80_http-0 555523 32768 522755 cs.gmu.edu /~tr-admin/papers/GMU-CS-TR-2010-20.pdf 10.101.84.70
Jan 12 01:48:17 master 206defrag: output tcp-198786717-1294814857-1294814859-c-33510-10.101.84.70:10977c129.174.93.161:80_http-0_206defrag normal 2 555523 5a484ada9c816c0e8b6d2d3978e3f503 tcp-198786717-1294814857-1294814859-c-33510-10.101.84.70:10977c129.174.93.161:80_http-0,tcp-198787353-1294814861-1294814896-c-523548-10.101.84.70:10978c129.174.93.161:80_http-0
[csmutz@master 206_examples]$ cat object_a.log
Jan 12 01:48:17 master object_mux[11977]: tcp-198786717-1294814857-1294814859-c-33510-10.101.84.70:10977c129.174.93.161:80_http-0_206defrag 555523 5a484ada9c816c0e8b6d2d3978e3f503 pdf PDF document, version 1.4

Unfortunately I don’t have time to explain the log formats, etc. in full. Hopefully I'll document that somewhere more accessible than the code soon :). The first log line shows the first HTTP transaction, in which the client asks the server for the first 32k of the PDF and the server obliges.

Headers are as follows:

GET /~tr-admin/papers/GMU-CS-TR-2010-20.pdf HTTP/1.1
Host: cs.gmu.edu
Connection: keep-alive
Referer: http://cs.gmu.edu/~tr-admin/papers/GMU-CS-TR-2010-20.pdf
Accept: */*
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.224 Safari/534.10
Accept-Encoding: gzip,deflate,sdch
Accept-Language: zh-CN,zh;q=0.8
Accept-Charset: GBK,utf-8;q=0.7,*;q=0.3
Range: bytes=0-32767

HTTP/1.1 206 Partial Content
Date: Wed, 12 Jan 2011 06:47:37 GMT
Server: Apache
Last-Modified: Wed, 15 Dec 2010 19:40:29 GMT
ETag: "56010f-87a03-497781c080540"
Accept-Ranges: bytes
Content-Length: 32768
Content-Range: bytes 0-32767/555523
Connection: close
Content-Type: application/pdf

That’s all straightforward. The HTTP parser realizes that it doesn’t have a complete payload object, so instead of passing it to the object multiplexer it sends it to the 206defrag service. The next log line shows the 206defrag service receiving this fragment. Since it doesn’t have the whole object yet, it holds on to it.
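
That decision hinges on the Content-Range header. A minimal parse (a hypothetical helper, not Ruminate’s actual parser) looks something like the following; if the range turns out to be 0 through total minus 1, the payload is already complete and defrag isn’t needed.

import re

def parse_content_range(value):
    # "bytes 0-32767/555523" -> (0, 32767, 555523)
    m = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+)", value.strip())
    if not m:
        return None  # odd forms like "bytes */555523" not handled here
    return tuple(int(g) for g in m.groups())

print(parse_content_range("bytes 0-32767/555523"))  # (0, 32767, 555523)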

After sampling the first 32k, the client gets the rest of the PDF. Headers as follows:

GET /~tr-admin/papers/GMU-CS-TR-2010-20.pdf HTTP/1.1
Host: cs.gmu.edu
Connection: keep-alive
Referer: http://cs.gmu.edu/~tr-admin/papers/GMU-CS-TR-2010-20.pdf
Accept: */*
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US) AppleWebKit/534.10 (KHTML, like Gecko) Chrome/8.0.552.224 Safari/534.10
Accept-Encoding: gzip,deflate,sdch
Accept-Language: zh-CN,zh;q=0.8
Accept-Charset: GBK,utf-8;q=0.7,*;q=0.3
Range: bytes=32768-555522
If-Range: "56010f-87a03-497781c080540"

HTTP/1.1 206 Partial Content
Date: Wed, 12 Jan 2011 06:47:41 GMT
Server: Apache
Last-Modified: Wed, 15 Dec 2010 19:40:29 GMT
ETag: "56010f-87a03-497781c080540"
Accept-Ranges: bytes
Content-Length: 522755
Content-Range: bytes 32768-555522/555523
Connection: close
Content-Type: application/pdf

Again, this is very straightforward. The client gets the rest of the file. Note the “ETag” and “If-Range” headers. If clients and servers consistently used this convention, it might make reassembly easier. Alas, it’s frequently not used. The server was nice enough to report a content type of “application/pdf” for both fragments, and it doesn’t use any other content-encoding or transfer-encoding, etc. If only all transactions were this simple!
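
If you did want to lean on that convention, the version check itself is trivial; the catch is deciding what to do when the headers are absent, which is most of the time. A sketch (hypothetical helper, not Ruminate code):

def same_version(headers_a, headers_b):
    a, b = headers_a.get("ETag"), headers_b.get("ETag")
    if a is None or b is None:
        return True  # can't tell; fall back to host/URI/total-length matching
    return a == b

print(same_version({"ETag": '"56010f-87a03-497781c080540"'},
                   {"ETag": '"56010f-87a03-497781c080540"'}))  # True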

After receiving the 2nd fragment on the 4th log line, the 206defrag service realizes it has the whole payload. Line 5 shows the service sending this payload object off for analysis. In line 6 the object multiplexer decides to send this file on to the PDF analyzer. Not shown here, but the PDF analysis service deems this PDF well worth reading :)

This is a very simple and clean example of HTTP 206 fragmentation. Most uses of HTTP 206 are similar to this, even if not quite this simple. In many cases, instead of being split across separate TCP streams, the fragments are sent serially in the same stream a la pipelined requests/responses. This general scenario is very common for PDFs.

One point I’d like to make here is that if your NIDS doesn’t do HTTP 206 defrag, you lose the opportunity to analyze a significant portion of PDFs, at least with any analysis that requires looking at the whole PDF at once.

Example B


Example B is interesting for a couple reasons. Again, let’s start with the logs:

[csmutz@master 206_examples]$ cat http_b.log
Jan 12 02:17:56 node4 http[27618]: tcp-198921731-1294816073-1294816075-i-936869-192.168.72.14:3254c65.54.95.206:80_http-0 1.1 GET au.download.windowsupdate.com /msdownload/update/software/uprl/2011/01/windows-kb890830-v3.15-delta_7d99803eaf3b6e8dfa3581348bc694089579d25a.exe 0 816896 206 1294342831 application/octet-stream TP AEk - - "" "Microsoft BITS/6.6" "Microsoft-IIS/7.5"
Jan 12 02:17:56 master 206defrag: input tcp-198921731-1294816073-1294816075-i-936869-192.168.72.14:3254c65.54.95.206:80_http-0 1022920 0 816896 au.download.windowsupdate.com /msdownload/update/software/uprl/2011/01/windows-kb890830-v3.15-delta_7d99803eaf3b6e8dfa3581348bc694089579d25a.exe 192.168.72.14
Jan 12 02:17:56 node4 http[27618]: tcp-198921731-1294816073-1294816075-i-936869-192.168.72.14:3254c65.54.95.206:80_http-1 1.1 GET au.download.windowsupdate.com /msdownload/update/software/uprl/2011/01/windows-kb890830-v3.15-delta_7d99803eaf3b6e8dfa3581348bc694089579d25a.exe 0 0 - - - - AEk - - "" "Microsoft BITS/6.6" ""
Jan 12 02:33:26 node1 http[26761]: tcp-199054360-1294817575-1294817576-r-206649-192.168.72.14:3257c65.54.95.14:80_http-0 1.1 GET au.download.windowsupdate.com /msdownload/update/software/uprl/2011/01/windows-kb890830-v3.15-delta_7d99803eaf3b6e8dfa3581348bc694089579d25a.exe 0 206024 206 1294342831 application/octet-stream TP AEk - - "" "Microsoft BITS/6.6" "Microsoft-IIS/7.5"
Jan 12 02:33:26 master 206defrag: input tcp-199054360-1294817575-1294817576-r-206649-192.168.72.14:3257c65.54.95.14:80_http-0 1022920 816896 206024 au.download.windowsupdate.com /msdownload/update/software/uprl/2011/01/windows-kb890830-v3.15-delta_7d99803eaf3b6e8dfa3581348bc694089579d25a.exe 192.168.72.14
Jan 12 02:33:26 master 206defrag: output tcp-198921731-1294816073-1294816075-i-936869-192.168.72.14:3254c65.54.95.206:80_http-0_206defrag normal 2 1022920 fc13fee1d44ef737a3133f1298b21d28 tcp-198921731-1294816073-1294816075-i-936869-192.168.72.14:3254c65.54.95.206:80_http-0,tcp-199054360-1294817575-1294817576-r-206649-192.168.72.14:3257c65.54.95.14:80_http-0
[csmutz@master 206_examples]$ cat object_b.log
Jan 12 02:33:26 master object_mux[3282]: tcp-198921731-1294816073-1294816075-i-936869-192.168.72.14:3254c65.54.95.206:80_http-0_206defrag 1022920 fc13fee1d44ef737a3133f1298b21d28 null PE32 executable for MS Windows (GUI) Intel 80386 32-bit

At first glance, this looks a lot like the last example, but there are some subtle and notable differences. First of all, the first tcp stream contains two requests, not one. While the first transaction looks normal, the log for the second is incomplete. The size of the response payload is “-“, there is no response code, and none of the response headers are set. Ruminate can validate and parse the request but can’t do the same with the response, so it gives only the metadata for the request. What is going on here? To find out, we’ll have to go to the packets...

Looking at packet 956, we see the second pipelined request. Presumably everything is still normal at this point:

[csmutz@master 206_examples]$ tshark -nn -r 206_example_b.pcap | grep "^956 "
956 1.259759 192.168.72.14 -> 65.54.95.206 HTTP GET /msdownload/update/software/uprl/2011/01/windows-kb890830-v3.15-delta_7d99803eaf3b6e8dfa3581348bc694089579d25a.exe HTTP/1.1

If we go farther down the packet trace, we get to the point where the client receives the header for the 2nd response in packet 1213:

[csmutz@master 206_examples]$ tshark -nn -r 206_example_b.pcap | grep -C 2 "^1213 "
1211 1.407243 192.168.72.14 -> 65.54.95.206 TCP [TCP Dup ACK 1101#52] 3254 > 80 [ACK] Seq=581 Ack=899890 Win=65535 Len=0 SLE=935155 SRE=965425
1212 1.407254 65.54.95.206 -> 192.168.72.14 TCP [TCP segment of a reassembled PDU]
1213 1.407255 65.54.95.206 -> 192.168.72.14 HTTP HTTP/1.1 206 Partial Content (application/octet-stream)
1214 1.407347 192.168.72.14 -> 65.54.95.206 TCP [TCP Dup ACK 1101#53] 3254 > 80 [ACK] Seq=581 Ack=899890 Win=65535 Len=0 SLE=935155 SRE=965425
1215 1.407465 192.168.72.14 -> 65.54.95.206 TCP [TCP Dup ACK 1101#54] 3254 > 80 [ACK] Seq=581 Ack=899890 Win=65535 Len=0 SLE=935155 SRE=965425

Already we see something amiss. The client is incessantly ACKing at a point partway into the payload of the 2nd response. As it turns out, the client never ACKs any more data, even though the server tries to ram the whole response down the client’s buffer. It appears that the whole payload for the 2nd response is transferred over the wire, but the client never ACKs it. Ruminate handles this case by assuming the client threw away the unACKed data and doing essentially the same. Since the whole response can’t be reconstructed, Ruminate punts: it provides no metadata about the response in the log and doesn’t send the payload fragment to the 206defrag service, considering it invalid. Some could argue that it would be nice if Ruminate were a little more promiscuous in its TCP reassembly and HTTP parsing. While I could see the argument that it would be nice to provide some information about the response, the current behavior is relatively simple and safe. I suspect that some other NIDS and network forensics utilities would actually use all the unACKed data, opening the door to analyzing the whole payload at this point. I can see the appeal of this approach. I’m not 100% sure I’ve analyzed this situation correctly, but I think Ruminate does the right thing in this case.
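
To make the policy concrete, here’s a toy Python model of “trust only what the client ACKed”. It reflects my reading of the behavior described above, not Ruminate’s actual reassembly code.

class AckGatedStream:
    def __init__(self):
        self.segments = {}   # server sequence offset -> payload bytes
        self.acked = 0       # highest ACK seen from the client
        self.delivered = 0   # bytes already released to the HTTP parser

    def on_server_segment(self, seq, data):
        self.segments[seq] = data

    def on_client_ack(self, ack):
        self.acked = max(self.acked, ack)

    def readable(self):
        # Release only contiguous bytes the client has acknowledged;
        # unACKed data is held here and ultimately thrown away.
        out = bytearray()
        pos = self.delivered
        while pos in self.segments and pos + len(self.segments[pos]) <= self.acked:
            data = self.segments.pop(pos)
            out += data
            pos += len(data)
        self.delivered = pos
        return bytes(out)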

It seems apparent that the client discarded this unACKed data because several minutes later, it requests the second fragment over again, which it receives successfully. After the client receives this second fragment, Ruminate splices it together and the exe is sent off for analysis. The interesting part about this 2nd attempt for the 2nd fragment is that this time the client chose a different mirror to download from--it’s on the same subnet but is a different IP.

I chose this example because it points out a few things. First, it demonstrates how the classic layer 4 defrag accuracy problem can influence the layer 7 defrag problem. It also raises the same class of ambiguity problems at layer 7 itself. What do you do if layer 7 fragments, e.g. HTTP 206 fragments, overlap? Which version do you keep if the data differs? Can this be used for NIDS evasion like it was in the layer 4 case? These are the types of interesting questions I hope Ruminate aids in studying.
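
As a concrete illustration of why the overlap question matters, two equally plausible splicing policies yield different payloads (a toy Python example of my own, not Ruminate code):

def assemble(frags, total, last_wins=True):
    buf = bytearray(total)
    for start, data in (frags if last_wins else list(reversed(frags))):
        buf[start:start + len(data)] = data
    return bytes(buf)

frags = [(0, b"AAAA"), (2, b"BBBB")]        # overlapping layer 7 fragments
print(assemble(frags, 6, last_wins=True))   # b'AABBBB'
print(assemble(frags, 6, last_wins=False))  # b'AAAABB'

If a NIDS picks one policy and the client application picks the other, the attacker gets to choose which version the NIDS analyzes.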

I believe this example also helps validate some of the architecture of Ruminate, from dynamic load balancing of streams to a service-based approach. Since the two layer 7 fragments were sent between distinct client/server IP pairs, you have no guarantee that the conventional method of static header load balancing would send the layer 7 fragments to the same HTTP analysis node. If you are going to do this the conventional NIDS way, you are forced to accept a high cost in synchronization between the two analyzer nodes, because layer 7 defrag can involve large amounts of data spread over long periods of time. The service-based approach not only factors in the realities of today’s commodity IT infrastructure, but makes this problem look relatively simple.

Example C



Instead of leading off with the logs for this example, I need to explain one more wrinkle of HTTP 206. I didn’t learn about this until I was trying to implement 206defrag and was disappointed to see that many of the PDFs I tried to download on my own machine weren’t being successfully reconstructed by Ruminate (my computer almost always does HTTP 206 when downloading PDFs). If the client requests more than one byte range in a single request, the server puts the various responses in a MIME blob that separates the byte ranges much like multiple attachments to an email, but, from what I’ve seen, sans the base64 encoding. If I understand correctly, this is very similar to how some POSTs are encoded.

This is how it looks in practice:

GET /courses/ECE545/viewgraphs_F04/loCarb_VHDL_small.pdf HTTP/1.1
Host: teal.gmu.edu
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 ( .NET CLR 3.5.30729; .NET4.0C) Creative ZENcast v1.02.10
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 115
Connection: keep-alive
X-REMOVED: Range
X-Behavioral-Ad-Opt-Out: 1
X-Do-Not-Track: 1
Range: bytes=1-1,0-4095

HTTP/1.1 206 Partial Content
Date: Mon, 10 Jan 2011 17:02:50 GMT
Server: Apache
Last-Modified: Sat, 20 Nov 2004 02:05:07 GMT
ETag: "25fb6-79bec-d67fac0"
Accept-Ranges: bytes
Content-Length: 4303
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: multipart/byteranges; boundary=49980f01bf1635062

--49980f01bf1635062
Content-type: application/pdf
Content-range: bytes 1-1/498668

P
--49980f01bf1635062
Content-type: application/pdf
Content-range: bytes 0-4095/498668

%PDF-1.4
...

In this case you see the client asking for, and the server responding with, the second byte of the PDF, then the first 4K of it.
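
Parsing the multipart body is mostly boundary bookkeeping. A rough Python sketch (a hypothetical helper, not Ruminate’s parser; a robust version also has to worry about boundary strings appearing inside payloads):

import re

def split_byteranges(body, boundary):
    delim = b"--" + boundary.encode()
    parts = []
    for chunk in body.split(delim)[1:]:
        if chunk.startswith(b"--"):  # closing delimiter: "--boundary--"
            break
        headers, _, payload = chunk.lstrip(b"\r\n").partition(b"\r\n\r\n")
        m = re.search(rb"Content-range:\s*bytes (\d+)-(\d+)/(\d+)", headers, re.I)
        if m:
            start, end, total = (int(g) for g in m.groups())
            parts.append((start, payload[:end - start + 1], total))
    return parts

demo = (b"--B\r\nContent-range: bytes 1-1/10\r\n\r\nP\r\n"
        b"--B\r\nContent-range: bytes 0-3/10\r\n\r\nABCD\r\n--B--\r\n")
print(split_byteranges(demo, "B"))  # [(1, b'P', 10), (0, b'ABCD', 10)]

The (start, payload, total) tuples can then feed the same kind of defrag logic sketched earlier.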

For brevity’s sake, I’ll only display the 206defrag “output” log:

Jan 10 12:04:02 master 206defrag: output tcp-170962418-1294678989-1294679016-c-233988-10.45.179.94:19950c129.174.93.170:80_http-0-part-1_206defrag normal 70 498668 94046a5fb1c5802d0f1e6d704cf3e10e tcp-170962418-1294678989-1294679016-c-233988-10.45.179.94:19950c129.174.93.170:80_http-0-part-1,tcp-170962418-1294678989-1294679016-c-233988-10.45.179.94:19950c129.174.93.170:80_http-1-part-1,tcp-170962841-1294678990-1294679016-c-305932-10.45.179.94:19953c129.174.93.170:80_http-1-part-4,tcp-170962418-1294678989-1294679016-c-233988-10.45.179.94:19950c129.174.93.170:80_http-6-part-1,tcp-170962418-1294678989-1294679016-c-233988-10.45.179.94:19950c129.174.93.170:80_http-7-part-2,tcp-170962841-1294678990-1294679016-c-305932-10.45.179.94:19953c129.174.93.170:80_http-2-part-1,...

In case you’re curious, yes, the “70” early in the log means that the payload was assembled from 70 fragments. Furthermore, the “normal” means that the fragments were spliced together from contiguous segments without any portions of the fragments overlapping. Note that the duplication of byte 1 numerous times doesn’t affect this because it’s not necessary to use those fragments. In the future, I could be more granular with the logic and logging for special cases where fragments are duplicated, fragments overlap, etc. I have little knowledge of how specific HTTP clients handle situations like overlapping fragments.

One other thing of note is that these fragments are transferred through two simultaneous TCP connections (client ports 19950 and 19953) using multiple HTTP 1.1 transactions. Another interesting aspect of this example is the seemingly sporadic order in which the fragments are requested:

The following shows the client TCP port, the HTTP transaction index in that TCP connection, the MIME part index, the fragment start offset, and the fragment length.

[csmutz@master 206_examples]$ cat http_c.log | grep input | sed -r 's/tcp-.*:([0-9]+)c.*-([0-9]+-part-[0-9]+) /\1.\2 /' | awk '{ print $7" "$9" "$10 }'
19953.0-part-0 1 1
19950.0-part-0 1 1
19953.0-part-1 487541 4096
19950.0-part-1 0 4096
19953.1-part-0 1 1
19950.1-part-0 1 1
19950.1-part-1 4096 14319
19953.1-part-1 478933 1325
19953.1-part-2 477152 1781
19950.2-part-0 1 1
19953.1-part-3 480258 803
19953.1-part-4 18415 2540
19950.2-part-1 494520 4096
19953.1-part-5 481061 697
19950.3-part-0 1 1
19953.2-part-0 1 1
19953.2-part-1 32255 13312
19950.3-part-1 498616 52
19953.3-part-0 1 1
19950.4-part-0 1 1
19953.3-part-1 52049 5315
19953.3-part-2 483154 1646
19950.4-part-1 491637 2883
19953.3-part-3 57364 5529
19953.3-part-4 485870 46
...

I’m not sure I can discern any pattern to the manner in which the fragments are transferred, but it’s definitely not in order. While this looks like a bit of a shotgun (double-barreled in this case) approach to getting this file, it’s not overly haphazard, as the fragments line up nicely. I did quickly look at the byte ranges themselves to see if they correlated to the internal structure of the PDF (objects/streams) but didn’t see anything too obvious in the couple I examined. I’m also not sure why the client wants to request the second byte so frequently. By my reckoning, the payload was reconstructed from 70 fragments, using 22 HTTP transactions, through 2 unique TCP connections. While definitely the exception rather than the norm, this is an example where the buffer-then-analyze model of Ruminate has significant benefits over the stateful incremental analysis model of conventional packet-based NIDS.

While they illustrate rare conditions, examples B and C demonstrate the type of issues I’ve built Ruminate to study and address. As attacks continue to move up the stack, NIDS research needs to move with them.

Descending out of the clouds into the real world, example A isn’t as uncommon as many might suppose. I’m hoping that the upcoming release of Ruminate, with vastly improved HTTP parsing capabilities, will prove useful to some in operational environments. I feel it important to reiterate that Ruminate is a research-oriented tool--it’s somewhere between experimental and proof of concept. The last thing I want is for Ruminate to be used in a manner that misleads someone with a false sense of security. It should go without saying, but only those who are willing to accept any limitations (presumably without knowing all of them) or are willing to do adequate vetting themselves should rely on Ruminate in production environments. That being said, I’ve been pleasantly surprised with what I’ve been able to do with Ruminate so far.

In the next couple weeks I’m going to work on refining, packaging, and documenting Ruminate so it will be easier for those who want to play with it. I hope to have this done around the end of the month.

Saturday, January 1, 2011

5 Saddest Conspiracy Theories of 2010

Is it not obligatory for bloggers to make some sort of list at the New Year? Well, here is mine. I’m posting what I call the saddest conspiracy theories of 2010. These are all events that are clouded by secrecy and/or controversy, implying some amount of foul play or reckless incompetence. While all are somehow related to security or technology, some are on the periphery of the topics normally discussed in this blog. I’ll only give sensational, one-sided coverage of these conspiracy theories. While I won’t even try to argue the “truth” of any of these, what makes them sad is that their level of plausibility is much higher than zero.

1. Another US Gov Sponsored Backdoor


The FBI has been accused of trying to put backdoors into the IPSEC implementation of OpenBSD. It appears, at least to the founder and leader of OpenBSD, that the FBI did contract people to modify OpenBSD for the purpose of introducing bugs. However, it’s unclear whether the intended audience for these bugs was the whole world (unlikely), organizations with specific hardware, or just an internal experiment. I’d be receptive to the experiment explanation if it had been done openly (like my dabbling in breaking forward secrecy through OS-level random escrow) or if it had never touched the internet. The commits to a public project are kind of scary. The jury is still out on this one. However, if this turns out anything like the alleged NSA backdoor in the Windows PRNG, we won’t hear anything more conclusive on it. The sad part is the community isn’t wondering if the three-letter agencies are trustworthy participants in the design and implementation of crypto. The answer is clear: no. The real question is how many more of these are lingering in both open and closed source software.

2. Security Theater Turns Peep Show


Yes, I had to include it. The security theater that is TSA screening at airports was bad enough in the past. It has provided basically no improvement in security, has amplified the effects of terrorism, and has been an unjustified encroachment on civil liberties. This year saw the widespread deployment of X-ray backscatter machines, also known as full body scanners. The public backlash is heating up. While there’s plenty of controversy, and probably not a lot of conspiracy, the current state of airport security is just plain sad. Let’s hope we can find a way to apply the same logic and tactics that are being used so effectively for “real world” security to the field of cyber security.

3. Big Brother Breathes New Life Into Wiretapping Laws


Up until a few years ago, most people thought wiretapping laws were in place to prevent people from being covertly spied on by others, especially police and spooks who are wont to do things like warrantless wiretapping. Those of us who questioned the purpose of these wiretapping laws (or the constitution, for that matter) back in the 2007-2009 time frame now have some consolation. In 2010, it became common practice for police to use local and state wiretapping laws to retaliate against people who try to hold them accountable through recording of police in public settings. With a little luck and even more creative interpretation of laws, even the federal wiretapping laws may be useful in the future.

4. Traditional Journalism: Too Big to Fail


While I don’t want to delve into the whole Wikileaks affair, one thing I’ve seen coming out of it is a lot of criticism of Wikileaks. Most of the criticism from the media seems rooted more in a desire to maintain their traditional role in filtering, pushing, and disseminating news than in ensuring that important news is uncovered and the public is informed. For example, when Floyd Abrams discusses Why WikiLeaks Is Unlike the Pentagon Papers, he focuses more on the narrow topic of why Wikileaks is a threat to traditional journalism than on more fundamental topics like freedom of the press or government accountability. To me it seems that the very wiki model is being attacked, not because it’s inherently wrong, but because it continues to marginalize the role of established information channels. The writing is on the wall that traditional news “sources” are an endangered species, so they’re in survival mode. It seems that they are often more worried about fighting turf wars and ingratiating themselves with The Man than serving their more fundamental role of public watchdog. It really doesn’t matter where you fall on the professional vs. crowdsourced information flow argument; when media is more worried about getting and maintaining government support than fulfilling its core mission, we ought to be scared. Don’t worry though, the next iteration of Wikileaks, OpenLeaks, is going to put the traditional media folk back into the loop.


5. US-China Diplomacy vis-à-vis Intellectual Property


So of all the conspiracy theories, this is the 800-pound panda. While many are still waking up to it, the ever-widening scope of cyber espionage being conducted by targeted, persistent attackers is alarming. Many open sources, including Google, attribute these attacks to actors in China--with largely unsupported and varying claims about the level of the Chinese government’s involvement. The US should be pursuing diplomatic solutions to this problem, the economic portion of which has been aptly seen “as a trade issue that we have not dealt with.” So Hillary Clinton says with big words that China should investigate and the American people will be updated as the “facts become clear”. What have we heard so far on the cyber espionage front? Not much. That’s OK though, because the US has been very active this year in other tough diplomatic discussions with China. For example, Attorney General Holder visited China late this year to discuss intellectual property rights. Apparently, China promised to crack down on illegal distribution of music, movies, and software.

What a big win. First of all, we wouldn’t want to go lax on software piracy enforcement, especially not in light of recent extensive abuse by oppressive regimes. The problem is so bad that Microsoft, one of the most draconian companies when it comes to software piracy and one of the most permissive when it comes to “local” law (like search result filtering), recently extended free licenses to the type of organizations where unequal software piracy enforcement is used as a pretext for oppressing dissidents. I can definitely see how the relatively extreme punishments imposed on the relatively few people actually caught pirating music and videos in the US would fit well with the Chinese model of law enforcement. Not only that, but this could help fill in some of the pretext for abuse taken away by liberal software licensing. Best yet, continued discussions like this could lay the groundwork for an expansion of intellectual property protections that even other Western countries refuse to get caught up in. For example, wouldn’t it be great if software patents, one of the US’s greatest forms of meta-innovation of late, were enforced with the same vigor and uniformity in China as they are in the US?

Whether you feel like getting out your tinfoil hat or your tissue to catch your tears, I hope these critical reflections on 2010 have been amusing, even comical. Let’s all hope for better in 2011.