Wednesday, June 23, 2010

Flushing out Leaky Taps

Updated 03/17/2012: This article is now deprecated. Please see the revamped version: http://smusec.blogspot.com/2012/03/flushing-out-leaky-taps-v2.html.

Many organizations rely heavily on their network monitoring tools. Tools that operate on passive taps are often assumed to have complete network visibility. While most network monitoring tools provide stats on packets dropped internally, few tell you how many packets were lost externally to the appliance. I suspect that very few organizations do an in-depth verification of the completeness of tapped data or quantify the amount of loss that occurs in their tapping infrastructure before packets ever arrive at their network monitoring tools. Since I’ve seen very little discussion on the topic, this post focuses on techniques and tools for detecting and measuring tapping issues.

Impact of Leaky Taps


How many packets does your tapping infrastructure drop before they ever reach your network monitoring devices? How do you know?

I’ve seen too many environments where tapping problems have caused network monitoring tools to provide incorrect or incomplete results. Often these issues persist for months or years without being discovered, if they are discovered at all. Making decisions based on bad data is never good. Many public packet traces also exhibit the types of visibility issues I will discuss.

One thing to keep in mind when worrying about loss due to tapping is that you should probably solve, or at least quantify, any packet loss inside your network monitoring devices before you worry about packet loss in the taps. You need strong confidence in the accuracy of your network monitoring devices before you use data from them to debug loss in your taps. Remember, in most network monitoring systems there are multiple places where packet loss is reported. For example, using tcpdump on Linux, you have the dropped packets reported by tcpdump itself and the packets dropped by the network interface (ifconfig).
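
For example, on a typical Linux sensor (the interface name eth1 below is just a placeholder), these are the usual places to check for loss inside the monitoring box itself before suspecting the taps:

# Capture-tool drops: tcpdump reports "packets dropped by kernel" when it exits.
tcpdump -i eth1 -nn -w /dev/null
# Interface-level drops: check the RX "dropped" counter.
ifconfig eth1
# NIC/driver-level drops, where the driver supports it:
ethtool -S eth1 | grep -i drop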

I’m not going to discuss in detail the many things that can go wrong in getting packets from your network to a network monitoring tool. For a quick overview of different tapping strategies, I’d recommend this article by the argus guys. I will focus largely on the resulting symptoms and how to detect and, to some degree, quantify them, concentrating on two very common cases: low volume packet loss and unidirectional (simplex) visibility.

Low volume packet loss is common in many tapping infrastructures, from span ports up to high-end regenerative tapping devices. I feel that many people wrongly assume that taps either work 100% of the time or not at all. In practice, it is common for tapping infrastructures to drop some packets, such that your network monitoring device never even gets the chance to inspect them. Many public packet traces include this type of loss. Very often this loss isn’t even recognized, let alone quantified.

The impact of this loss depends on what you are trying to do. If you are collecting netflow, the impact probably isn’t too bad since you’re looking at summaries anyway. You’ll have slightly incorrect packet and byte counts, but overall the impact will be small. Since most flows contain many packets, missing a flow entirely is unlikely. If you’re doing signature-matching IDS, such as snort, the impact is probably also very small, unless you win the lottery and the packet dropped by your taps is the one containing the attack you want to detect. Again, the odds are in your favor here: most packet-based IDSs are pretty tolerant of packet loss. However, if you are doing comprehensive deep payload analysis, the impact can be severe. Say you have a system that collects and/or analyzes all payload objects of a certain type--anything from emails to multimedia files. If you lose just one packet used to transfer part of a payload object, you can impair your ability to analyze that object at all. If you then have to ignore or discard the whole payload object, the impact of a single lost packet is multiplied: many packets’ worth of data can’t be analyzed.

Another common problem is unidirectional visibility. There are sites and organizations that do asymmetric routing such that they actually intend to tap and monitor unidirectional flows. Obviously, this discussion only applies to situations where one intends to tap a bidirectional link but ends up analyzing only one direction. One notorious example of a public data set suffering from this issue is the 2009 Inter-Service Academy Cyber Defense Competition.

Unidirectional capture is common, for example, when regenerative taps split tapped traffic into two links based on direction but only one of those links makes it into the monitoring device. Most netflow systems are actually designed to operate well on simplex links, so the only adverse effect is that you get data on just one direction. Simple per-packet inspection works fine, but more advanced (and usually rarer) rules or operations that use both directions obviously won’t work. Multi-packet payload inspection may still be possible on the visible direction, but it often requires severe assumptions to be made about reassembly, opening the door to classic IDS evasion. As such, some deep payload analysis systems, including vortex and others based on libnids, just won’t work on unidirectional data. Simplex visibility is usually pretty easy to detect and deal with, but it often goes undetected because most network monitoring equipment functions well enough without full-duplex data.

External Verification


Probably the best strategy for verifying network tapping infrastructure is to compare data collected passively with data collected inline. This could be as simple as comparing packet counts on routers or end devices to packet counts on a network monitoring device. For higher-order verification, you can compare network transaction logs from an inline device or endpoint against passively collected transaction logs. For example, you could compare IIS or Apache web server logs to HTTP transaction logs collected by an IDS such as Bro or Suricata. These verification techniques are often difficult: you have to deal with issues such as clock synchronization and offsets (caused by buffers in the tapping infrastructure or IDS devices), differences in the data sources/logs used for concordance, and so on. This is not trivial, but it often can be done.
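
As a rough sketch of the web log comparison (the paths are hypothetical, and it assumes roughly one log line per request on both sides, with any ‘#’ header lines in the sensor’s log skipped), you might start by simply comparing request counts and then drill into any window where they diverge:

# Requests logged by the web server itself (path is hypothetical).
wc -l < /var/log/httpd/access_log
# HTTP transactions logged by the passive sensor; skip comment/header lines.
grep -vc '^#' http.log

In practice you would bucket both logs by time (and probably by server address) so that a mismatch points at a specific window rather than just telling you the totals disagree.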

Usually the biggest barrier to external verification of tapping infrastructure is the lack of any comprehensive external data source. Many people rely on passive collection devices for their primary and authoritative network monitoring. Often, there just isn’t another data source against which you can compare your passive network monitoring tools.

One tactic I’ve used to prove loss in taps is to use two sets of taps such that packets must traverse both. If one tap sees a packet traverse the network and the other doesn’t, and both monitoring tools claim 0 packet loss, you know you’ve got a problem. I’ve actually seen situations where each of two network monitoring devices missed some packets, but the sets of packets missing from the two traces didn’t overlap.
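
One way to do that comparison is to fingerprint packets from each tap’s capture and diff the two sets. This is only a sketch: the capture file names are hypothetical, it assumes both sensors captured the same time window, and relative TCP sequence numbers are disabled so the fingerprints line up across captures:

# Build a per-packet fingerprint (addresses, IP ID, absolute TCP sequence number) from each tap.
tshark -r tapA.pcap -o tcp.relative_sequence_numbers:false \
    -T fields -e ip.src -e ip.dst -e ip.id -e tcp.seq | sort > tapA.txt
tshark -r tapB.pcap -o tcp.relative_sequence_numbers:false \
    -T fields -e ip.src -e ip.dst -e ip.id -e tcp.seq | sort > tapB.txt
# Packets seen by only one tap or the other.
comm -23 tapA.txt tapB.txt > only_in_A.txt
comm -13 tapA.txt tapB.txt > only_in_B.txt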

Inferring Tapping Issues


While not easy, and not necessarily as precise or as complete as comparing to external data, it is possible to use network monitoring tools to infer visibility gaps in the data they are seeing. Many network protocols, most notably TCP, provide mechanisms specifically designed to ensure reliable transport of data. Unlike an endpoint, a passive observer can’t simply ask for a retransmission when a packet is lost. However, a passive observer can use the mechanisms the endpoints use to infer whether it missed packets passed between them. For example, if Alice sends a packet to Bob which the passive observer Eve doesn’t see, but Bob acknowledges receipt to Alice and Eve sees the acknowledgement, Eve can infer that she missed a packet.

Data and Tools


To keep the examples simple and easily comparable, I’ve created three pcaps. The full pcap contains all the packets from an HTTP download of the ASCII text of “Alice in Wonderland” from Project Gutenberg. The loss pcap is the same, except that one packet, packet 50, was removed. The half pcap is the same as the full pcap but contains only the packets going to the server, not the packets going to the client.
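
For what it’s worth, here is one way such test captures could be created; this is a sketch, not necessarily how these particular files were made (the server address is taken from the traces shown below):

# Remove packet 50 from the full capture (editcap deletes the listed frame numbers by default).
editcap alice_full.pcap alice_loss.pcap 50
# Keep only the client-to-server direction to simulate simplex visibility.
tcpdump -r alice_full.pcap -w alice_half.pcap 'dst host 152.46.7.81'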

For tools, I’ll be using argus and tshark to infer packet loss in the tap. Argus is a network flow monitoring tool. Tshark is the CLI version of the ever-popular Wireshark. Since deep payload analysis systems are often greatly affected by packet loss, I’ll also show how both types of visibility problems affect vortex.

Low Volume Loss in Taps


Detecting and quantifying low volume loss can be difficult. The most effective tool I’ve found for measuring it is tshark, specifically the tcp.analysis.lost_segment analysis flag (but see the 02/21/2011 update below).

Note that this easily identifies the lost packet at position 50:


$ tshark -r alice_full.pcap -R tcp.analysis.lost_segment
$ tshark -r alice_loss.pcap -R tcp.analysis.lost_segment
50 0.410502 152.46.7.81 -> 66.173.221.158 TCP [TCP Previous segment lost] [TCP segment of a reassembled PDU]


I’ve created a simple (but inefficient) script that can be used on many pcaps. Since tshark doesn’t release memory, you’ll need to use pcap slices smaller than the amount of memory in your system. The script is as follows:


#!/bin/bash
# Reads a list of pcap files on stdin; for each file, reports the percentage of
# TCP packets flagged as lost along with the capture's average bandwidth.

while read -r file
do
    total=$(tcpdump -r "$file" -nn "tcp" 2>/dev/null | wc -l)
    errors=$(tshark -r "$file" -R tcp.analysis.lost_segment 2>/dev/null | wc -l)
    percent=$(echo "$errors $total" | awk '{ print $1*100/$2 }')
    bandwidth=$(capinfos "$file" | grep "bits/s" | awk '{ print $3" "$4 }')
    echo "$file: $percent% $bandwidth"
done


Updated 02/21/2011: Most people will want to use "tcp.analysis.ack_lost_segment" instead of "tcp.analysis.lost_segment". See bottom of post for details.

You run it by piping in a list of pcap files. For example, here are the results for the slices of the Defcon 17 Capture the Flag packet captures:


$ ls ctf_dc17.pcap0* | calc_tcp_packet_loss.sh
ctf_dc17.pcap000: 0.44235% 34751.40 bits/s
ctf_dc17.pcap001: 0.584816% 210957.26 bits/s
ctf_dc17.pcap002: 0.615856% 173889.57 bits/s
ctf_dc17.pcap003: 0.51238% 165425.21 bits/s
ctf_dc17.pcap004: 0.343817% 253283.86 bits/s
...


Note that I haven’t done any sort of serious analysis of this data set. I assume some packets were lost, but I don’t know for sure; I’m just inferring. Also, assuming some packets are missing, I’ll never know whether this was a tapping issue, a network monitoring/packet capture issue, or both.

In the case of low volume loss in taps, netflow isn’t always very helpful.


$ argus -X -r alice_full.pcap -w full.argus
$ ra -r full.argus -n -s stime flgs saddr sport daddr dport spkts dpkts loss
10:12:54.474330 e 66.173.221.158.55812 152.46.7.81.80 87 121 0
$ argus -X -r alice_loss.pcap -w loss.argus
$ ra -r loss.argus -n -s stime flgs saddr sport daddr dport spkts dpkts loss
10:12:54.474330 e 66.173.221.158.55812 152.46.7.81.80 87 120 0


Note that there is one fewer dpkt (destination packet). Other than the packet counts, there is no way to know that packet loss occurred. I’d swear I’ve seen other cases where argus actually gave an indication of packet loss in either the loss count or the flags, but that’s definitely not happening here. Note that “loss” in most network flow monitoring tools refers to packets lost by the network itself (observed through retransmissions), not loss in the taps, which has to be inferred.

Vortex basically gives up on reassembling a TCP stream if a packet is lost and the TCP window is exceeded. The stream gets truncated at the first hole and remains in limbo until it idles out or vortex closes.


$ vortex -r alice_full.pcap -e -t full
Couldn't set capture thread priority!
full/tcp-1-1276956774-1276956775-c-168169-66.173.221.158:55812s152.46.7.81:80
full/tcp-1-1276956774-1276956775-c-168169-66.173.221.158:55812c152.46.7.81:80
VORTEX_ERRORS TOTAL: 0 IP_SIZE: 0 IP_FRAG: 0 IP_HDR: 0 IP_SRCRT: 0 TCP_LIMIT: 0 TCP_HDR: 0 TCP_QUE: 0 TCP_FLAGS: 0 UDP_ALL: 0 SCAN_ALL: 0 VTX_RING: 0 OTHER: 0
VORTEX_STATS PCAP_RECV: 0 PCAP_DROP: 0 VTX_BYTES: 168169 VTX_EST: 1 VTX_WAIT: 0 VTX_CLOSE_TOT: 1 VTX_CLOSE: 1 VTX_LIMIT: 0 VTX_POLL: 0 VTX_TIMOUT: 0 VTX_IDLE: 0 VTX_RST: 0 VTX_EXIT: 0 VTX_BSF: 0

$ vortex -r alice_loss.pcap -e -t loss
Couldn't set capture thread priority!
loss/tcp-1-1276956774-1276956774-e-31056-66.173.221.158:55812s152.46.7.81:80
loss/tcp-1-1276956774-1276956774-e-31056-66.173.221.158:55812c152.46.7.81:80
VORTEX_ERRORS TOTAL: 2 IP_SIZE: 0 IP_FRAG: 0 IP_HDR: 0 IP_SRCRT: 0 TCP_LIMIT: 0 TCP_HDR: 0 TCP_QUE: 2 TCP_FLAGS: 0 UDP_ALL: 0 SCAN_ALL: 0 VTX_RING: 0 OTHER: 0
Hint--TCP_QUEUE: Investigate possible packet loss (if PCAP_LOSS is 0 check ifconfig for RX dropped).
VORTEX_STATS PCAP_RECV: 0 PCAP_DROP: 0 VTX_BYTES: 31056 VTX_EST: 1 VTX_WAIT: 0 VTX_CLOSE_TOT: 1 VTX_CLOSE: 0 VTX_LIMIT: 0 VTX_POLL: 0 VTX_TIMOUT: 0 VTX_IDLE: 0 VTX_RST: 0 VTX_EXIT: 1 VTX_BSF: 0


Note that there are fewer bytes collected, vortex warns about packet loss, there are TCP_QUEUE errors, and the stream doesn’t close cleanly in the loss pcap.

Simplex Capture


Simplex capture is actually pretty simple to identify. It’s problematic mainly because many tools don’t warn you when it’s occurring, so you often don’t even know it’s happening. The straightforward approach is to use netflow and look for flows with packets in only one direction.


$ argus -X -r alice_half.pcap -w half.argus
$ ra -r half.argus -n -s stime flgs saddr sport daddr dport spkts dpkts loss
10:12:54.474330 e 66.173.221.158.55812 152.46.7.81.80 87 0 0


This couldn’t be clearer: there are packets in only one direction. If you use a really small flow record interval, you’ll want to do some flow aggregation to ensure you would actually see packets from both directions within a given flow record. Note that argus creates bidirectional flow records by default. If your netflow system produces unidirectional flow records, you need to do a little more work, such as associating the two unidirectional flows and making sure both sides exist.
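
A quick and dirty way to flag candidate one-sided flows in argus data is to pull just the packet counts and look for zeros. This is a sketch that assumes ra prints one whitespace-separated record per line in the field order requested:

# Print flows where either direction has a zero packet count.
ra -r half.argus -n -s saddr daddr spkts dpkts | \
    awk '$3 == 0 || $4 == 0 { print "one-sided:", $0 }'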

You could also use tshark or tcpdump and observe that, for a given connection, you only see packets in one direction.
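
For example, tshark’s conversation statistics break frame and byte counts out per direction, so a simplex capture shows zeros one way (output omitted here):

# Per-conversation frame/byte counts in each direction.
tshark -r alice_half.pcap -q -z conv,tcp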

Vortex handles simplex network traffic in a straightforward, albeit somewhat lackluster manner--it just ignores it. Libnids, on which vortex is based, is designed to overcome NIDS TCP evasion techniques by exactly mirroring the functionality of a TCP stack, but it assumes full visibility (no packet loss) to do so. If it doesn’t see both sides of a TCP handshake, it won’t follow the stream because a full handshake hasn’t occurred. As such, running vortex on the half pcap is rather uneventful:


$ vortex -r alice_half.pcap -e -t half
Couldn't set capture thread priority!
VORTEX_ERRORS TOTAL: 0 IP_SIZE: 0 IP_FRAG: 0 IP_HDR: 0 IP_SRCRT: 0 TCP_LIMIT: 0 TCP_HDR: 0 TCP_QUE: 0 TCP_FLAGS: 0 UDP_ALL: 0 SCAN_ALL: 0 VTX_RING: 0 OTHER: 0
VORTEX_STATS PCAP_RECV: 0 PCAP_DROP: 0 VTX_BYTES: 0 VTX_EST: 0 VTX_WAIT: 0 VTX_CLOSE_TOT: 0 VTX_CLOSE: 0 VTX_LIMIT: 0 VTX_POLL: 0 VTX_TIMOUT: 0 VTX_IDLE: 0 VTX_RST: 0 VTX_EXIT: 0 VTX_BSF: 0


The most optimistic observer will point out that at least vortex makes it clear when you don’t have full duplex traffic--because you see nothing.

Conclusion


I hope the above is helpful to others who rely on passive network monitoring tools. I’ve discussed the two most prevalent tapping issues I’ve seen personally. One topic I’ve intentionally avoided, because it’s hard to discuss and debug, is interleaving of aggregated taps, especially issues with timing. Assume you do some amount of tap aggregation, especially aggregation of simplex flows, either using an external tap aggregator or bonded interfaces inside your network monitoring system. If enough buffering occurs, packets from the two simplex flows may be interleaved incorrectly; for example, a SYN-ACK may end up in front of the corresponding SYN. There are other subtle tapping issues, but the two discussed above are by far the most prevalent problems I’ve seen. Verifying or quantifying the loss in your tapping infrastructure even once is above and beyond what many organizations do. If you rely heavily on the validity of your data, consider doing it periodically or automatically so you detect any changes or failures.


Updated 02/21/2011: I need to clarify and correct the discussion about low volume packet loss. The point of this post was to talk about packet loss in tapping infrastructure--packets that are successfully transferred through the network but don’t make it to the passive monitoring equipment. This is actually pretty common with low-end tapping equipment, such as span ports on switches or routers. My intention was not to talk about the normal packet loss that occurs in networks, usually due to congestion. I messed up. I have two versions of the above script floating around--one that measures packets “missed” by the network monitor and one that measures total packets “lost” on the network. I used the wrong one above.

Let me explain further. When I say “missed” I mean packets that traversed the network being monitored but didn’t make it to the monitoring device, e.g. they were lost during tapping/capture. When I say “lost” packets, I mean packets that the monitoring device anticipated but didn’t see for whatever reason: they could have been dropped on the network (i.e. congestion) or dropped in the tapping/capture process. One really cool feature of tshark is that you can easily differentiate between the two. The tcp.analysis.ack_lost_segment filter matches all packets which ACK a packet (or packets) not seen in the packet trace. The official description is: “ACKed Lost Packet (This frame ACKs a lost segment)”. While your monitoring device didn’t see the ACK’d packets, the other endpoint in the communication presumably did, because it sent an ACK. The implication is that you can infer with strong confidence that the absent packets were actually transferred through the network but were “missed” by your capture. This feature of tshark is the best way I’ve found to identify packet loss occurring in passive network tapping devices or in network monitors that isn’t reported in the normal places on network sensors (pcap dropped, ifconfig dropped, ethtool -S). In normal networks with properly functioning passive monitoring devices, “ack_lost_segment” should be zero.

On the other hand, the mechanism I mistakenly demonstrated above calculates packets lost for any reason, usually either congestion on the network being monitored or deficiencies in the network monitoring equipment. The description of tcp.analysis.lost_segment is: “Previous Segment Lost (A segment before this one was lost from the capture)”. For the purposes of verifying the accuracy of your network monitoring equipment, any loss due to congestion is a red herring. While this mechanism certainly does report packets “missed” by your network monitoring equipment, it will also report those “lost” for any other reason. I keep this version of the script around to look at things like loss due to congestion. It may well be useful for passively studying where congestion loss is occurring, such as you might do if you are studying bufferbloat. In networks subject to normal congestion, “lost_segment” will generally be non-zero.
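
To make the distinction concrete, the two filters can be run side by side on the loss pcap from above; lost_segment flags a gap regardless of cause, while ack_lost_segment should only flag data the capture missed but the far end acknowledged (subject to the tshark version caveat in the 07/07/2011 update below):

# Fires for a gap in the capture from any cause (congestion or tap/capture loss).
tshark -r alice_loss.pcap -R tcp.analysis.lost_segment
# Fires only when an ACK covers data the capture never saw, i.e. monitor-side loss.
tshark -r alice_loss.pcap -R tcp.analysis.ack_lost_segment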

Please excuse this mistake. I try hard to keep my technical blog posts strictly correct, very often providing real examples.




Updated 07/07/2011: György Szaniszló has proposed a fix for Wireshark that ensures all “ack_lost_segment” instances are actually reported as such. In older versions of tshark, there were false negatives (instances where “ack_lost_segment” should have been reported but wasn’t) but no false positives (all reported instances of “ack_lost_segment” were correct). As such, with György’s fix, tshark should provide more accurate numbers in the event of loss in tapping infrastructure. The old versions of tshark are still useful for confirming that you have problems with your tapping infrastructure (I’ve had decent success with them), but clearly they are not as accurate for comprehensively quantifying all instances of loss in your taps. In his bug report, he does a great job explaining the different types of loss, which he terms “Network-side” and “Monitor-side”. He also provides an additional trace for testing.

5 comments:

  1. This is a great article, thanks!

    I spend a lot of time auditing the numbers along the entire entry/exit chain. Not only for the reasons you mentioned, but also because I can find a lot of issues that the networking guys are completely missing - like why are certain packets going through a device 4 times before leaving the network? Or traffic that routes in the front door and out a backdoor - and vice versa. Since we're usually the only guys that do these types of audits, I think it's important to expand the audit beyond just security concerns and help clean up network and system configuration issues.

    Cheers

  2. nice post...will be useful to many.

  3. Hi Charles,

    Very nice and useful article.

    When I tried to use it in practice, I found a bug in Wireshark: it did not calculate tcp.analysis.ack_lost_segment correctly. I have submitted a patch to Wireshark to correct this bug.

    https://bugs.wireshark.org/bugzilla/show_bug.cgi?id=6081

    In the future, you can use both tcp.analysis.lost_segment (it fires for almost all kinds of packet loss) and tcp.analysis.ack_lost_segment (it fires only for monitor-side packet loss).

    Best Regards,
    Gyorgy
    gyorgy.szaniszlo@ericsson.com

  4. Gyorgy,

    Your debugging, examples, and proposed fix are excellent. I know I'll benefit from this. Thanks for sharing.

    Charles
