Monday, August 5, 2013

Intelligence Compliance: Checklist Mentality For Cyber Intelligence

The past few years have seen a sharp increase in the amount of targeted attack intelligence shared and the number of venues used for sharing. There is a geeky fervor emerging in the community rooted in the premise that if we accelerate sharing and consumption of intelligence, even to the point of automation, we could significantly improve network defense. Momentum is snowballing for adoption of specific standards for intel sharing, the foremost of which is the MITRE suite of STIX/TAXII/MAEC. There is a seemingly self-evident necessity to share more intel, share it wider, and share it faster.

As someone who seeks to apply technology to incident response, I see great promise in standards, technology, and investments to accelerate the distribution and application of threat intelligence. I’ve spent years developing capabilities to perform security intelligence and have seen first-hand the benefits of information sharing. The amount of security data exchanged will likely continue to grow. However, I have concerns about the shift from security intelligence to intelligence compliance and about the fundamental benefit of comprehensive distribution of threat intelligence.

Compliance Supplanting Analysis

Information sharing has been critical to the success of intelligence based network defense. Direct competitors from various industries have come together to battle the common problem of espionage threats. Threat intelligence sharing has been wildly successful as sharing has included relevant context, has been timely, and as the attackers have been common to those sharing. Years ago, much of this sharing was informal, based primarily in direct analyst to analyst collaboration. Over time, formal intel sharing arrangements have evolved and are proliferating today, increasing in count of sharing venues, the number of participants, and the volume of intel.

The primary concern I have with this increase in intel sharing is that it is often accompanied by a compliance mindset. If there’s anything we should have learned from targeted attacks, it is that compliance based approaches will not stop highly motivated attacks. It’s inevitable that conformance will fail, given enough attacker effort. For example, targeted attackers frequently have access to zero-day exploits that aren’t even on the red/yellow/green vulnerability metric charts, let alone affected by our progress in speeding patching from months to days. The reactive approach to incident response is focused primarily on preventing known attacks. As a community, we have developed intelligence driven network defense to remedy this situation. It allows us to get ahead of individual attacks by keeping tabs on persistent attackers, styled campaigns in the vernacular, in addition to proper vulnerability focused approaches. The beauty of intelligence driven incident response is that it gives some degree of assurance that if you have good intelligence on an attack group, you will detect/mitigate subsequent attacks if they maintain some of the patterns they have exhibited in the past. This may seem like a limited guarantee, and indeed it is narrow, but it’s the most effective way to defeat APT. Intelligence compliance, on the other hand, promises completeness in dispatching all documented and shared attacks, but it makes no promise for previously unknown attacks.

To explain in detail, the point of kill chain and associated analysis isn’t merely to extract a prescribed set of data to be used as mitigation fodder, but to identify multiple durable indicators to form a campaign. This has been succinctly explained by my colleague, Mike Cloppert, in his blog post on defining campaigns. The persistence of these indicators serves not only as the method of aggregating individual attacks into campaigns, but the presence of these consistencies is the substance of the assurance that you can reliably defeat a given campaign. By definition, individual attacks from the same campaign have persistent attributes. If you have a few of these (3 seems to be the magic number) across multiple phases of the kill chain, you have good assurance that you will mitigate subsequent attacks, even if one of these previously static attributes changes. If you can’t identify or your IDS can’t detect enough of these attributes, security intelligence dictates that you either dig deeper to identify these and/or upgrade your detection mechanisms to support these durable indicators. Ergo, defense driven by intelligence dictates you do analysis to identify persistent attack attributes and build your counter-measures around these.
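
The rule of thumb above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anyone's production tooling: the Indicator record, the phase labels, the example values, and the thresholds (three shared indicators spread over at least two kill chain phases) are all hypothetical choices made to mirror the paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    phase: str   # kill chain phase where the attribute is observed
    kind: str    # e.g. "doc_author", "c2_domain" (illustrative names)
    value: str


def campaign_match(attack, campaign, min_indicators=3, min_phases=2):
    """Return True if the attack shares enough durable indicators with
    the campaign, spread over enough kill chain phases."""
    shared = set(attack) & set(campaign)
    phases = {ind.phase for ind in shared}
    return len(shared) >= min_indicators and len(phases) >= min_phases


# Durable indicators previously attributed to one made-up campaign.
campaign = {
    Indicator("weaponization", "doc_author", "jsmith"),
    Indicator("delivery", "mailer_header", "X-Mailer: FooMail 1.2"),
    Indicator("c2", "c2_domain", "update.example.net"),
    Indicator("c2", "user_agent", "Mozilla/4.0 (compatible; FooBot)"),
}

# A later attack in which the attacker changed the C2 domain only.
new_attack = {
    Indicator("weaponization", "doc_author", "jsmith"),
    Indicator("delivery", "mailer_header", "X-Mailer: FooMail 1.2"),
    Indicator("c2", "c2_domain", "cdn.example.org"),
    Indicator("c2", "user_agent", "Mozilla/4.0 (compatible; FooBot)"),
}

matched = campaign_match(new_attack, campaign)
```

In this sketch the new attack still matches because three attributes in three different phases persist, even though the previously static C2 domain changed. An attack that shared only a single indicator would not clear the threshold.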

Intelligence compliance, on the other hand, provides no similar rational basis for preparation against previously unseen attacks. Surely, a compliance focused approach has some appeal. It is often viewed as seeking to ensure consistency in an activity that is highly valuable for security intelligence. In other cases, less critical drivers overshadow primary mission success. Intel compliance provides a highly structured process that can be easily metered, which is very important in bureaucratic environments. The one guarantee that intelligence compliance does give is that you have the same defenses, at least those based on common intelligence, as everyone else in your sharing venue. This is important when covering your bases is the primary driver. This provides no guarantee about new attacks or the actual security of your data, but does allow you to ensure that known attacks are mitigated, which is arguably most important for many forms of liability. Lastly, giving, taking, or middle manning intel can all be used as chips in political games ranging from true synergies to contrived intrigues. Intelligence compliance provides a repeatable and measurable process which caters to managerial ease and is also able to be aligned with legal and political initiatives.

There are limitless ways in which intelligence compliance can go wrong. Most failings can be categorized as either supplanting more effective activities or as a shortcoming in the mode of intelligence sharing which reduces its value. It is also fair to question if perfect and universal intelligence compliance would even be effective. Remember, intelligence compliance usually isn’t adopted on technical merits. The best, if not naïve, argument for adoption of this compliance mindset is that intel sharing has been useful for security intelligence, hence, by extrapolation, increasing threat data sharing must be better. Sadly, the rationale for a compliance approach to intel sharing frequently digresses to placing short-sighted blame avoidance in front of long term results.

The primary way in which intel compliance goes awry is when it displaces the capacity for security intelligence. If analysts spend more time on superfluous processing of external information than doing actual intelligence analysis, you’ve got serious means/ends inversion problems. Unfortunately, it’s often easier to process another batch of external intel than to dig deeper on a campaign to discover resilient indicators. This can be compared to egregious failings in software vulnerability remediation where more resources are spent on digital paper pushing, such as layers of statistics on compliance rates or elaborate exception justification and approval documentation, than is expended actually patching systems. An important observation is that the most easily shared and mitigated indicators, say IP addresses (firewalls) or email addresses (email filters), are also easily modified by attackers. For that reason, some of the most commonly exchanged indicators are also the most ephemeral, although this does depend on the specific attack group. If an indicator isn’t reused by an attacker, then sharing is useful for detecting previous attacks (hopefully before severe impact) but doesn’t prevent new attacks. A focus on short-lived intel can result in a whack-a-mole cycle that saps resources cleaning up compromises. This vicious cycle is taken to a meta level when human intensive processes are used despite increased intel sharing volume, putting organizations too far behind to make technological and process improvements that would facilitate higher intel processing efficiency. This plays into the attacker’s hand. This is exactly the scenario that security intelligence, including kill chain analysis, disrupts, allowing defenders to turn an attacker’s persistence into a defensive advantage.

Another intel sharing tragedy occurs when attack data is exchanged as though it is actionable intelligence and yet it’s no more than raw attack data. There are many who would advocate extracting a consistent set of metadata from attacks and regurgitating that as intelligence for sharing to their peers. I’m all for sharing raw attack data, but it must be analyzed to produce useful intelligence. If the original source doesn’t do any vetting of the attack data and shares a condensed subset, then the receiver will be forced to vet the data to see if it’s useful for countermeasures, but often with less context. The canonical example which illustrates the difference between raw attack data and intelligence is the IP address of an email server that sent a targeted malicious email, where that server is part of a common webmail service. Clearly this is valid attack data, but it’s about as specific to the targeted attacker as the license plate number of a vehicle the terrorist once rode in is to that terrorist, given that vehicle was a public bus. Some part of this vetting can and should be automated, but effective human adjudication is usually still necessary. Furthermore, many of the centralized clearinghouses don’t have the requisite data to adequately vet these indicators, so they are happily brokered to all consumers. To make matters more difficult, different organizations are willing to accept different types of collateral mitigation, the easiest solution being to devolve into the common denominator for the community which is a subset of the actionable intelligence for each organization. For example, given an otherwise legitimate website that is temporarily compromised to host malicious content, some members of a community may be able to block the whole website while others may not be able to accept the business impact. 
The easiest solution is for the community to reject the overly broad mitigation causing collateral impact, while the optimal use of the intelligence requires risk assessment by each organization.
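
The automatable part of the vetting described above might look something like the following Python sketch. It is a minimal illustration under assumptions: the SHARED_INFRA list, the vet_ip_indicator function, and the result labels are hypothetical, and the address ranges are reserved documentation ranges standing in for a real webmail provider.

```python
import ipaddress

# Hypothetical list of shared-infrastructure ranges (e.g. webmail relays).
# 203.0.113.0/24 is a reserved documentation range used as a stand-in.
SHARED_INFRA = [ipaddress.ip_network("203.0.113.0/24")]


def vet_ip_indicator(ip):
    """Do a first-pass, automated vetting of a raw IP indicator."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in SHARED_INFRA):
        # Valid attack data, but not specific to the attacker:
        # keep it as raw data, do not push it out as a block rule.
        return "raw-data-only"
    # Everything else still goes to a human for adjudication.
    return "analyst-review"


webmail_relay = vet_ip_indicator("203.0.113.25")
other_host = vet_ip_indicator("198.51.100.7")
```

The sketch deliberately never returns "block". Whether a given mitigation is acceptable depends on the collateral impact each organization is willing to take, which is the risk assessment that a shared lowest common denominator cannot make for you.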

While ambiguity between raw attack data and vetted intelligence is the most obscene operationally, because it can result in blocking benign user activity, there are other issues related to incomplete context on so-called intelligence. An important aspect of security intelligence is proper prioritization. For example, many defenders invest significantly more analyst time in espionage threats while rapidly dispatching commodity crimeware. If this context is not provided, improper triage might result in wasted resources. Ideally, this would include not only a categorization of the class of threat, but the actual threat identity, i.e. campaign name. Similarly, often intelligence is devoid of useful context such as whether the IP address reported is used as the source of attacks against vulnerable web servers, for email delivery, or for command and control. This can lead to imprecise or inappropriate countermeasures. Poorly vetted or ambiguous intel is analogous to (and sometimes turns into) the noisy signatures in your IDS: they waste time and desensitize.
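
As a rough illustration of the context discussed above, a shared indicator could carry its threat class, campaign name, and role alongside the raw value. The following Python sketch is an assumption-laden example, not a proposed schema: the SharedIndicator fields, the role labels, and the CONTROL_FOR_ROLE mapping are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SharedIndicator:
    value: str
    threat_class: str        # e.g. "espionage" or "crimeware"
    campaign: Optional[str]  # campaign name, if the sharer knows it
    role: str                # "web_attack_source", "email_delivery", "c2"


# Hypothetical mapping from indicator role to where a countermeasure belongs.
CONTROL_FOR_ROLE = {
    "web_attack_source": "inbound web server filtering",
    "email_delivery": "mail gateway filtering",
    "c2": "outbound proxy/firewall blocking",
}


def triage(ind):
    """Return (priority, control) for an indicator based on its context."""
    priority = "high" if ind.threat_class == "espionage" else "routine"
    # Without a known role, the indicator needs a human to decide.
    control = CONTROL_FOR_ROLE.get(ind.role, "analyst review")
    return priority, control


example = SharedIndicator("198.51.100.7", "espionage", "ExampleCampaign", "c2")
decision = triage(example)
```

The same IP address with no role attached falls through to analyst review, which is the wasted effort the paragraph describes when context is stripped before sharing.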

With all that being said, I’m an advocate of intel sharing. I’m an advocate of sharing raw attack data, which is useful for those with the capacity to extract the useful intelligence. Realizing that this isn’t scalable, I’m also an advocate of sharing well vetted intelligence, with the requisite context to be actionable. However, even if your shop doesn’t have the ability to process raw attack data at a high volume, sharing that data with those who can ostensibly results in sharing of intel back to you that you couldn’t synthesize yourself. My main concern with intelligence compliance is that it robs time and resources from security intelligence while providing no guarantee of efficacy.

Intelligence Race to Zero

Beyond supplanting security intelligence, my other concern with the increase in information sharing is that as we become more proficient at ubiquitous sharing, the value of the intelligence will be diminished. This will occur whether the intelligence is revealed directly to the attackers or if lack of success causes them to evolve. Either way, I question if intelligence applied universally for counter-measures can ever be truly effective. Almost all current intelligence sharing venues at least give lip service to confidentiality of the shared intelligence and I believe many communities do a pretty good job. However, as intel is shared more widely, it is increasingly likely that the intel will be directly leaked to attackers. This principle drives the strict limitations on distribution normally associated with high value intelligence. It is also generally accepted that it’s impractical to keep data that is widely distributed for enforcement secure from attackers. The vast majority of widely deployed mitigations, such as AV signatures and spam RBLs are accepted to be available to attackers. In this scenario, you engage in an intelligence race, trying to speed the use of commodity intel. This is the antithesis of security intelligence which seeks to mitigate whole campaigns with advanced persistent intelligence.

Note that even if your raw intel isn’t exposed to attackers, the effects are known to the attacker: their attacks are no longer successful. Professional spooks have always faced the dilemma of leveraging intelligence versus protecting sources and methods. If some intelligence is used, especially intelligence from exclusive sources, then the source is compromised. As an example, the capability to decrypt Axis messages during WWII was jealously protected. The tale that Churchill sacrificed a whole city to German bombers is hyperbole, but it certainly is representative of the type of trade-offs that must be made when protecting sources and methods. Note that this necessity to protect intel affects its use through the whole process, not merely the decision for use at the end. For example, if two pieces of information are collected that when combined would solidify actionable intelligence, but these are compartmentalized, then the dots will never be connected and the actionable intelligence will never be produced. We see this play out in so-called failures of the counter-terrorism intelligence community, where conspiracy theorists ascribe the failings to malice but the real cause is, more often than not, hindrances to analysis.

It’s worth considering how sources and methods apply specifically to network defense. Generally, I believe there is a small subset of intelligence that can be obtained solely through highly sensitive sources that is also useful for network defense. In most cases, if you can use an indicator for counter-measures, you can also derive it yourself, because it must be visible to the end defender. Also, while some sources may be highly sensitive, the same or similar information about attack activity (not attribution), is often available through open sources or through attack activity itself. Obviously, this notion isn’t absolutely true, but I believe it to be the norm. As a counter-example, imagine that a super sensitive source reveals that an attacker has added a drive by exploit to an otherwise legitimate website frequented by the intended victim audience. In this example the intel is still hard to leverage and relatively ephemeral: one still has to operationalize this knowledge in a very short time frame and this knowledge is by definition related specifically to this single attack.

Resting on the qualitative argument of indicator existentialism, the vast majority of counter-measures can be derived from attacker activity visible to the end network defender. This is necessarily true of the most durable indicators. Therefore, I don’t consider protecting sources (for network defense) the biggest consideration and advocate wide sharing of raw attack data. However, that certainly doesn’t mean that the analysis techniques and countermeasure capabilities are not sensitive. Indeed, most of my work in incident response has been about facilitating deeper analysis of network data, allowing durable and actionable intelligence to be created and leveraged. Competitive advantage in this realm has typically been found by either looking deeper or being more clever. In a spear phishing attack, for example, this may be in analysis of obscure email header data or malicious document metadata or weaponization techniques. Often the actionable intelligence is an atomic indicator, say a document author value, which could presumably be changed by the attacker if known. Some may require more sophistication on the part of the defender: requiring novel detection algorithms, correlations, or computational techniques such as that which my pdfrate performs. Either way, the doctrine of security intelligence is based in the premise that persistent indicators can be found for persistent attackers, even if it requires significant analysis to elucidate them. This analysis to identify reliable counter-measures is what security intelligence dictates and is often the opportunity cost of intelligence compliance. I’ve seen some strong indicators continue to be static for years, allowing new attacks to be mitigated despite other major changes in the attacks.

It is my belief, backed by personal experience and anecdotal evidence, that when an indicator which would be a strong mitigation if kept secret is instead used widely, the lifespan of that indicator is decreased. In the end I’m not sure it matters too much if the intelligence is directly revealed or if the attackers are forced to evolve due to lack of success, but that probably affects how quickly attackers change. In my experience, it is true that the greater the sophistication on the part of the defender and the greater the technical requirements for security systems, the less likely useful indicators are to be subverted. However, it’s possible that continued efficacy has more to do with narrow application due to the small number of organizations able to implement the mitigation rather than difficulty of attackers changing their tactics. Often I wonder if, like outrunning the proverbial bear, today’s success in beating persistent adversaries may be more about being better than other potential victims than actually directly beating the adversary. While intelligence driven security, and by extension information sharing, is much more effective than classic incident response, I think it is yet to be proven if ubiquitous intel sharing can actually get ahead of targeted attacks or if attackers will win the ensuing intelligence/evolution race.

One benefit of the still far from fully standardized information sharing and defense systems of today is diversity. Each organization has their own special techniques for incident prevention--their own secret sauce for persistent threats. It’s inevitable that intelligence gaps will occur and some attacks, at least new ones, will not be stopped as early as desired. The diversity of exquisite detections among organizations combined with attack information sharing, even that of one-off indicators, allows for a better chance of response to novel attacks, even if this response is sub-optimal. A trend to greater standardization of intelligence sharing, driven by compliance, will tend to remove this diversity over time, as analysts, systems, and processes will be geared to greater intel volume and lower latency at the expense of intelligence resiliency.

Long Road Ahead

While I’m primarily concerned about being let down when we get there, it’s also important to note that as a community, we have a long pilgrimage before we make it to the ubiquitous intelligence sharing promised land. MITRE’s STIX et al are widely being accepted across the community as the path forward, which is great progress. Now that the high level information exchange format and transport are agreed upon, we still have a lot of minutiae to work out. For example, much of the actual schema for sharing is still wide open: many indicator types still have to be defined, standards for normalization and presentation still need to be agreed upon, and the fundamental meaning of the indicators still needs to be agreed upon across the community.

I think it’s instructive to compare the STIX suite to the CVE/CVSS/CWE/CWSS/OVAL suite, even though the comparison is not perfect. These initiatives were designed to drive standardization and automation and to reduce the latency of closing vulnerabilities. There is a plethora of information tracked through these initiatives: from (machine readable) lists of vulnerable products, to the taxonomy of the vulnerability type, to relatively complicated ratings of the vulnerability. Despite this wealth of information, I don’t think we’ve achieved the vulnerability assessment, reporting, and remediation nirvana these mechanisms were built to support. Of all the information exchanged through these initiatives, probably the most important, and admittedly the most mundane, is the standardized CVE identifier, which the community uses to reference a given vulnerability. This is one area where current sharing communities can improve: standardized identifiers for malware families, attack groups, and attacker techniques. While many groups have these defined informally, more structured and consistent definitions would be helpful to the community, especially as indicators are tied to these names to provide useful context to the indicators (and provide objective definitions of the named associations). Community agreement on these identifiers is more difficult than the same for vulnerabilities, and building the lexicon for translations between sharing communities is also necessary, as defining these labels is less straightforward and occurs on a per community basis. As we better define these intelligence groupings and use them more consistently in intel sharing, we’ll have more context for proper prioritization, help ensure both better vetted intel and more clear campaign definitions, and have better assurances that our intelligence is providing the value we aim to achieve out of sharing.

In helping assess the effectiveness of information sharing, I think the following questions are useful:
  • How relevant to your organization is the shared intelligence?
  • Is the intelligence shared with enough context for appropriate prioritization and use?
  • How well is actionable intelligence vetted? Is raw attack data shared (for those who want it)?
  • How durable is the shared intelligence? How long does it remain useful?
  • How timely is the shared intel? Is it only useful for detecting historical activity or does it also allow you to mitigate future activity?
  • Do you invest in defensive capabilities based on shared intelligence and intelligence gaps?
  • Do you have metrics which indicate your most effective intelligence sources, including internal analysis?
  • Do you have technology that speeds the mundane portion of intelligence processing, reserving human resources for analysis?
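
Several of these questions, durability and per-source effectiveness in particular, can be approached with simple bookkeeping. The Python sketch below is one hypothetical way to do it, assuming you log which source supplied each indicator and when that indicator fired; the source names, indicator strings, dates, and the source_metrics function are all made up for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical detection log: (source, indicator, date the indicator fired).
hits = [
    ("internal-analysis", "doc_author=jsmith", date(2012, 3, 1)),
    ("internal-analysis", "doc_author=jsmith", date(2013, 6, 15)),
    ("sharing-venue-a", "ip=198.51.100.7", date(2013, 1, 10)),
    ("sharing-venue-a", "ip=198.51.100.7", date(2013, 1, 12)),
]


def source_metrics(hits):
    """Per source: number of hits and the longest indicator lifespan in days."""
    first_last = {}
    counts = defaultdict(int)
    for source, indicator, day in hits:
        counts[source] += 1
        key = (source, indicator)
        first, last = first_last.get(key, (day, day))
        first_last[key] = (min(first, day), max(last, day))
    lifespan = defaultdict(int)
    for (source, _), (first, last) in first_last.items():
        lifespan[source] = max(lifespan[source], (last - first).days)
    return {
        s: {"hits": counts[s], "max_lifespan_days": lifespan[s]}
        for s in counts
    }


metrics = source_metrics(hits)
```

In this invented data both sources produce the same number of hits, but the internally derived indicator stays useful for far longer than the shared IP address. Hit counts alone would hide that difference, which is why durability deserves its own measure.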

Closing Argument

I’m sure that there are some who will argue that it’s possible to have both security intelligence and intelligence compliance. I must concede that it is possible in theory. However, as there is plenty of room for progress in both arenas, and resources are scarce, I don’t believe there is an incident response shop that claims to do either to the fullest degree possible, let alone both. Also, the two mindsets, analysis and compliance, are very much different and come with a vastly different culture. Most organizations are left with a choice—to focus on analysis and security intelligence or to choose box checking and information sharing compliance.

Similarly, I question the seemingly self-evident supposition that sharing security data ubiquitously and instantaneously will defeat targeted attacks. While there will almost certainly be some raising of the bar, causing some less sophisticated threats to be ineffective, we’ll also likely see an escalation among effective groups. This will force a relatively expensive increase in speed and volume of intel sharing among defenders while attackers modify their tactics to make sharing less effective.

As we move forward with increased computer security intelligence sharing, we can’t let the compliance mindset and information sharing processes become ends unto themselves. Up until the time when our adversaries cease to be highly motivated and focused on long term, real world success, let’s not abandon the analyst mindset and security intelligence which have been the hallmark of the best incident responders.

1 comment:

  1. Quite an article!

    I see 2 distinct areas in practice, which I call "Poets vs. Scientists":

    One is "cyber intel operations", which deals in ingestion, analysis, sensor enrichment, etc. These are the Scientists and the work is primarily reactive.

    The Poets or "Threat Intelligence", deal with managing sources, helping the BUs develop IR/PIR and tasking sources to meet those, plus the standard suite of scheduled/finished products. This work is primarily proactive.

    My thoughts on automation:
    0. You don't need a million indicators until you can fully consume one. Build processes first.
    1. You can't automate intelligence, because indicators are NOT intel. They are, however, the currency of threat intel.
    2. Sharing is beneficial, but my intelligence is still just information to you. Local analysis is key.
    3. Speed does matter. Intelligence has a "half life" and the rate of decay is directly proportional to its precision.

    Keep the articles coming. I'm glad to see some discussions on this.