Heartbleed: How Did It Escape Detection For So Long?
Jeremy Epstein, Senior Computer Scientist, SRI International
APR 10, 2014 17:50

The past few days have been chock-a-block with discussions of Heartbleed–what it is and how it works, and what the average user should do about it. (For the former, here is a particularly understandable explanation.) The answer to the latter: wait until sites are fixed, and THEN change your password. If it's a high-value site, changing your password before the patch has been installed and a new key/certificate has been generated is counterproductive, as it may make your new password more vulnerable rather than less. Ed Felten provides good advice both for the average user and for website operators on what they should do. Also, the New York Times has a good summary of the big picture.

I'm going to be (moderately) optimistic and suggest that within a week, major sites of all shapes and sizes (banks, e-shopping, government) will have installed the patches to their Web servers and generated new keys/certificates. (That's being optimistic–the realist in me says that there will be some sites that will take months to get patched, because the approval process for big corporations and government agencies is so cumbersome that they can't say "emergency override" and fix the problem quickly.)

But there's been relatively little discussion of two other topics: what types of sites are most vulnerable, and how this vulnerability escaped detection for two years, in one of the most security-critical pieces of software on the Internet.

Besides the major e-commerce, banking, and similar sites, there are three other classes of sites we should also be concerned about.

First, there are the medium-sized companies–too big to use an outsourced hosting provider that will automatically do the patching for them, but not big enough that they have a well-defined process for rolling out an emergency patch to production Web servers. A lot of e-commerce sites fit into this category–and these may well be the riskiest sites. Those using hosting providers–like the mom-and-pop pizza shop–may get upgraded by the provider, but probably won't know that they need to replace their certificates. Certificate Authorities should reach out to their customers to encourage them to get a replacement–but unless they offer significant discounts, that offer may fall on deaf ears.

Second, the products out there that aren't Web servers, but still use OpenSSL. There are lots of these sorts of products, and in many cases the organizations that use them have no idea that OpenSSL is buried deep inside–and the vendor itself may not be aware, since OpenSSL may be buried inside another library that is itself embedded in the product, or it may have been inserted by a programmer who left the company years ago. (We saw a scenario similar to the issue of embedded OpenSSL a few years ago when there was a serious vulnerability in a low-end Microsoft database product. We learned during the cleanup that many products had the Microsoft product embedded, but even the product developers did not know it was there.)

Third, and scariest, are the embedded devices. How many ATMs, manufacturing control devices, monitoring cameras, etc. use OpenSSL because vendors got burned when it came out that their communications were unencrypted? So the vendors did the "right" thing, embedded OpenSSL–and now perhaps made things even worse. True, these devices aren't likely to have a lot of passwords to be stolen from memory via the Heartbleed vulnerability, but there may be other sensitive information that can be retrieved.

Obviously there's some overlap between the second and third of these, but I separate them out because the second is fundamentally about "computers" in the traditional sense that are not running Web servers, and the third is about embedded devices that happen to be running Web servers.

As for how the vulnerability escaped detection for the past few years since it was (presumably) accidentally introduced, only time will tell for sure. While the attack isn't a buffer overrun attack, it's closely related to one–both rely on access to data that's not intended for use by the program.
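To make that relationship concrete, here is a deliberately simplified sketch in Python–hypothetical code of my own, not OpenSSL's actual C. Process memory is modeled as a single buffer holding the request payload followed by unrelated secrets, and the bug is trusting the length the peer claims for its own payload:

```python
# Simplified sketch of the Heartbleed logic (hypothetical, not OpenSSL code).
# "memory" models the process address space: the request payload followed
# by unrelated secrets that happen to sit next to it.

def heartbeat_vulnerable(memory: bytes, payload_len: int, claimed_len: int) -> bytes:
    # BUG: echo back 'claimed_len' bytes without ever comparing it to
    # 'payload_len' -- the extra bytes leak whatever memory is adjacent.
    return memory[:claimed_len]

def heartbeat_fixed(memory: bytes, payload_len: int, claimed_len: int) -> bytes:
    # Conceptual fix: silently-or-loudly discard any request whose claimed
    # length exceeds the payload actually received.
    if claimed_len > payload_len:
        raise ValueError("heartbeat length exceeds payload; request dropped")
    return memory[:claimed_len]

memory = b"PING" + b"secret-key-material"   # 4-byte payload + neighboring secrets
leak = heartbeat_vulnerable(memory, payload_len=4, claimed_len=23)
print(leak)  # b'PINGsecret-key-material' -- the secret escapes with the echo
```

Note that the sketch has to put the secrets in the same buffer to reproduce the effect: in a memory-safe language, a slice can never reach past the object it was given, whereas in C the over-read walks straight into neighboring allocations.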

In a sense, this problem is a "back-to-the-future" manifestation. The un-beloved Trusted Computer System Evaluation Criteria (TCSEC, or Orange Book) required that all systems, regardless of security level, include both analysis and testing for "object reuse"–the ability of a program to obtain memory allocated to a different program, or to a different classification level. While this vulnerability involves access to memory within the same program (indeed, within the same library), the lessons of object reuse still apply: programmers should be extraordinarily cautious with allocated memory–clearing it before access or as part of freeing it (or both)–and testers should look for signs of uncleared memory. For performance reasons, these lessons are too often forgotten in the rush to ship products without adequate testing.
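The object-reuse discipline amounts to scrubbing sensitive buffers before they are released, so stale secrets are simply not there for a later over-read to find. A minimal sketch, with names of my own invention (real C code would want something like explicit_bzero, which the compiler cannot optimize away):

```python
def scrub(buf: bytearray) -> None:
    # Overwrite sensitive data in place before the buffer is released.
    # After this, an over-read of the region yields only zeros, not secrets.
    for i in range(len(buf)):
        buf[i] = 0

key = bytearray(b"secret-key-material")
# ... use the key ...
scrub(key)          # clear as part of freeing, per the object-reuse lesson
print(bytes(key))   # all zero bytes; nothing left to leak
```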

There's a simple solution to this particular problem, and some similar vulnerabilities that we've seen in the past. For the past 50 years or so we've known that using type-safe languages that provide proper memory protection significantly reduces security risks, although they certainly can't solve all of the security problems. 

Unfortunately, eliminating a legacy of C/C++ code will take a generation or more, so this isn't a quick fix. We may be able to get partway in that direction using safer APIs and compiler enforcement of memory usage, even while we migrate to safer languages. While not directly applicable to OpenSSL, the OWASP Enterprise Security API may be a model for methods to reduce risk through APIs that reduce the amount of rope that C gives programmers to hang themselves.
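As a sketch of what such a rope-shortening API might look like–the class and method names here are hypothetical, not part of ESAPI–consider a read interface that validates every request against the buffer's real extent, so the caller cannot even express an over-read:

```python
class BoundedBuffer:
    """A read interface that refuses to hand out bytes beyond the region
    it was given -- the over-read of Heartbleed cannot be expressed here."""

    def __init__(self, data: bytes):
        self._data = data

    def read(self, offset: int, length: int) -> bytes:
        # Every request is checked against the buffer's actual extent;
        # an out-of-range request is an error, never a silent leak.
        if offset < 0 or length < 0 or offset + length > len(self._data):
            raise ValueError("read outside buffer bounds")
        return self._data[offset:offset + length]

buf = BoundedBuffer(b"PING")
print(buf.read(0, 4))  # b'PING' -- within bounds, so permitted
```

The design choice is that bounds checking lives in one audited place rather than being re-derived (and occasionally forgotten, as in Heartbleed) at every call site.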

But why didn't the automated source code analysis tools (such as Coverity Scan) find the problem? Or did they find it, but the alert was buried in piles of false positives and false negatives? It's too soon to answer that question, but it's something to watch for.

Some versions of OpenSSL have gone through the NIST FIPS 140 testing process. It doesn't appear that the certified versions have the vulnerability, but that's more by luck than by planning: the FIPS process focuses on the correctness of the cryptographic implementation, not on the resilience against attack of ancillary functions of the software, such as the heartbeat facility that contains the Heartbleed vulnerability.

Could dynamic tools have prevented or discovered this flaw?  Ironically, tools like IBM's Rational Purify Family probably would have stopped exploitation, but are typically not used in production systems because of the overhead (as well as for nontechnical reasons like license costs).  Whether the licensing costs would outweigh the cleanup costs now being faced by thousands of sites is a complex question, which perhaps will get more attention in the wake of Heartbleed.

In retrospect, there are tools that could have reduced the risk. But this isn't just retrospective–while there have now been two major SSL vulnerabilities found and fixed this year (Heartbleed and the Apple "goto fail" bug), it's foolish to believe that we've seen the last of the vulnerabilities. We should be using these tools and techniques going forward, not just on OpenSSL, but on all of our software.

Has every password and private key been stolen? That threat is probably overblown. But at the same time, we shouldn't draw the line too narrowly–there are a lot of things beyond just "Apache running OpenSSL" that need to be examined.

A key takeaway from this experience is that while open source is probably no worse than closed source (and potentially allows for faster patching), the "many eyes" theory that all bugs are shallow works only if the eyes are qualified and looking. And even if they are looking, the sheer size and complexity of the code can prevent qualified eyes from finding vulnerabilities. Open source isn't a panacea.

Jeremy Epstein

Senior Computer Scientist

SRI International

Associate Editor in Chief, IEEE Security & Privacy Magazine

[An earlier version of this piece appeared on Freedom-to-tinker.com]

