When used properly, encryption provides essentially unbreakable protection for sensitive data. Unfortunately, this is much easier to say than to do because of the problems associated with key management—all of the operations other than encrypting and decrypting that are needed to support the secure use of encryption.

Key management includes things such as how keys are created, stored, and destroyed. NIST’s Special Publication 800-57 Part 1, “Recommendation for Key Management, Part 1: General,” gives a good overview of what is needed to do key management securely. That document runs 160 pages, which should give you a sense of how hard and complicated key management is; it is far too much to cover here.

But one aspect of key management, key generation, deserves a closer look. Here’s why random keys are necessary and how to measure whether you are generating them well enough.

### The security of encryption

The security of encryption is determined by how much work an adversary needs to do to crack a key. That work depends on the particular encryption algorithm; it is not just the size of the key that matters. For a given key length, it is easier to crack an RSA key than an elliptic curve key, and easier to crack an elliptic curve key than an AES key.

That’s why standards such as SP 800-57 tell us that a 3,072-bit RSA key, a 256-bit elliptic curve key, and a 128-bit AES key all provide roughly the same level of security. RSA is not less secure than AES—it just takes a bigger key to get the same level of security.

But when standards quantify the work needed for an adversary to crack a key, they assume that the adversary does not know any shortcuts that would make the job easier. In particular, they assume that the adversary has no information about which keys are more or less likely to be used.

In other words, keys need to be random to be as secure as possible. If they are not generated randomly, then an attacker can use the lack of randomness to crack a key with less work than it would take to crack a random key, and that means that the non-random key is providing less security than a random alternative would.

This was the basis for the “export-grade” encryption that was required in the pre-dot-com era. Software that used 128-bit keys could be exported, but those 128-bit keys had to have a significant non-random part, leaving only 40 of the 128 bits unknown to an adversary. Back then, such weakened keys were required by government regulations, but if you use keys that are not generated randomly, you may be getting export-grade security without knowing it.

### Randomness, not entropy

In *Jacobellis v. Ohio*, Justice Potter Stewart noted that although it may be difficult to define “obscenity,” he knew it when he saw it. Robert Pirsig made a similar claim for “quality” in his book *Zen and the Art of Motorcycle Maintenance*. This sort of vague thinking might be good enough for lawyers and philosophers, but we need a more careful way to define randomness in our context. The good news is that it is easy to check whether a source of randomness is bad. The bad news is that it is much harder to check whether a source of randomness is good.

The most common metric used to quantify the randomness of a source of random bits is entropy. This is unfortunate, because entropy seems to be a poorly understood concept. In particular, data that is very random always has a high level of entropy, but data that has high entropy is not necessarily very random.

This is because entropy is stateless—it is a statistic that is calculated from a sequence of samples without any knowledge of what came before or what will come after a particular sample. If our samples are individual bits, then the sequence 0, 1, 0, 1, 0, 1, … will have the highest theoretical level of entropy possible (one bit of entropy per symbol), but it is clearly not random at all.

Similarly, if entropy is calculated based on samples being individual bytes, as the popular tool `ent` does, then the sequence 0x00, 0x01, 0x02, …, 0xff repeated many times will similarly have the highest theoretical level of entropy (eight bits of entropy per symbol), but it is also clearly not random.
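Both examples are easy to verify with a few lines of Python; the function below is a straightforward implementation of the standard Shannon entropy formula, computed per symbol over a sequence of samples:

```python
import math
from collections import Counter

def shannon_entropy(samples):
    """Shannon entropy in bits per symbol of a sequence of samples."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# The repeating byte pattern 0x00..0xff: maximal byte entropy (8 bits
# per byte), yet completely predictable.
pattern = bytes(range(256)) * 100
print(shannon_entropy(pattern))  # 8.0

# The alternating bit sequence 0, 1, 0, 1, ...: maximal bit entropy
# (1 bit per bit), also completely predictable.
bits = [0, 1] * 1000
print(shannon_entropy(bits))     # 1.0
```

Because the statistic only counts how often each symbol appears, it cannot see the order of the samples, which is exactly why these predictable sequences score perfectly.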

So just because we have a lot of entropy, we do not necessarily have much randomness. And because it is randomness, not entropy, that makes cryptographic keys strong, using entropy to quantify the goodness of a source of random numbers can be risky.

There are statistics other than entropy that we can use to make sure that data from a source of randomness will produce output that will be hard for an adversary to guess. NIST’s SP 800-90A Rev. 1, “Recommendation for Random Number Generation Using Deterministic Random Bit Generators,” specifies the use of min-entropy (sometimes known as “min entropy” or “minimum entropy”), a statistic that is closely related to entropy but does a better job of quantifying how hard it is for an adversary to guess a random value.

SP 800-90A even defines “entropy” to be what is technically known as min-entropy, so be particularly careful when reading this standard. And because min-entropy uses the same stateless model that entropy does, it has the same limitations—it is easy to construct simple examples of data that has high min-entropy but is clearly not random.

Fortunately, this limitation is well-understood, and NIST’s recent (January 2018) Special Publication 800-90B, “Recommendation for Entropy Sources Used for Random Bit Generation,” lists tests that can be used to make sure that your source of randomness is not trying to pull a fast one on you by taking advantage of the quirks of how entropy is defined and calculated.
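To give a flavor of what those tests look like, here is a sketch of the simplest of SP 800-90B’s continuous health tests, the repetition count test (Section 4.4.1). It flags a source whose output contains a run of identical samples so long that it is implausible given the source’s claimed min-entropy; the cutoff below assumes the standard’s false-positive probability of 2⁻²⁰:

```python
import math

def repetition_count_test(samples, h_min, alpha_exp=20):
    """Sketch of the SP 800-90B repetition count health test: fail if
    any value repeats consecutively at least cutoff times, where the
    cutoff depends on the claimed min-entropy h_min (bits/sample) and
    a false-positive probability of 2**-alpha_exp."""
    cutoff = 1 + math.ceil(alpha_exp / h_min)
    run = 1
    for prev, cur in zip(samples, samples[1:]):
        run = run + 1 if cur == prev else 1
        if run >= cutoff:
            return False  # implausibly long run: likely source failure
    return True

# For a source claiming 8 bits/sample, the cutoff is 1 + ceil(20/8) = 4,
# so four identical samples in a row trigger a failure.
print(repetition_count_test(list(range(10)) * 50, 8.0))  # True
print(repetition_count_test([7] * 10, 8.0))              # False
```

A real entropy source would run this continuously over its raw output, alongside the standard’s other health test (the adaptive proportion test) and its more thorough entropy estimators.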

### How reasonably random should your numbers be?

So if you follow the latest standards, you should be able to get reasonable assurance that any source of randomness you use to generate cryptographic keys is in fact reasonably random. This is not as simple and easy as you might hope, but it is definitely much better than just relying on entropy estimates. Export-grade encryption might have been acceptable back in the ’90s; it is probably not something that you want to use today. Not even unintentionally.
