This blog is also available on Gemini (What's Gemini?)

Perl Encryption Primer: The Importance of Randomness


I'll be doing a presentation for MadMongers in April on encryption in Perl, and I'm going to try something a little different. This series of blog entries will lay out a written primer on encryption in Perl, which will then form a condensed outline for the presentation. I feel the final presentation should have high-quality information, so hopefully anything I get wrong can be pointed out and corrected before then.

We'll start with the most fundamental building block of encryption, which is random numbers. Now, a naive new programmer might jump right in and start calling `rand()`:

$ perl -E 'say int rand(10_000)' 4881

The `int` truncates the decimals; it would otherwise return a float. The parameter to `rand()` tells it the maximum number, which is 1 by default. The code above will return a number between 0 and 10,000.

Here, we already hit two different hidden snags:

1. Random numbers are surprisingly difficult to do right 2. Computers are deterministic machines. You can't get much randomness out of determinism.

The `rand()` function starts with a seed. You can put in your own seed with `srand()`, or you can start out with the seed Perl gives you by default. The means Perl uses to get that initial seed is actually quite complicated; look at util.c[1] and search for the function "Perl_seed" for the gory details.

Notably, if we put in our own seed, call `rand()` a few times, and then do it again, we'll get the same output:

$ perl -E 'srand(1234); say int rand(10_000) for 1..4; say "RESEED"; srand(1234); say int rand(10_000) for 1..4' 7408 2145 3381 3231 RESEED 7408 2145 3381 3231

Same pattern each time! This means that an attacker doesn't need to know the output of `rand()`. They just need to know what `srand()` input you started with. Let's say you give `srand()` a 32-bit input. You then use that generate a 128-bit key for a cipher. This means that your cipher doesn't have 128-bits of security; the attacker only needs to brute force the 32-bits that you put into `srand()`, which is a far easier task.

This brings us to an important consideration in security, which is building things in chains versus building things in layers. A chain is only as secure as its weakest link. In the above, the 128-bit block cipher only had 32-bits of security, because `srand()` was part of its chain.

Building in layers is more like a castle. Breaching the outer wall doesn't mean you've breached the inner walls. Breaching the inner walls still leaves you the Keep to deal with. The Keep may also be built of narrow, maze-like hallways with the defenders attacking from above. This is defense-in-depth. We won't be explicitly covering building in layers in this post, but it's good to keep this in mind when building security systems.

For this and other reasons, `rand()` is not the right choice for a cryptographic-strength random number generator.

There are random number generators that can take a longer seed, but this leads to another question: where did you get the seed from? If it's a timestamp, the attacker may be able to make a rough guess of the time period when you first generated that key. That could easily end up with even less than 232 possibilities (as in the case of 32-bits of randomness). Whatever else you use, can you be confident that an attacker can't narrow down the input seed?

We end up needing to be really careful about where we get our random numbers. Since computers are such deterministic machines, we need a source that is somehow external to the computer.

Preferably, all computers would come with a special piece of hardware. Intel and AMD both used to put one in their chipsets (i8xx and 76x-based motherboards, respectively), but these fell into disuse. The VIA PadLock Security Engine[2] also has a hardware RNG on their CPUs. Intel has also returned the hardware RNGs with RdRand[3] on IA-64 CPUs.

If you don't have a specific CPU, then there are also ways to add hardware for generating RNGs. LavaRnd[4] was a project that used a web cam in a dark environment to pick up on cosmic background radiation, which can be used for randomness. Sadly, the project hasn't been updated in quite some time. There are a number of other hardware RNGs to purchase[5], ranging from cheap to very expensive. I would urge careful research before buying any of them.

If we can't buy hardware, then there's an option to take randomness from the world around the computer. Consider that there is a slight variation between the timing of your keystrokes and mouse movements. By having the OS kernel listen in on this data, a random number generator can use this as input. This and a few other things is what the Linux `/dev/random` driver does. Similar techniques are used on Windows, Mac, and most everything else.

One problem with this approach is headless servers. Since they rarely have keyboard or mouse in use, the few remaining sources might not be enough. `/dev/random` will block until it gets more random numbers. `/dev/urandom` will continue onward using lower-quality sources, which isn't what we want, either. This is where proper hardware random number generators come into play.

No matter what OS you use, Crypt::Random[6] is the Perl module to use. This will use `/dev/random` on systems that have it, and the userspace Entropy Gathering Daemon on systems that don't. It's easy to use:

$ perl -MCrypt::Random=makerandom -E 'say makerandom( Size => 32, Strength => 1 )' 2928315689

The `Size` parameter is the number of random bits to generate, and `Strength` set to 1 says that we won't fallback to less-secure alternative.

Next post will be on One Time Pads, which is an encryption algorithm so good that we can never use it.