A home-brew 18 MBit/s True Random Number Generator based on Thermal Noise

This page describes how to build a high-bandwidth True Random Number Generator (TRNG) from readily available parts. The significance of this project lies in the high data rate. Generating a One-Time Pad of 1 GByte with this TRNG takes less than 10 minutes. You can burn the data on CD-ROM and share it with your friends.

Why Bandwidth matters

The only crypto system with provable security is the one-time pad. It is often described as impractical because it requires the same amount of key material as there are data to transmit. However, with the high-volume and small-sized storage media available today (you can purchase 16 GByte micro SD cards off the shelf), key storage size is not a real problem for many conceivable applications of one-time pad encryption. While video streaming would still be somewhat demanding, a 1 GByte one-time pad is sufficient to encrypt 17 hours' worth of an ISDN quality two-way voice link. For text messaging like regular e-mail, you would virtually never exhaust a one-time pad of that size. The only disadvantage in comparison with asymmetric methods is that you need to securely deliver the one-time pad, e.g. by meeting in person or by courier.
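
As an illustration of the scheme discussed above, here is a minimal Python sketch of one-time pad encryption by XOR, together with the arithmetic behind the 17 hours figure. It is not part of the project software. The function name otp_xor is made up for this example, the pad is a stand-in generated with the secrets module, and the voice rate of 64 kbit/s per direction is my assumption for "ISDN quality".

```python
import secrets

def otp_xor(data: bytes, pad: bytes) -> bytes:
    """XOR data with the first len(data) pad bytes; the same call decrypts."""
    if len(pad) < len(data):
        raise ValueError("pad exhausted: need one pad byte per data byte")
    return bytes(d ^ p for d, p in zip(data, pad))

# Stand-in pad for this demo only; a real pad would be read from the TRNG output.
pad = secrets.token_bytes(32)
message = b"meet at noon"
ciphertext = otp_xor(message, pad)
recovered = otp_xor(ciphertext, pad)

# Pad lifetime for a two-way voice link.
PAD_BYTES = 10**9                        # 1 GByte, decimal
BYTES_PER_SECOND = 2 * 64_000 // 8       # assumed: 64 kbit/s in each direction
hours = PAD_BYTES / BYTES_PER_SECOND / 3600
print(f"pad lasts about {hours:.1f} hours")
```

Each pad byte must be used only once, so a real application has to keep track of how much of the pad has already been consumed on both ends.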

The real problem is not the storage of the key material - it is the generation of the truly random data. With a moderately fast randomness source of, say, 100 kBit/s, it will take you about a day to produce 1 GByte. With the TRNG described here, you can do the same in less than 10 minutes - which makes one-time pad generation much more practical.
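
The comparison above is simple arithmetic. The following Python sketch reproduces it, assuming that 1 GByte means 2^30 bytes (as in the dd example further below) and that kBit and MBit are decimal units. The function name is only illustrative.

```python
def seconds_to_generate(n_bytes: int, bits_per_second: float) -> float:
    """Time needed to produce n_bytes at a given net bit rate."""
    return n_bytes * 8 / bits_per_second

ONE_GBYTE = 2**30  # assumption: binary gigabyte

slow = seconds_to_generate(ONE_GBYTE, 100e3)  # 100 kBit/s source
fast = seconds_to_generate(ONE_GBYTE, 18e6)   # 18 MBit/s TRNG

print(f"100 kBit/s: {slow / 3600:.1f} hours")
print(f"18 MBit/s:  {fast / 60:.1f} minutes")
```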

The Randomness Source

Low-noise satellite LNBs are widely available at low cost. You may get one with a noise figure of 0.3 dB for 15 EUR. The one used here has a noise figure of 0.5 dB, which translates to a noise temperature of 35 K. This means that the output signal will be dominated by thermal noise when the LNB is pointed at a surface at a regular ambient temperature of around 290 K.
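
The conversion from noise figure to noise temperature uses the standard relation T = T0 * (10^(NF/10) - 1) with the reference temperature T0 = 290 K. A small Python sketch (the function name is made up for this example) reproduces the 35 K figure and estimates which share of the total noise stems from the 290 K surface rather than from the LNB itself:

```python
T0 = 290.0  # reference temperature in kelvin

def noise_temperature(nf_db: float) -> float:
    """Equivalent noise temperature for a noise figure given in dB."""
    return T0 * (10 ** (nf_db / 10) - 1)

t_lnb = noise_temperature(0.5)
print(f"NF 0.5 dB -> {t_lnb:.1f} K")
print(f"NF 0.3 dB -> {noise_temperature(0.3):.1f} K")

# Share of the noise power that comes from the 290 K surface in front of the LNB.
thermal_share = 290.0 / (290.0 + t_lnb)
print(f"thermal share: {thermal_share:.0%}")
```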

The function of an LNB can roughly be described as transforming input frequencies around 11 GHz with a bandwidth of 1 GHz down to an intermediate frequency range of 1 to 2 GHz (this is basically cut & paste in Fourier space). Due to about 60 dB amplification inside the LNB, the output power is roughly 10^6 times the input power. Thus, any following stages add only little to the overall noise.
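
Why the following stages hardly matter can be seen from the Friis formula for cascaded stages: the noise temperature of the second stage is divided by the gain of the first. The Python sketch below illustrates this. The tuner noise temperature of 3000 K is a purely hypothetical value chosen to represent a rather noisy second stage; I have not measured it.

```python
def db_to_linear(gain_db: float) -> float:
    """Convert a power ratio in dB to a linear factor."""
    return 10 ** (gain_db / 10)

def cascade_noise_temperature(t_first: float, gain_first_db: float,
                              t_second: float) -> float:
    """Friis formula for two stages, referred to the input of the first."""
    return t_first + t_second / db_to_linear(gain_first_db)

LNB_TEMPERATURE = 35.0      # K, from the 0.5 dB noise figure
LNB_GAIN_DB = 60.0          # dB, as stated above
TUNER_TEMPERATURE = 3000.0  # K, hypothetical value for a noisy tuner

total = cascade_noise_temperature(LNB_TEMPERATURE, LNB_GAIN_DB, TUNER_TEMPERATURE)
print(f"gain factor: {db_to_linear(LNB_GAIN_DB):.0f}")
print(f"system noise temperature: {total:.3f} K")
```

Even with such a noisy second stage, the contribution referred to the LNB input is only a few thousandths of a kelvin.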

If your home is quiet at microwave frequencies, you could simply put the LNB indoors to pick up the 290 K thermal noise around 11 GHz. But note that you may have relatively strong microwave sources in your home that you are not aware of: Fluorescent lamps and the modern energy saving lamps create a hot electron plasma inside. This radiates strong noise on microwave frequencies. Gas discharge tubes are used in microwave technology as noise sources. So the lamps in your home may actually be good randomness sources too, but they have the disadvantage that the amplitude of their noise is modulated by the AC current flowing through them. Regular fluorescent lamps exhibit a stroboscope effect on fast moving objects, and the RF noise they emit will be similarly modulated as the emitted light. The energy saving lamps use higher frequencies (around 20 kHz or so). As an LNB is sensitive enough to pick up good old trusted thermal noise, it is best to stick with that.

Ideally, one would put a microwave absorber in front of the LNB and put everything into a shielded metal box. The noise received would then be the thermal noise emitted by the absorber, the microwave counterpart of black body radiation in the optical regime. It may be possible to improvise a microwave absorber using materials like paper and graphite spray, but I have not tried that yet.

Frequency Down Conversion

Because the output frequency range of the LNB is still too high to allow sampling the noise with simple means, further down conversion is necessary. The tuners of current satellite set-top boxes convert the LNB signal down to baseband, limiting the bandwidth according to the chosen transponder symbol rate. Direct conversion I/Q demodulation is used. This means that the tuner actually produces two output signals, and if the input to the LNB is Gaussian white noise (as thermal noise is), the I and Q signals should even be statistically independent. However, we will just use one of them. The noise spectrum of both signals ranges from 0 to about 50 MHz when a symbol rate of 50,000,000 is configured in the menu of the set-top box. The tuners typically use differential outputs, i.e. each of the I and Q signals uses two wires with a common DC offset but opposite AC phase.

For this project, I purchased a cheap satellite set-top box for 35 EUR. I was not looking for a particular model, and you can probably use many other models. You will just have to repeat the reverse engineering described further below to make the box work smoothly as part of the TRNG.

Digitising and Sampling the Random Noise Signal

The tuner output signals are strong enough to digitise them with a fast analog comparator. In the design described here, the digitised signal is sampled at a rate of about 24 MHz. To achieve this, a microcontroller board that provides a fast synchronous serial interface is used.
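
What the comparator and the serial interface do can be mimicked in software. The following Python sketch thresholds simulated Gaussian noise into bits and packs them into bytes. It is only a model: the noise here is pseudo-random, and the MSB-first bit order is an assumption of mine, not necessarily what the firmware configures.

```python
import random

def comparator_bits(samples, threshold=0.0):
    """Model of the analog comparator: 1 if the sample exceeds the threshold."""
    return [1 if s > threshold else 0 for s in samples]

def pack_bits(bits):
    """Pack bits MSB-first into bytes, dropping an incomplete trailing byte."""
    out = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)

rng = random.Random(1)  # pseudo-random stand-in for the analog noise signal
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]
bits = comparator_bits(noise)
raw = pack_bits(bits)
print(f"{len(raw)} bytes, fraction of ones: {sum(bits) / len(bits):.3f}")
```

A threshold that does not sit exactly at the mean of the signal biases the fraction of ones away from one half, which is one reason why the raw data are not used directly.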

Data Transfer to a PC

The microcontroller board used features an Ethernet interface, which is used to send the sampled raw data to a PC in UDP packets. 3000 packets with 1024 bytes of sampled data are sent per second. The firmware uses the DMA features of the microcontroller to shovel the data from the serial input to the Ethernet transmitter, using double buffering. The CPU does not actually touch the data on transfer (and might not be fast enough to do so). A USB interface is also present, but why bother implementing a bloated protocol like USB when an Ethernet interface is at hand? The firmware developed simply sends UDP packets to the Ethernet and IP broadcast addresses with hard coded source addresses. As the 100 MBit/s link is busy more than 24% of the time, it is in any case best to use a dedicated interface on the PC. The data can be read by an application on the PC by simple socket operations.
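
The link load follows from the packet rate and size. The Python sketch below computes the payload rate and an estimate of the rate on the wire, using the standard per-frame overheads of UDP, IPv4 and Ethernet. It assumes that each packet carries exactly 1024 payload bytes and no IP options; I have not checked this against the firmware.

```python
PACKETS_PER_SECOND = 3000
PAYLOAD_BYTES = 1024
LINK_BITS_PER_SECOND = 100e6

# Per-frame overhead in bytes: UDP header 8, IPv4 header 20 (no options),
# Ethernet header 14, frame check sequence 4, preamble and start delimiter 8,
# inter-frame gap 12.
OVERHEAD_BYTES = 8 + 20 + 14 + 4 + 8 + 12

payload_rate = PACKETS_PER_SECOND * PAYLOAD_BYTES * 8
wire_rate = PACKETS_PER_SECOND * (PAYLOAD_BYTES + OVERHEAD_BYTES) * 8
utilisation = wire_rate / LINK_BITS_PER_SECOND

print(f"payload rate: {payload_rate / 1e6:.3f} MBit/s")
print(f"wire rate:    {wire_rate / 1e6:.3f} MBit/s")
print(f"link busy:    {utilisation:.1%}")
```

The payload alone accounts for the "more than 24%" mentioned above; with headers the estimate comes out a little higher.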

Data Quality and Entropy Extraction

Statistical analysis of the relative frequencies obtained by breaking the data stream up into words of length 1 through 24 bits suggests that the raw data have an entropy well above 0.99 bits per physical bit. Thus, reducing the data rate to 75% during entropy extraction is a conservative choice. One simple method is this: Collect 4096 bytes of data, partition them into blocks of 16 bytes, and XOR these together to create an AES128 key. This key is then used to AES encrypt the further raw data in blocks of 16 bytes, but using only the first 12 of the 16 output bytes of each ciphertext block for the cooked data stream. Due to the reduction to 75%, the net data rate is 18 MBit/s. To make sure that the noise source does not fail, the program provided below performs an entropy test on each 4096 byte input block.
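
The key derivation step described above can be sketched in a few lines of Python. This is an illustration only, not the code of testract.c: the function name xor_fold is made up, and the AES encryption itself is left out because it needs a crypto library. The sketch also checks the net data rate that results from keeping 12 of every 16 output bytes.

```python
def xor_fold(block: bytes, width: int = 16) -> bytes:
    """XOR all width-byte chunks of block together into one width-byte value."""
    if len(block) == 0 or len(block) % width != 0:
        raise ValueError("block length must be a positive multiple of width")
    key = bytearray(width)
    for offset in range(0, len(block), width):
        for i in range(width):
            key[i] ^= block[offset + i]
    return bytes(key)

# Example: fold a 4096 byte block (here a fixed test pattern) into a 16 byte key.
block = bytes((i * 7 + 3) % 256 for i in range(4096))
key = xor_fold(block)
print(key.hex())

# Net rate when 12 of every 16 cipher output bytes are kept.
RAW_BITS_PER_SECOND = 3000 * 1024 * 8
net_rate = RAW_BITS_PER_SECOND * 12 // 16
print(f"net rate: {net_rate / 1e6:.3f} MBit/s")
```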

Components used in this Project

The following particular components were used in this project:
Item                     Model                                    Approx. Price
LNB                      No-Name, NF 0.5 dB                       15 EUR
Set-Top Box              Palcom DSL-380                           35 EUR
Analog Comparator        Analog Devices AD8561AN                  5 EUR
Microcontroller / Board  Atmel AT91SAM7X256 / Olimex SAM7-EX256   100 EUR

The most expensive component is the microcontroller board. You may be able to find cheaper ones. The board used here has several features that are not used for the TRNG.

Particular Problems and Solutions

Some reverse engineering of the set-top box was necessary to make it usable for the TRNG project. Besides the need to locate the tuner output pins, it was necessary to add a switch that allows the main processor to be disabled after it has configured the tuner. Because the processor expects to see an MPEG data stream, which never appears in the TRNG application, it kept resetting the tuner periodically, also interrupting the power supply to the LNB. We merely want the main processor to say the magic spell to the tuner chip (via I2C bus) that configures it to the highest supported bandwidth, but after that we want it to remain silent. However, disabling the main processor has some side effects.

The modifications done to the box are summarised below:

Objective: Tap into tuner output.
Solution: A slot was cut into the back side of the box, and a three-pin socket was glued with epoxy onto the tuner. The three pins were connected to I+, GND, and I- (or Q+, GND, Q-).
Side Effects: Box looks more fancy.

Objective: Disable main processor after it has configured the tuner.
Solution: A switch was connected to the reset pin of the JTAG interface on the main PCB. After start-up, the switch is used to hold the main processor permanently in reset. The switch was mounted using the hole for the optical audio out socket, which was removed.
Side Effects:
1. Power supply to the LNB is switched off.
2. The analog gain control input to the tuner is switched off (very low gain).
3. Box looks even more fancy.

Objective: Supply power to the LNB permanently.
Solution: Unsolder transistor Q3. This transistor is used in the network that controls the 14 V / 18 V power supply regulator. Removing Q3 lets the control voltage go high permanently because of a pull-up resistor.
Side Effects: none

Objective: Generate a suitable gain control signal for the tuner.
Solution: A 100 kOhm trim pot was connected to +5V via a 2.2 kOhm resistor, and to ground at the other end. The centre pin is connected to the gain control input of the tuner. The trim pot now allows the gain to be adjusted.
Side Effects: none

Using the configuration menu, switch the symbol rate to 50 Million/s. The menu lets you type in larger values, but as the tuner does not support them, you get a very narrow-band output instead of what you expect. It is best to check the tuner output with an oscilloscope. A digital one with FFT option is ideal.

What it looks like

Overview image of assembled TRNG:
View of assembled TRNG

Close-up view of part of the TRNG:
Part of TRNG

Modification at the tuner (3-Pin Socket for I+/GND/I-, Trim Pot for Gain Control):
Tuner Modification

Small PCB with AD8561AN Comparator (plugs into UEXT socket of SAM7 board):
Comparator PCB

Wiring of Reset Switch (green = GND, blue = RST via 1 kOhm):
Reset Switch

Location of removed Transistor Q3 (close to 14 V / 18 V regulator with heatsink):
Q3 Location

Software Downloads:

Firmware-20090107.tgz
Tarball with sources for AT91SAM7X firmware, including compiled output (in binary and S-Record formats). Compilation requires an ARM cross compiler (arm-elf-gcc).

udp_rx.c
UDP receiver software. Compile like this:
cc -Wall -O3 udp_rx.c -o udp_rx

testract.c
Utility for entropy extraction using AES, with on-line test of input data. Compile like this:
cc -Wall -O3 testract.c -o testract -lm -lssl

How to Run:

Use whatever tool you prefer to flash the firmware into the SAM7 and make the flash bootable. The firmware then starts immediately after reset, sending 3000 UDP packets of sampled data per second. Connect your PC via Ethernet. The packets use UDP port 1536 and source IP 192.168.0.153, so you will have to configure your network card accordingly. Note that udp_rx writes raw data to standard output, and that testract pipes from standard input to standard output.
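
For readers who want to look at the packets without compiling udp_rx.c, here is a rough Python sketch of a receiver. It is not a port of udp_rx.c, whose details I do not reproduce here. The port number is taken from the text above; the function names and the assumption that each datagram holds nothing but sampled data are mine. Python may well be too slow to keep up with 3000 packets per second without loss, so treat this as a way to inspect a few packets only.

```python
import socket

PORT = 1536  # UDP port used by the firmware, as stated above

def open_receiver(host: str = "", port: int = PORT) -> socket.socket:
    """Open a UDP socket bound to the given address (empty host = all interfaces)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    return sock

def packets(sock: socket.socket, count: int):
    """Yield the payloads of the next count datagrams."""
    for _ in range(count):
        payload, _sender = sock.recvfrom(2048)
        yield payload

# Usage (blocks until data arrive, so it is shown as a comment only):
#   rx = open_receiver()
#   with open("sample_raw.bin", "wb") as f:
#       for payload in packets(rx, 3000):
#           f.write(payload)
#   rx.close()
```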

Make sure that you understand how to start up the set-top box correctly. The reset switch must be activated after the box has booted and the tuner has been configured to high bandwidth. When you switch off the DSL-380 using the power switch at the back, it will resume using the last settings on the next power-up. However, when you push the on/off button on the front panel, you may have to use the remote control again to switch to some favourite channel configured with the 50 Million/s symbol rate.

To collect some raw data, you can simply run a command like

udp_rx >sample_raw.bin


and interrupt with ctrl-C once you have enough, or control the amount of data with dd like this:

udp_rx | dd bs=1024 count=20480 of=sample_raw.bin


This stops data collection when 20 MB have been received.

The testract utility checks the quality of the input data on-line by calculating the byte entropy for each input block of 4096 bytes. It aborts when the calculated entropy is below a threshold given on the command line (2nd argument). Note that the byte entropy is somewhat underestimated due to the small sample size of 4096 bytes. The data are compressed using AES as described above. The number of bytes taken from each AES output block is also specified on the command line (1st argument).
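
The byte entropy test can be illustrated with a short Python sketch. Again, this is not the code of testract.c, and the function names are made up. It computes the Shannon entropy from the relative byte frequencies of each 4096 byte block and compares it with a threshold. The demo input comes from os.urandom, which merely stands in for the raw TRNG data.

```python
import math
import os
from collections import Counter

def byte_entropy(block: bytes) -> float:
    """Shannon entropy in bits per byte, from relative byte frequencies."""
    n = len(block)
    return -sum(c / n * math.log2(c / n) for c in Counter(block).values())

def check_stream(data: bytes, threshold: float = 7.8, block_size: int = 4096) -> bool:
    """Return False as soon as one complete block falls below the threshold."""
    for offset in range(0, len(data) - block_size + 1, block_size):
        if byte_entropy(data[offset:offset + block_size]) < threshold:
            return False
    return True

sample = os.urandom(4096)  # stand-in for one block of raw data
print(f"entropy of a random block: {byte_entropy(sample):.3f} bits per byte")
print(f"constant block passes: {check_stream(bytes(4096))}")
```

With only 4096 samples spread over 256 possible byte values, even perfectly random input yields an estimate noticeably below 8 bits. A first-order estimate of this bias is 255 / (2 * 4096 * ln 2), about 0.045 bits, which is why a threshold such as 7.8 leaves a comfortable margin.
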
To cook the raw sample file generated above, you may do this:

testract 12 7.8 <sample_raw.bin >sample_cooked.bin


To generate a 1 GB one-time pad in one go without intermediate storage of the raw data, you may run a command like this:

udp_rx | testract 12 7.8 | dd bs=1024 count=1048576 of=cooked.bin


On my computer, dd printed the following information about the data transfer speed at the end:

1048576+0 records in
1048576+0 records out
1073741824 bytes transferred in 466.669 secs (2300863 bytes/sec)

That is 2,300,863 bytes/s, or 18,406,904 bits/s.

Sample Data

File               Size   Description
sample_raw.bin     20 MB  raw data, generated as described above
sample_cooked.bin  16 MB  data of sample_raw.bin after piping through testract 12 7.8

Note that the raw data may fail some strict randomness tests. They are not supposed to be a perfect random stream, but merely have to provide more input entropy than is extracted. Statistical analysis assures that this is the case. For example, the Shannon entropy computed from the relative frequencies of 24 bit words in a 1 GB sample was 23.95 bits per word.
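
The kind of word entropy analysis mentioned above can be sketched in Python as follows. This is not the program I used for the 1 GB sample. The function name is made up, and the simple big-integer approach is far too slow for files of that size, so it only serves to show the principle on small inputs.

```python
import math
from collections import Counter

def word_entropy(data: bytes, word_bits: int) -> float:
    """Shannon entropy in bits per word of consecutive non-overlapping words."""
    total_bits = len(data) * 8
    n_words = total_bits // word_bits
    if n_words == 0:
        raise ValueError("not enough data for a single word")
    value = int.from_bytes(data, "big")
    mask = (1 << word_bits) - 1
    counts = Counter()
    for i in range(n_words):
        shift = total_bits - (i + 1) * word_bits
        counts[(value >> shift) & mask] += 1
    return -sum(c / n_words * math.log2(c / n_words) for c in counts.values())

# Tiny examples: alternating bits, and a byte holding all four 2-bit words.
print(word_entropy(b"\x55", 1))
print(word_entropy(b"\x1b", 2))

# Entropy per physical bit implied by the 24 bit result quoted above.
per_bit = 23.95 / 24
print(f"{per_bit:.4f} bits per bit")
```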