Recently I was thinking about random numbers. The numbers that all software engineers use in day-to-day life. Have you ever wondered what the heck is random? When do we call something random?
Let me tell you at the start that the word “random” that we are used to is not random at all. It’s pseudorandom. It’s pseudo because it behaves randomly enough but it’s not actually random.
You could actually repeat all the randomness in the same order given the starting point.
Intrigued? Then let’s explore this topic a little bit more.
Table of Contents
What is Random?
A simple definition of Random would be “some next state that is not known”. In essence, a random number is an unknown number generated from the finite set.
For example – In java, we can generate random numbers using Random.nextInt()
. This function is capable of generating any integer between the range of 232
.
For example – let’s say you have a finite set of memory consisting of 4 bits. You can only generate 2n-1
unique numbers e.g. 24 - 1 = 15
. But the next number that’s going to come out of that black box could be anything. And if you keep generating the numbers, eventually, at some point the pattern will start to repeat itself. This is called the Period.
How To Verify If It’s True?
I couldn’t wait that long for it to generate all the numbers and verify when the pattern starts to repeat… since it uses 48 bits that make it a period of 248-1 something around quadrillion.
But there’s another easy way to verify that it produces the same sequence if provided the same seed.
Well, I created two distinct objects e.g. new Random()
. But instead of letting it choose the seed, I provided the start seed instead. As the following:
var rand1 = new Random(0);
var rand2 = new Random(0);
for (int i = 0; i < 10; i++) {
assertEquals(rand1.nextInt(), rand2.nextInt()); // true
}
This comes out to be correct. If provided the same seed, the random number generator is going to produce the same series.
So, in essence, it is nothing more than a deterministic sequence. If provided the same seed, will generate the same sequence.
Now once I knew this, I tried creating my own random number generator. Maybe not as efficient but working using Linear Feedback Shift Register 🙂
Linear Feedback Shift Register
It’s unbelievably simple. It’s just a set of bits with some rules to move the bits around to generate some output.
For this simple example, I created a shift register that would operate on 4 bits.
The rules I followed are as follows:
- Right shift the state by 1
(state >> 1)
- XOR the initial state with this new partial state and take the most significant bit from it
(state ^ (state >> 1)) & 1
- Now take this new state and feed it back to the partial state from the left
(state >> 1) | ((state ^ (state >> 1)) & 1) << 3)
Let me make it visually clear what I just did above.
1. Let's say following is the initial state
initial_state = 1001
After shifting the state by 1 to the right
partial_state = 0100
removed_bit = 1
2.1 Now we perform the XOR of these two states and take the MST bit
1001 XOR 0100 = 1101 <= New State
2.2 MST = 1
3. Now, put this MST back to the partial_state as a feedback to generate new state
0100 OR 1 = 1100 <= new state
This will generate a new state every time until it reaches the limit. With a 4-bit shift register, it will generate 24 - 1 = 15
unique states before repeating themselves.
After running the code it prints out the following sequence. And it is clear that it didn’t repeat itself until all the 15 unique sequences were produced. So starting from line 1 with the initial state 1001
, the next time it repeated itself is at line 16.
1001
1100
0110
1011
0101
1010
1101
1110
1111
0111
0011
0001
1000
0100
0010
1001
1001
.
.
.
.
Now that this LFSR is ready to produce a deterministic sequence, I started adding the MSTs that I got from step 2.2 to generate a 4-bit number.
So from the above sequence, the following pseudorandom numbers are generated:
(This is using the word size of 4)
10,15,1,3,5,14,2,6,11,12,4,13,7,8,9,10,15,1,3,5,14,2,6,11,12,4,13,7,8,9,10,15,1,3,5,14,2,6,11,12,4,13,7,8,9,10,15,1,3,5...
Do you see the pattern repeating? Let me know in the comments below.
Here’s the code that I used to generate this sequence:
class LinearFeedbackShiftRegister {
private static final int WORD_SIZE = 4;
private final String initialState;
private int state;
public LinearFeedbackShiftRegister(String initialState) {
this.initialState = initialState;
state = Integer.parseInt(initialState, 2);
}
public int nextInt() {
var allCombinations = new StringBuilder();
for (int i = 0; i < WORD_SIZE; i++) {
int newBit = (state ^ (state >> 1)) & 1;
state = (state >> 1) | (newBit << initialState.length() - 1);
allCombinations.append(Integer.toBinaryString(newBit));
}
return Integer.parseInt(allCombinations.substring(allCombinations.length() - WORD_SIZE), 2);
}
}
What are other different ways to generate pseudorandom numbers?
Mersenne Twister Linear Congruential Generator
I will be lazy and copy the description from Wikipedia :p. Here it is,
The Mersenne Twister is a general-purpose pseudorandom number generator developed in 1997 by Makoto Matsumoto and Takuji Nishimura. Its name derives from the fact that its period length is chosen to be a Mersenne prime.
The Mersenne Twister was designed specifically to rectify most of the flaws found in older PRNGs
This is how it can be used to generate a series of random numbers.
It has 4 inputs,
- Modulus (m),
- Multiplier(a)
- Increment(c)
- Seed (starting value)
Let’s take an example to see how it works.
Suppose, we are given the following seed values,
- m = 7829
- a = 378
- c= 2310
- seed = 4321
Applying the formula to generate the next number in the sequence,
Xn+1 = (a * Xn + c) mod m
7216 = (378 * 4321 + 2310) mod 7829
Now, this number will become the seed to produce the next number in the sequence, that is,
3827 = (378 * 7216 + 2310) mod 7829
Keep repeating it again and again to generate new PRNs in the sequence.
This will generate a random number between the range of [0, 7828] (m – 1).
Middle Square Algorithm
This method was invented by John Von Newman first described at some conference in 1949.
This is quite simple to understand and explain to others about the PRNGs but in practice, it is not a good method, since its period is usually very short and it has some severe weaknesses such as if repeated enough times, the middle-square method will either begin repeatedly generating the same number or cycle to a previous number in the sequence and loop indefinitely.
Let’s quickly take a look at how it works:
To start with, take any number and square it.
Then take the middle 4 numbers.
That’s a pseudorandom number.
Then repeat the process.
The sequence can be represented mathematically in the following way:
Xn+1
= middle four digits of (Xn)2
If there are not many middle numbers then pad the number with leading zeros to make it 8 digit number and then extract the middle 4 digits.
This algorithm generates a series of random-looking numbers in the range of 0, 9999.
Conclusion
The world of random is very intriguing. It made me think hard enough about randomness. I think there’s nothing like random if given the state of the system and the rules it follows. Even the Geiger counter. If we somehow know what caused the radiation and the initial state which led to it, then maybe with super quantum computers we might be able to predict the count of electrons in the next interval 😀
Until next time!