blog/2024-04-18-random-numbers-with-uniqueness-types.md
---
title: Avoiding RNG bugs through uniqueness types
description: A programmer complained about a source of bugs, and this is how I think it could be fixed.
---

While visiting [Chalmers](https://www.chalmers.se/en/) to give a guest
lecture in their course on Parallel Functional Programming, I met a
student who had implemented a ray tracer in Futhark (I didn't catch
the name, please write me if you are that student and I'll put it
here). He mentioned that he'd had a bunch of bugs related to random
number generation (RNG), which is admittedly a bit awkward in Futhark.
The main challenge is that you manually need to maintain the RNG
*state*, since Futhark does not allow side effects.

To illustrate the problem, let us define a small set of functions for
generating random numbers. First, we define a type for storing the RNG
state:

```Futhark
type rng = u32
```

To initialise the state from a seed, we do a few rounds of a hash
function I [found on Stack
Overflow](https://stackoverflow.com/questions/664014/what-integer-hash-function-are-good-that-accepts-an-integer-hash-key/12996028#12996028):

```Futhark
def mk_rng (seed: i32) : rng =
  let x = u32.i32 seed
  let x = ((x >> 16) ^ x) * 0x45d9f3b
  let x = ((x >> 16) ^ x) * 0x45d9f3b
  let x = ((x >> 16) ^ x)
  in x
```

For the random number generation itself, for simplicity we will just
do a basic [linear congruential
generator](https://en.wikipedia.org/wiki/Linear_congruential_generator):

```Futhark
def rand (l: i32) (x: rng) : (rng, i32) =
  let a = 48271
  let m = 2147483647
  let rng' = (a * x) % m
  in (rng', i32.u32 (rng' % u32.i32 l))
```

Note how the `rand` function returns both a new state and the
randomly generated number, which lies in the range from 0 (inclusive)
to `l` (exclusive). We can use `rand` like this:

```Futhark
def use (seed: i32) =
  let rng = mk_rng seed
  let (rng', x) = rand 6 rng
  let (rng'', y) = rand 6 rng'
  in x + y
```

This works fine. However, it is easy to use an old RNG state by
accident:

```Futhark
def use (seed: i32) =
  let rng = mk_rng seed
  let (rng', x) = rand 6 rng
  let (rng'', y) = rand 6 rng
  in x + y
```

Note how I typed `rng` instead of `rng'` in the last call to `rand`.
In this case the compiler will complain about `rng'` being unused, but
it's not hard to imagine a larger program where `rng'` is indeed used
for something else later. Especially when refactoring, it is easy to
accidentally use the same RNG state twice, which will lead to
random numbers being correlated. For a ray tracer, this can result
in fun visual artefacts, but for other programs it may just result in
a number being wrong, which is both boring and tedious to debug.
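
To make that concrete, here is a small sketch of such a program, with the definitions from above repeated so that it stands alone. The name `use_three` and the third roll are made up for illustration. Since `rng'` is used by the third call, an unused-variable warning about it would presumably not appear, and the first two rolls always come out equal:

```Futhark
type rng = u32

def mk_rng (seed: i32) : rng =
  let x = u32.i32 seed
  let x = ((x >> 16) ^ x) * 0x45d9f3b
  let x = ((x >> 16) ^ x) * 0x45d9f3b
  let x = ((x >> 16) ^ x)
  in x

def rand (l: i32) (x: rng) : (rng, i32) =
  let a = 48271u32
  let m = 2147483647u32
  let rng' = (a * x) % m
  in (rng', i32.u32 (rng' % u32.i32 l))

-- Hypothetical example: three rolls, where the second one reuses a stale state.
def use_three (seed: i32) : (i32, i32, i32) =
  let rng = mk_rng seed
  let (rng', x) = rand 6 rng
  let (_, y) = rand 6 rng    -- Bug: should have been rng'.
  let (_, z) = rand 6 rng'   -- rng' is used here, so it is not "unused".
  in (x, y, z)
```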

In an imperative language, generating a random number mutates the
state, so it cannot be reused. In languages like Haskell, you can use a
state monad to simulate the same thing, and similarly avoid reuse. In
Futhark, it turns out you can imitate a form of *affine types* using
Futhark's [slightly obscure support for uniqueness
types](2022-06-13-uniqueness-types.html). Affine types allow you to
express that a value can be used *at most once* (whereas linear types
allow you to require *exactly once*, which is why they are useful for
resource management, as then the last use must be a cleanup function).
By constructing an RNG library such that number generation *consumes*
a state and *produces* a new one, we can ensure that each state is
used at most once.
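
As a reminder of what consumption looks like in its usual setting, here is a minimal sketch using arrays. The names `set_first` and `fresh` are invented for this example. The asterisk on the parameter type marks the argument as consumed, which is what permits the in-place update, and the caller may not touch the array afterwards:

```Futhark
-- Hypothetical helper: the parameter xs is consumed, so it may be updated in place.
def set_first (xs: *[]i32) : *[]i32 =
  xs with [0] = 42

-- Fine: xs is never used again after being passed to set_first.
-- (Assumes n > 0, otherwise the update is out of bounds.)
def fresh (n: i64) : []i32 =
  let xs = replicate n 0i32
  let ys = set_first xs
  in ys

-- This variant should be rejected by the type checker,
-- because xs is used after it has been consumed:
--
-- def stale (n: i64) =
--   let xs = replicate n 0i32
--   let ys = set_first xs
--   in (xs, ys)
```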

Although uniqueness types are really designed for dealing with arrays,
they can also be used for abstract types via [the module
system](https://futhark-lang.org/blog/2017-01-25-futhark-module-system.html).
So first we define a module type that describes the RNG interface:

```Futhark
module type rand = {
  type rng
  val mk_rng : i32 -> rng
  val rand : i32 -> *rng -> (rng, i32)
}
```

Note the asterisk on the `rand` parameter type - this denotes a
consuming parameter, meaning the `rng` we pass in may not be used
again.

We implement the module using the same code as above:

```Futhark
module rand : rand = {
  type rng = u32
  def mk_rng (seed: i32) : rng =
    let x = u32.i32 seed
    let x = ((x >> 16) ^ x) * 0x45d9f3b
    let x = ((x >> 16) ^ x) * 0x45d9f3b
    let x = ((x >> 16) ^ x)
    in x
  def rand (l: i32) (x: rng) : (rng, i32) =
    let a = 48271u32
    let m = 2147483647u32
    let rng' = (a * x) % m
    in (rng', i32.u32 (rng' % u32.i32 l))
}
```

We can only access the functions through the types defined in the
module type, which means `rand` will consume its `rng` argument,
despite the actual function not doing anything odd. Code like this
will now work:

```Futhark
def use (seed: i32) =
  let rng = rand.mk_rng seed
  let (rng', x) = rand.rand 6 rng
  let (rng'', y) = rand.rand 6 rng'
  in x + y
```

But if we try to reuse an RNG state, the type checker will tell us:

```Futhark
def abuse (seed: i32) =
  let rng = rand.mk_rng seed
  let (rng', x) = rand.rand 6 rng
  let (rng'', y) = rand.rand 6 rng
  in x + y
```

```
Error: Using variable "rng", but this was consumed at 3:31-34.
```

If we *want* to duplicate an RNG state, we can still do so, as the
`copy` prelude function can copy anything you can put in an array.
This is explicit, and so unlikely to lead to unintended behaviour.
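
A sketch of what deliberate duplication could look like, assuming `copy` applies to the abstract `rng` type as described. The function name `reuse_on_purpose` is made up, and the module definitions are repeated so the example stands alone. Both calls start from the same state, so the two results are equal by construction:

```Futhark
module type rand = {
  type rng
  val mk_rng : i32 -> rng
  val rand : i32 -> *rng -> (rng, i32)
}

module rand : rand = {
  type rng = u32
  def mk_rng (seed: i32) : rng =
    let x = u32.i32 seed
    let x = ((x >> 16) ^ x) * 0x45d9f3b
    let x = ((x >> 16) ^ x) * 0x45d9f3b
    let x = ((x >> 16) ^ x)
    in x
  def rand (l: i32) (x: rng) : (rng, i32) =
    let a = 48271u32
    let m = 2147483647u32
    let rng' = (a * x) % m
    in (rng', i32.u32 (rng' % u32.i32 l))
}

-- Hypothetical example: duplicate the state explicitly before consuming it.
def reuse_on_purpose (seed: i32) : (i32, i32) =
  let rng = rand.mk_rng seed
  let saved = copy rng              -- Explicit duplicate of the state.
  let (_, x) = rand.rand 6 rng      -- Consumes rng.
  let (_, y) = rand.rand 6 saved    -- Consumes the copy instead.
  in (x, y)
```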

The most widely used Futhark library for random numbers is
[cpprandom](https://github.com/diku-dk/cpprandom), which does not use
this approach, but I'm wondering whether it would be better if it did.
However, in practice, bugs like this are not too difficult to avoid,
if we simply use shadowing to make the old RNG states inaccessible:

```Futhark
def use (seed: i32) =
  let rng = mk_rng seed
  let (rng, x) = rand 6 rng
  let (rng, y) = rand 6 rng
  in x + y
```
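
The same habit carries over to loops. The following is a rough sketch rather than code from any library: the helper `roll_sum` is hypothetical, and the plain (non-module) definitions are repeated so it stands alone. The loop parameter rebinds `rng` on every iteration, so an older state is never in scope:

```Futhark
type rng = u32

def mk_rng (seed: i32) : rng =
  let x = u32.i32 seed
  let x = ((x >> 16) ^ x) * 0x45d9f3b
  let x = ((x >> 16) ^ x) * 0x45d9f3b
  let x = ((x >> 16) ^ x)
  in x

def rand (l: i32) (x: rng) : (rng, i32) =
  let a = 48271u32
  let m = 2147483647u32
  let rng' = (a * x) % m
  in (rng', i32.u32 (rng' % u32.i32 l))

-- Hypothetical helper: sum n rolls, threading the state through the loop.
def roll_sum (n: i32) (seed: i32) : i32 =
  let rng = mk_rng seed
  let (_, total) =
    loop (rng, total) = (rng, 0) for _i < n do
      -- Shadow rng with the new state, hiding the old one.
      let (rng, x) = rand 6 rng
      in (rng, total + x)
  in total
```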