there’s apparently technical guidelines1 that talk about turning random bits into random numbers between 0 and something other than powers of two. unfortunately it’s lacking a bit of nuance.

they provide two algorithms for generating random numbers in the set ${1,…,n-1}$, both use $k = \lceil\log_2 n\rceil$:

  1. This algorithm maintains uniform distribution but has probabilistic run-time.

    1. $r = \mathrm{RNG}({0,1,2,…,2^k-1})$
    2. If $(r < n)$ and $(r > 0)$, output $r$ else goto 1.
  2. This algorithm has deterministic run-time but does not fully maintain uniform distribution.

    1. $r = \mathrm{RNG}({0,1,2,…,2^{k+64}-1})$
    2. Output $(r \mod (n-1)) + 1$

they include this helpful note:

Note: The usage of a non-uniformly distributed $\textrm{RNG}({1,2,…,n−1})$ can enable an attack on signature algorithms (cf. Bleichenbacher’s attack on DSA, described e.g. in [28]). Algorithm 2 does not provide uniform distribution. It is however assumed that the deviation from uniform distribution produced by Algorithm 2 is too small to be exploited by an attacker.

Ed25519, for whatever reason, chose to do algorithm 2 anyway:

EdDSA samples r from the interval $[0, 2^{2b} − 1]$, ensuring almost uniformity of the distribution modulo $\ell$. The guideline [2, Section 4.1.1, Algorithm 2] specifies that the interval should be of size at least $[0, 2^{b+61} − 1]$, i.e., 64 bits more than $\ell$; for Ed25519 there are 259 extra bits.

aside from the odd typo, this is very comforting, we’re 195 bits better than the guideline!

but how much deviation from a uniform distribution does the guideline guarantee, and what do those 195 bits get us?

i’m going to measure deviation as the sum of the absolute values of the difference between the uniform probability and the actual probability over the entire target set. that is:

$$d = \sum_{i=0}^{n-1} |P(i)-\frac{1}{n}|$$

given that were reducing our input ${0,…,r}$ modulo $n$, there are only two different probabilities:

$$P(i) = \begin{cases} \frac{c + 1}{r}, & \text{if $i < r, \mod n$} \\ \frac{c}{r}, & \text{if $i ≥ r, \mod n$} \end{cases}, \text{where $c = \lfloor\frac{r}{n}\rfloor$}$$

so our whole deviation function becomes, switching to Haskell:

deviation :: Integer -> Integer -> Rational
deviation n r = low + high where
  (c, r') = r `divMod` n
  low = r' * abs (((c+1) % r) - (1 % n))
  high = (n - r') * abs ((c % r) - (1 % n))

time to make graphs!

a graph of the deviation when reducing from numbers modulo 256 to numbers modulo 129 to 255, and from 512 to 257 to 511. the two graphs overlap and have their peak slightly less than half way

very tidy, with a peak deviation of 0.343 right around $n = \frac{r}{\sqrt 2}$…

a graph of the deviation when reducing from numbers modulo 2^72 to numbers modulo 129 to 255, and from 2^73 to 257 to 511. the two graphs overlap, with every point of the first group coinciding with one of the second group

this one is a lot more scattered. peak deviation of $2.62\cdot10^{-20} \approx 2^{-65}$ almost at the right-hand side. that’s great! we’ve reduced our maximum deviation by a factor of $2^{64}$ by sampling from an extra 64 bits! so that should work the same for curve25519 scalars too, right?

a graph of the log base 2 of the deviation when reducing from numbers modulo 2^253 to 512 to numbers modulo ell, and to numbers modulo 2^252+2^251. the first starts out at -126 and runs horizontally until it hits 380 bits and then starts diagonally down. the second starts close to zero, and runs diagonally down, eventually meeting the first.

as it turns out, when your numbers get big, it really matters what range you’re reducing to. reducing a random 253 bit number modulo $2^{252} + 7237005577332262213973186563042994240857116359379907606001950938285454250989$ (the group order of curve25519), is more uniform than reducing a random 317 bit number modulo $2^{252}+2^{251}$. and if we can assume the deviation of $2^{-65.6}$ is too small to be exploited, does going from $2^{-126.6}$ to $2^{-261.5}$ really get us anything?

  1. TR03111 §4.1.1 “Random and Pseudo-Random Number Generators” ↩︎