Randomly not so random

In a random sequence, all sequences are equally likely, even not so random ones.
    Random random = new Random(441287210);
    for(int i=0;i<10;i++)
        System.out.print(random.nextInt(10)+" ");
    }
prints
1 1 1 1 1 1 1 1 1 1
and
    Random random = new Random(-6732303926L);
    for(int i=0;i<10;i++)
        System.out.println(random.nextInt(10)+" ");
    }
prints
0 1 2 3 4 5 6 7 8 9
Lastly
public static void main(String ... args) {
    System.out.println(randomString(-229985452)+' '+randomString(-147909649));
}

public static String randomString(int seed) {
    Random rand = new Random(seed);
    StringBuilder sb = new StringBuilder();
    for(int i=0;;i++) {
        int n = rand.nextInt(27);
        if (n == 0) break;
        sb.append((char) ('`' + n));
    }
    return sb.toString();
}
prints
hello world

Comments

  1. How do you find out the appropriate numbers for the result?

    ReplyDelete
  2. Lol. Actually the period is only like 4 billion. You can brute force pretty much any sequence match to see if it exists in O(4Billion*k) where k is the size of the k-mer.

    If you have an English dictionary you can index the whole thing into the rand generator in O(4Billion * log(DictionarySize)). What would be uber-cool is come up with a compression scheme that takes generates k-mers of a string, then matches against the entire Rand() sequence; then replace parts of the string with pairs of Rand() seeds and generator lengths.

    ReplyDelete
  3. Actually how many GB is the entire period? 2^32*32 bits (assuming the back end of your PRNG is hitting every number exactly once) == 16 GB.

    Multiply that by a factor of 4 and add a smidge to overlap a bit from back to front; and it is 100GB to store the suffix array for the entire period.

    ReplyDelete
  4. This is sexy! I'm going to figure this out

    ReplyDelete
  5. Funny, however one should expect that for every 1E9:th seed, all the ten digits would be the same. Thus, it is likely that there are about 18446744073 such long seeds and "441287210" is just one of them...

    ReplyDelete

Post a Comment

Popular posts from this blog

Java is Very Fast, If You Don’t Create Many Objects

System wide unique nanosecond timestamps

Comparing Approaches to Durability in Low Latency Messaging Queues