Thursday, October 25, 2012

BURP Sequencer and Randomness

So I stumbled onto http://www.tssci-security.com/archives/2007/12/21/testing-for-randomness-and-predictability-using-burp-sequencer/ a while ago, and it had me wondering about the Sequencer tool. Unfortunately the author didn't really analyse the results in detail and only looked at the summary, which is a little misleading.

I've done some testing of my own to duplicate/clarify these results.

I repeated the same test as the author: 10,000 16-character alphanumeric strings from random.org. Sequencer output the following as its summary:

"The overall quality of randomness within the sample is estimated to be: poor. At a significance level of 1%, the amount of effective entropy is estimated to be: 26 bits."

Now to explain this. First of all, the character-level analysis looked OK: a few anomalies in the 'transitions', but overall not too bad. The first thing I noticed in the bit-level analysis was that Burp was only reporting up to 79 bits, even though the tokens were 16 characters long (16 characters at 6 bits each would be 96 bits).

After I RTFM'd [1], I found that Burp creates a custom encoding for tokens on the fly, based on the size of the character set used at each character position. I had 16 character positions, each with a charset size of 62. Since 62 is not a power of 2, there will be 'information loss' (according to the Sequencer manual [1]): rather than use a 6-bit encoding (the number of bits needed to encode 62 unique items), Burp just uses 5 bits, and some of our data can't be represented in the bit-level analysis.
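
To put numbers on that, here is a minimal Python sketch of the arithmetic. It assumes Sequencer keeps floor(log2(charset size)) whole bits per position; that rule is my inference from the 5 bits observed above, not something taken from Burp's code, and the function names are made up for illustration.

```python
import math

TOKEN_LENGTH = 16  # characters per token, as in the tests above


def bits_kept(charset_size: int) -> int:
    """Whole bits per position if only floor(log2(size)) bits are kept (assumed rule)."""
    return charset_size.bit_length() - 1


def bits_needed(charset_size: int) -> int:
    """Whole bits per position needed to give every character its own code."""
    return (charset_size - 1).bit_length()


for size in (62, 64):
    kept = bits_kept(size) * TOKEN_LENGTH
    needed = bits_needed(size) * TOKEN_LENGTH
    ideal = TOKEN_LENGTH * math.log2(size)  # entropy of a perfectly random token
    print(f"charset {size}: kept {kept} bits, needed {needed} bits, ideal {ideal:.1f} bits")
```

For a 62-character charset this gives 80 bits kept against 96 needed, which would line up with the "up to 79" above if Sequencer numbers bit positions from 0 (again, a guess).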

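Extrapolating a bit, I'm guessing that what it does internally is drop the most significant bit from each 6-bit character encoding. The result is that characters stop having unique codes: the 30 characters whose 6-bit codes are 32 to 61 would collide with the 30 characters coded 0 to 29, so 60 of our 62 original characters share a 5-bit value with another character and the 5-bit values are no longer evenly distributed. It is no surprise that such an encoding would fail a randomness test.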
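
A quick way to see what that guess would imply is to apply it to every character and count. This is only a sketch of the hypothesis, not Burp's actual encoding: the character ordering and the `truncated_code` helper are assumptions for illustration.

```python
import string
from collections import Counter

# 62-character alphanumeric charset; the ordering is an arbitrary assumption.
CHARSET = string.ascii_lowercase + string.ascii_uppercase + string.digits


def truncated_code(ch: str) -> int:
    """Hypothetical encoding: the character's 6-bit index with the top bit dropped."""
    return CHARSET.index(ch) & 0b11111


def code_counts() -> Counter:
    """How many characters land on each 5-bit code."""
    return Counter(truncated_code(ch) for ch in CHARSET)


def bit_one_counts() -> list:
    """For each of the 5 bits, how many of the 62 characters set that bit."""
    return [
        sum((truncated_code(ch) >> bit) & 1 for ch in CHARSET)
        for bit in range(5)
    ]


counts = code_counts()
print("codes shared by two characters:", sum(1 for n in counts.values() if n == 2))
print("codes used by one character:", sorted(c for c, n in counts.items() if n == 1))
print("characters setting each bit (out of 62):", bit_one_counts())
```

Under this assumed encoding, four of the five bits are set by 30 of the 62 characters rather than 31, so truly random input would show those bits as 1 slightly less than half the time. Over 160,000 character positions, a bias like that is the sort of thing a bit-level test could flag.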

Being lazy and not having access to the source code, I decided to take a bit of a shortcut to pseudo-confirm this hypothesis. I generated a new set of 10,000 16-character tokens, this time from the charset "a-zA-Z0-9#!", giving a total of 64 characters (2^6, so exactly 6 bits per character). Here is the glorious result in the summary:

"The overall quality of randomness within the sample is estimated to be: excellent. At a significance level of 1%, the amount of effective entropy is estimated to be: 86 bits."

EDIT: Repeated the test for a 16 character charset and got the following, as expected: "The overall quality of randomness within the sample is estimated to be: very good. At a significance level of 1%, the amount of effective entropy is estimated to be: 71 bits."
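
For anyone wanting to repeat these runs with their own charsets, here is one possible way to produce the token files in Python. It is not how the tokens above were generated; `make_tokens`, `write_tokens` and the file name are hypothetical, and `secrets` simply stands in as a convenient source of good randomness.

```python
import secrets
import string
from typing import List

# "a-zA-Z0-9#!": 26 + 26 + 10 + 2 = 64 characters
CHARSET_64 = string.ascii_letters + string.digits + "#!"


def make_tokens(charset: str, count: int = 10000, length: int = 16) -> List[str]:
    """Return `count` tokens, each `length` characters drawn uniformly from `charset`."""
    return [
        "".join(secrets.choice(charset) for _ in range(length))
        for _ in range(count)
    ]


def write_tokens(path: str, tokens: List[str]) -> None:
    """Write one token per line."""
    with open(path, "w") as fh:
        fh.write("\n".join(tokens) + "\n")
```

Calling `write_tokens("tokens.txt", make_tokens(CHARSET_64))` gives a one-token-per-line file that can be loaded into Sequencer manually; swapping in a different charset string is all it takes to test other sizes.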

tl;dr: If your charset size is not a power of 2, don't put too much stock in the summary or bit-level analysis in Burp Sequencer. In fact, the closer to a power of 2 it is without being one, the worse this will be. Instead, just use the character-level analysis; this will remain reliable and informative.
