Predictable Tokens
Some session tokens do not contain any meaningful data associating them with a particular user but are nevertheless guessable because they contain sequences or patterns that allow an attacker to extrapolate from a sample of tokens to find other valid tokens recently issued by the application. Even if the extrapolation involves an amount of trial and error, this will still enable an automated attack to identify large numbers of valid tokens in a relatively short period of time.
Vulnerabilities relating to predictable token generation may be much easier to discover in commercial implementations of session management, such as web servers or web application platforms, than they are in bespoke applications. When you are remotely targeting a bespoke session management mechanism, your sample of issued tokens may be restricted by the capacity of the server, the activity of other users, your bandwidth, network latency, and so on. In a laboratory environment, however, you can quickly create millions of sample tokens, all precisely sequenced and time-stamped, and can eliminate interference caused by other users.
In the simplest and most brazenly vulnerable cases, an application may use a simple sequential number as the session token. In this case, you only need to obtain a sample of two or three tokens before launching an attack that will capture 100% of currently valid sessions very quickly.
Figure -1 shows Burp Intruder being used to cycle the last two digits of a sequential session token to find values where the session is still active and can be hijacked. The length of the server’s response is here a reliable indicator that a valid session has been found.
Figure -1: An attack to discover valid sessions where the session token is predictable
In other cases, an application’s tokens may contain more elaborate sequences that take some effort to discover. The types of potential variations one might encounter here are open ended, but the authors’ experience in the field indicates that predictable session tokens commonly arise from three different sources:
■ Concealed sequences
■ Time dependency
■ Weak random number generation
Concealed Sequences
It is common to encounter session tokens that cannot be trivially predicted when analyzed in their raw form but that contain sequences that reveal them selves when the tokens are suitably decoded or unpacked.
Consider the following series of values, which form one component of a structured session token:
lwjVJA
Ls3Ajg
xpKr+A
XleXYg
9hyCzA
jeFuNg
JaZZoA
No immediate pattern is discernible; however, a cursory inspection indicates that the tokens may contain Base64-encoded data — in addition to the mixed-case alphabetical and numeric characters, there is a + character, which is also valid in a Base64-encoded string. Running the tokens through a Base64 decoder reveals the following:
–Õ$
.ÍÀŽ
Æ’«ø
^W-b
ö‚Ì
?án6
%¦Y
These strings appear to be gibberish and also contain nonprinting characters. This normally indicates that you are dealing with binary data rather than ASCII text. Rendering the decoded data as hexadecimal numbers gives you:
9708D524
2ECDC08E
C692ABF8
5E579762
F61C82CC
8DE16E36
25A659A0
There is still no visible pattern. However, if you subtract each number from the previous one, you arrive at the following:
FF97C4EB6A
97C4EB6A
FF97C4EB6A
97C4EB6A
FF97C4EB6A
FF97C4EB6A
which immediately reveals the concealed pattern. The algorithm used to generate tokens adds 0x97C4EB6A to the previous value, truncates the result to a 32-bit number, and Base64-encodes this binary data to allow it to be transported using the text-based protocol HTTP. Using this knowledge, you can easily write a script to produce the series of tokens that the server will next produce, and the series that it produced prior to the captured sample.
Time Dependency
Some web servers and applications employ algorithms for generating session tokens that use the time of generation as an input to the token’s value. If insufficient other entropy is incorporated into the algorithm, then you may be able
to predict other users’ tokens. Although any given sequence of tokens on its own may appear to be completely random, the same sequence coupled with information about the time at which each token was generated may contain a discernible pattern. In a busy application, with large numbers of sessions being created per second, a scripted attack may succeed in identifying large numbers of other users’ tokens. When testing the web application of an online retailer, the authors encountered the following sequence of session tokens:
3124538-1172764258718
3124539-1172764259062
3124540-1172764259281
3124541-1172764259734
3124542-1172764260046
3124543-1172764260156
3124544-1172764260296
3124545-1172764260421
3124546-1172764260812
3124547-1172764260890
Each token is clearly composed of two separate numeric components. The first number follows a simple incrementing sequence and is trivial to predict. The second number is increasing by a varying amount each time. Calculating the differences between its value in each successive token reveals the following:
344
219
453
312
110
140
125
391
78
The sequence does not appear to contain a reliably predictable pattern; how ever, it would clearly be possible to brute force the relevant number range in an automated attack to discover valid values in the sequence. Before attempting this attack, however, we wait a few minutes and gather a further sequence of tokens:
3124553-1172764800468
3124554-1172764800609
3124555-1172764801109
3124556-1172764801406
3124557-1172764801703
3124558-1172764802125
3124559-1172764802500
3124560-1172764802656
3124561-1172764803125
3124562-1172764803562
Comparing this second sequence of tokens with the first, two points are immediately obvious:
■ The first numeric sequence continues to progress incrementally; however, five values have been skipped since the end of our first sequence. This is presumably because the missing values have been issued to other users, who logged into the application in the window between the two tests.
■ The second numeric sequence continues to progress by similar intervals as before; however, the first value we obtain is a massive 539,578 greater than the previous value.
This second observation immediately alerts us to the role played by time in generating session tokens. Apparently, only five tokens have been issued between the two token-grabbing exercises. However, a period of approximately 10 minutes has also elapsed. The most likely explanation is that the second number is time-dependent and is probably a simple count of milliseconds.
Indeed, our hunch is correct, and in a subsequent phase of our testing we perform a code review, which reveals the following token-generation algorithm:
String sessId = Integer.toString(s_SessionIndex++) +
“-“ +
System.currentTimeMillis();
Given our analysis of how tokens are created, it is straightforward to construct a scripted attack to harvest the session tokens that the application issues to other users:
■ We continue polling the server to obtain new session tokens in quick succession.
■ We monitor the increments in the first number. When this increases by more than one, we know that a token has been issued to another user.
■ When a token has been issued to another user, we know the upper and lower bounds of the second number that was issued to them, because we possess the tokens that were issued immediately before and after theirs. Because we are obtaining new session tokens frequently, the range between these bounds will typically consist of only a few hundred values.
■ Each time a token is issued to another user, we launch a brute-force attack to iterate through each number in the range, appending this to the missing incremental number that we know was issued to the other user. We attempt to access a protected page using each token we construct, until the attempt succeeds and we have compromised the user’s session.
■ Running this scripted attack continuously will enable us to capture the session token of every other application user. When an administrative user logs in, we will fully compromise the entire application.
Weak Random Number Generation
Very little that occurs inside a computer is random. Therefore, when randomness is required for some purpose, software uses various techniques to generate numbers in a pseudo-random manner. Some of the algorithms used produce sequences that appear to be stochastic and manifest an even spread across the range of possible values, but can nevertheless be extrapolated for wards or backwards with perfect accuracy by anyone who obtains a small sample of values.
When a predictable pseudo-random number generator is used for producing session tokens, the resulting tokens are vulnerable to sequencing by an attacker.
Jetty is a popular web server written in 100% Java, which provides a session management mechanism for use by applications running on it. In 2006, Chris Anley of NGSSoftware discovered that the mechanism was vulnerable to a session token prediction attack. The server used the Java API java.util. Random to generate session tokens. This implements a “linear congruential generator,” which generates the next number in the sequence as follows:
synchronized protected int next(int bits) {
seed = (seed * 0x5DEECE66DL + 0xBL) & ((1L << 48) – 1);
return (int)(seed >>> (48 – bits));
}
This algorithm in effect takes the last number generated, multiplies it by one constant, and adds another constant, to obtain the next number. The number is truncated to 48 bits, and the algorithm shifts the result to return the specific number of bits requested by the caller.
Knowing this algorithm and a single number generated by it, we can easily derive the sequence of numbers that the algorithm will generate next, and also derive the sequence that it generated previously. This means that an attacker who obtains a single session token from the server can obtain the tokens of all current and future sessions.
Full-Blown Tests for Randomness
Due to the importance of robust session token generation, performing an effective attack against a security-critical application such as an online bank may require carrying out a full-blown methodology to test the randomness of its tokens. If you do not have access to source code, this will be a black-box exercise.
NEXT is..Weaknesses in Session Token Handling.,.,.,.,.,.,.