Technical Challenges Faced by Vulnerability Scanners
The barriers to automation described previously lead to a number of specific technical challenges that must be addressed in the creation of an effective vulnerability scanner. These challenges impinge not only upon the scanner’s ability to detect specific types of vulnerability, as already described, but also upon its ability to perform the core tasks of mapping the application’s content and probing for defects.
Authentication and Session Handling
The scanner must be able to work with the authentication and session handling mechanisms used by different applications. Frequently, the majority of an application’s functionality can only be accessed using an authenticated session, and a scanner that fails to operate using such a session will miss many detectable flaws.
In current scanners, the authentication part of this problem is addressed by allowing the user of the scanner to provide a login script or to walk through the authentication process using a built-in browser, enabling the scanner to observe the specific steps involved in obtaining an authenticated session.
The session-handling part of the challenge is less straightforward to address and comprises the following two problems:
■ The scanner must be able to interact with whatever session-handling mechanism is used by the application. This may involve transmitting a session token in a cookie, in a hidden form field, or within the URL query string. Tokens may be static throughout the session or may change on a per-request basis, or the application may employ a different custom mechanism altogether.
■ The scanner must be able to detect when its session has ceased to be valid, and so return to the authentication stage to acquire a new one. This may occur for various reasons — for example, because the scanner has requested the logout function, or because the application has terminated the session as a result of the scanner performing some abnormal navigation or submitting some invalid input. The scanner must detect this both during its initial mapping exercises and during its subsequent probing for vulnerabilities. Different applications behave in very different ways when a session becomes invalid, and for a scanner that only analyzes the syntactic content of application responses, this may be a difficult challenge to meet in general, particularly if a nonstandard session handling mechanism is used.
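The second problem above, detecting a dead session and returning to the authentication stage, can be sketched roughly as follows. This is a minimal illustration and not how any particular scanner works: the `Response` type, the `/login` path, the `send` and `login` callables, and the three heuristics are all assumptions made for the example, and a real application may signal an invalid session in ways that none of them catch.

```python
import re
from typing import Callable, Mapping, NamedTuple


class Response(NamedTuple):
    """Hypothetical, simplified view of an HTTP response."""
    status: int
    headers: Mapping[str, str]
    body: str


# Heuristic: a password field in the body suggests we were shown a login form.
PASSWORD_FIELD = re.compile(r'<input[^>]+type=["\']?password', re.IGNORECASE)


def session_looks_invalid(resp: Response, login_path: str = "/login") -> bool:
    """Guess whether the response indicates that the session is no longer valid."""
    if resp.status == 401:
        return True
    # Assumes the header name is stored with this exact capitalization.
    location = resp.headers.get("Location", "")
    if resp.status in (301, 302, 303, 307) and login_path in location:
        return True
    return bool(PASSWORD_FIELD.search(resp.body))


def fetch_with_reauth(
    send: Callable[[str], Response],
    login: Callable[[], None],
    url: str,
    max_logins: int = 2,
) -> Response:
    """Request url, logging in again (a bounded number of times) if the session died."""
    resp = send(url)
    logins = 0
    while session_looks_invalid(resp) and logins < max_logins:
        login()
        logins += 1
        resp = send(url)
    if session_looks_invalid(resp):
        raise RuntimeError(f"could not obtain a valid session for {url}")
    return resp
```

The cap on login attempts matters: if the request itself is what causes the application to terminate the session, an uncapped loop would log in and be thrown out indefinitely.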
Dangerous Effects
In many applications, running an unrestricted automated scan without any user guidance may be highly dangerous to the application and the data it contains. For example, a scanner may discover an administration page that contains functions to reset user passwords, delete accounts, and so on. If the scanner blindly requests every function, this may result in access being denied to all users of the application. Similarly, the scanner may discover a vulnerability that can be exploited to seriously corrupt the data held within the application. For example, in some SQL injection vulnerabilities, submitting standard SQL attack strings such as ' or 1=1-- causes unforeseen operations to be performed on the application’s data. A human being who understands the purpose of a particular function may proceed with caution for this reason, but an automated scanner lacks this understanding.
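One crude mitigation is to give the scanner a list of requests to skip. The sketch below flags URLs whose path or parameters contain words that often accompany destructive functions. The word list and the `is_risky` name are assumptions chosen for illustration; a keyword match is no substitute for a human who understands what each function does, and it will both miss dangerous functions and flag harmless ones.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical keywords; tune these for the application being tested.
RISKY_WORDS = ("logout", "signout", "delete", "remove", "reset", "shutdown")


def is_risky(url: str) -> bool:
    """Return True if the path, a parameter name, or a parameter value looks destructive."""
    parts = urlsplit(url)
    texts = [parts.path.lower()]
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        texts.append(name.lower())
        texts.append(value.lower())
    return any(word in text for text in texts for word in RISKY_WORDS)


if __name__ == "__main__":
    for candidate in (
        "https://app.example/admin/deleteUser?id=7",
        "https://app.example/account/view?id=7",
        "https://app.example/do?action=resetPassword",
    ):
        print(candidate, "->", "skip" if is_risky(candidate) else "scan")
```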
Individuating Functionality
There are many situations in which a purely syntactic analysis of an application will fail to correctly identify its core set of individual functions:
■ Some applications contain a colossal quantity of content that embodies the same core set of functionality. For example, applications like eBay, MySpace, and Amazon contain literally millions of different application pages with different URLs and content, yet these correspond to a relatively small number of actual application functions.
■ Some applications may have no finite boundary when analyzed from a purely syntactic perspective. For example, a calendar application may allow users to navigate to any date. Similarly, some applications with a finite amount of content employ volatile URLs or request parameters to access the same content on different occasions, leading scanners to continue mapping indefinitely.
■ The scanner’s own actions may result in the appearance of seemingly new content. For example, submitting a form may cause a new link to appear in the application’s interface, and accessing the link may retrieve a further form that has the same behavior.
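A common way to approximate "the same function" syntactically is to reduce each URL to a signature and limit how many requests are spent on any one signature. The sketch below is one such approximation under stated assumptions: the signature (host, path with numeric segments collapsed, sorted parameter names) and the `CrawlBudget` class are invented for this example. It will not help where the function is selected by a parameter value, and it cannot tell that two differently shaped URLs reach the same code.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def request_signature(url: str) -> tuple:
    """Collapse a URL to (host, path shape, parameter names), ignoring parameter values."""
    parts = urlsplit(url)
    # Treat purely numeric path segments as interchangeable identifiers.
    segments = ["{n}" if seg.isdigit() else seg for seg in parts.path.split("/")]
    names = sorted({name for name, _ in parse_qsl(parts.query, keep_blank_values=True)})
    return parts.netloc.lower(), "/".join(segments), tuple(names)


class CrawlBudget:
    """Allow at most per_signature visits to URLs that share a signature."""

    def __init__(self, per_signature: int = 3) -> None:
        self.per_signature = per_signature
        self.seen = Counter()

    def should_visit(self, url: str) -> bool:
        sig = request_signature(url)
        if self.seen[sig] >= self.per_signature:
            return False
        self.seen[sig] += 1
        return True
```

With a budget of two, a calendar that links to `/calendar?date=...` for every possible date would be requested twice and then skipped, which puts a bound on an otherwise unbounded crawl.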
Other Challenges to Automation
Some applications implement defensive measures specifically designed to prevent them from being accessed by automated client programs. These measures include reactive session termination in the event of anomalous activity, and the use of CAPTCHAs and other controls designed to ensure that a human being is responsible for particular requests.
In general, the spidering function of the scanner faces the same challenges as web application spiders more generally, such as recognizing customized “not found” responses and interpreting client-side code. Many applications implement fine-grained validation over particular items of input, such as the fields on a user registration form. If the spider populates the form with invalid input, and is unable to understand the error messages generated by the application, it may never proceed beyond this form to some important functions lying behind it.
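For the customized “not found” problem, one widely used heuristic is to request a path that almost certainly does not exist, keep the response as a baseline, and treat later responses that closely resemble it as "not found" regardless of status code. The sketch below shows only the comparison step. The function names and the 0.9 threshold are assumptions for illustration, and pages with a lot of dynamic content may need a different similarity measure.

```python
import difflib
import uuid


def random_probe_path() -> str:
    """Build a path that is very unlikely to exist on the target."""
    return "/" + uuid.uuid4().hex


def looks_like_not_found(body: str, baseline: str, threshold: float = 0.9) -> bool:
    """Compare a response body with the baseline body fetched for a nonexistent path."""
    similarity = difflib.SequenceMatcher(None, body, baseline).ratio()
    return similarity >= threshold
```

A similarity ratio is used instead of an exact match because custom error pages often echo the requested path or include a timestamp, so two "not found" responses are rarely byte-for-byte identical.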
Using a Vulnerability Scanner
In real-world situations, the effectiveness of using a vulnerability scanner depends hugely upon the application you are targeting. The inherent strengths and weaknesses that we have described impinge upon different applications in different ways, depending on the types of functionality and vulnerabilities that the applications contain.
Of the various kinds of vulnerability commonly found within web applications, automated scanners are inherently capable of discovering approximately half: those for which a standard attack string and signature exist. Within the subset of vulnerability types that scanners are able to detect, they do a good job of identifying individual cases, although they miss the more subtle and unusual instances of these. Overall, you may expect that running an automated scan will identify some but not all of the low-hanging fruit within a typical application.
If you are a novice, or you are attacking a large application with limited time available, running an automated scan can bring clear benefits, because it will quickly identify several leads for further manual investigation, enabling you to get an initial handle on the security posture of the application and the types of flaws that exist. It will also provide you with a useful overview of the target application and highlight any unusual areas that warrant further detailed attention.
If you are an expert at attacking web applications, and are serious about finding as many vulnerabilities as possible within your target, you will be all too aware of the inherent limitations of vulnerability scanners, and will not fully trust them to completely cover any individual category of vulnerability. While the results of a scan will be interesting and prompt manual investigation of specific issues, you will typically want to perform a full manual test of every area of the application for every type of vulnerability, in order to satisfy yourself that the job has been done properly.
In any situation where you employ a vulnerability scanner, there are some key points to keep in mind to ensure that you make the most effective use of it:
■ Be aware of the kinds of vulnerabilities that scanners can detect and those that they cannot.
■ Be familiar with your scanner’s functionality, and know how to leverage its configuration to be the most effective against a given application.
■ Familiarize yourself with the target application before running your scanner, so that you can make the most effective use of it.
■ Be aware of the risks associated with spidering powerful functionality and automatically probing for dangerous bugs.
■ Always manually confirm any potential vulnerabilities reported by the scanner.
■ Be aware that scanners are extremely noisy and leave a significant footprint in the logs of the server and any IDS defenses. Do not use a scanner if you are aiming to be stealthy.