Administering a Distributed Intrusion Detection System
Johannes B. Ullrich and Wayne Larmon
Intrusion detection systems (IDSs) have become ubiquitous. Thanks to cheap
commercial systems and relatively easy-to-use freeware, an increasing
number of small business users, and even home users, are collecting valuable
intrusion detection logs. In the same way that malware (malicious software)
has prospered under the collaborative effort of contributors, intrusion detection
systems can also benefit from a boost in efficiency through collaboration.
This article describes the efforts of DShield.org (http://www.dshield.org)
to build a global, distributed intrusion detection system. DShield.org assembles
and analyzes detection log data from networks all over the world. In all, DShield.org
processes one to two million events per day. DShield's users include many
home users and some operators of large networks. This user base provides the
DShield database with a roughly representative sample of network activity in
the wild. DShield studies the incoming data for unusual activity, producing
a daily block list of sites that appear to be participating in Internet attacks,
and notifying ISPs that their systems may have been compromised. See the sidebar
for the principles of DShield.
The payoff of this system is already evident. Internet worms have been detected earlier, and the development of countermeasures has been accelerated by the early detection provided by DShield.org.
System Operation
DShield.org provides a number of clients, which are small programs
used to submit the logs. Typically, the client will parse the log at the user's
machine and submit a log in a standardized format to DShield.org via email.
Incoming email messages are queued and parsed, and a number of reports are generated
showing current reported activity and trends over time.
A number of additional services that show the benefits of such a system are provided: a recommended block list of networks that are currently scanning other networks; the option to report sources of attacks to network operators; and the ability to find out whether other systems are being attacked by the same sources using the same methods.
The closest comparable service to DShield.org is ARIS, operated by SecurityFocus
(http://aris.securityfocus.com/).
ARIS focuses on corporate users and offers extended reports and alerts as part
of its Predictor service. Some firewall and intrusion detection
companies also offer correlation software that provides similar functionality.
This software, if implemented, will correlate all events collected by participants.
However, participants are limited to organizations implementing this system,
so the software does not provide a global view.
Reports
The Copernican Principle, known to astronomers and others, can
be applied to many phenomena. This principle assumes that one's position
as an observer in the universe is not special in any way, and therefore, one's
observations can be considered representative. However, this principle is not
considered valid for intrusion detection. Many threats target particular networks
or prefer some networks over others. Therefore, it is critical to compare one's
observations to others in order to draw the correct conclusions.
A distributed intrusion detection system (dIDS) provides an easy way of doing this. For example, DShield.org provides each user with customized reports detailing how many other users observed the same source or the same attack. In some instances, it will also link to more detailed descriptions of the nature of the attack. For the public, reports are generated showing the overall distribution of attacks by origin and type.
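The per-user correlation described above can be sketched in a few lines of Python. This is only an illustration: the record layout (reporter, source address, target port) and the function name are assumptions made for the example, not DShield's actual schema or code.

```python
from collections import defaultdict

def count_other_reporters(reports, me):
    """For each source seen by `me`, count how many *other* reporters saw it.

    `reports` is an iterable of (reporter_id, source_ip, target_port) tuples;
    this layout is hypothetical, not DShield's actual schema.
    """
    reporters_by_source = defaultdict(set)
    for reporter, source, _port in reports:
        reporters_by_source[source].add(reporter)
    # Keep only the sources this user reported, and exclude the user itself
    # from the count.
    return {
        source: len(reporters - {me})
        for source, reporters in reporters_by_source.items()
        if me in reporters
    }

# Made-up sample data using documentation address ranges.
sample = [
    ("alice", "192.0.2.10", 80),
    ("bob", "192.0.2.10", 80),
    ("carol", "192.0.2.10", 21),
    ("alice", "198.51.100.7", 111),
    ("bob", "203.0.113.5", 80),
]
summary = count_other_reporters(sample, "alice")
```

A source that only one user has reported is more likely to be a slip or a targeted probe, while a source reported by many users points to a widespread scan.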
Two products of DShield.org, a proactive block list and the reports
sent to administrators of systems implicated by reports (Fightback),
are explained in more detail below. Most reports are consolidated
at the Internet Storm Center (http://isc.dshield.org),
which provides cross-linked reports to quickly analyze current events.
Geographic Attack Distribution
Early on, one problem was finding the geographic distribution of attack sources.
It is an often-voiced opinion that Asia is contributing to attacks more than
other geographic areas. This assumption is mostly validated by data collected
so far. However, it appears that the number of attacks originating in Asia is
as much a factor of vulnerable systems in these countries being used as relays
as it is a factor of users in these areas instigating attacks. DShield.org recently
started collaborating with the Korean Computer Emergency Response Coordination
Center (KRCERT/CC). A daily data feed summarizing the attacks that originate
in Korea is used by the KRCERT/CC to track an Internet Service Provider's
(ISP) performance over time.
Attacks by Target Port
A first indication of the nature of an attack is the port targeted. While
it is not conclusive to use the target port to identify an attack, shifts in
port targets are indicative of shifts in attacks used to target networks. In
several cases, these shifts have shown up early in the outbreak of new attacks.
Figure 1 shows the increase in port
80 reports as a result of the Code Red outbreak in July 2001. As early as July
13th, DShield.org provided an indication of a significant increase in port 80
attacks.
DShield developed a process to further follow up on this report. If a significant increase is detected, the user who originally submitted the report is contacted. Further information provided by these users (i.e., full packet logs or statements regarding network configuration) is analyzed. If there is reason to suspect a new attack, we attempt to capture the responsible code and issue a warning. This approach maintains the agility of the system, which is based on limited header information, and enables us to back up an alert with additional data, if necessary.
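One simple way to flag a "significant increase" for a port is to compare the latest daily count with the average of the preceding days. The Python sketch below does this; the threshold factor, the minimum report count, and the sample numbers are all assumptions made for illustration, and the source does not state which test DShield actually applies.

```python
from statistics import mean

def port_spikes(daily_counts, factor=3.0, min_reports=100):
    """Return ports whose latest daily count exceeds `factor` times the
    mean of the preceding days.

    `daily_counts` maps a port number to a list of daily report counts,
    oldest first. `factor` and `min_reports` are illustrative thresholds.
    """
    flagged = []
    for port, counts in sorted(daily_counts.items()):
        if len(counts) < 2:
            continue  # no history to compare against
        *history, today = counts
        baseline = mean(history)
        if today >= min_reports and today > factor * baseline:
            flagged.append(port)
    return flagged

# Made-up counts, not DShield data.
counts = {
    80: [1200, 1100, 1300, 9800],
    21: [400, 420, 390, 410],
    111: [5, 4, 6, 40],
}
spikes = port_spikes(counts)
```

The minimum report count keeps rarely seen ports from being flagged on the strength of a handful of packets.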
Attack Persistence
As an attacker scans large network blocks, a single target will not be able
to ascertain whether the attack against it was a single slip (e.g.,
a user typing a wrong IP address), a targeted attack, or part of a widespread
hunt for vulnerable systems. A collaborative system like DShield, however, can
follow an attack source as it scans multiple networks. Figure
2 shows a graph of the persistence of attacks. The plot shows the time between
the first and last attack reported to DShield.
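The persistence value plotted here, the time between the first and last report for a source, can be computed as in the following Python sketch. The (source, timestamp) input layout and the sample values are assumptions made for the example.

```python
from datetime import datetime

def persistence_by_source(reports):
    """Map each source to the time between its first and last report.

    `reports` is an iterable of (source_ip, timestamp) pairs; the layout
    is hypothetical.
    """
    first_last = {}
    for source, timestamp in reports:
        first, last = first_last.get(source, (timestamp, timestamp))
        first_last[source] = (min(first, timestamp), max(last, timestamp))
    return {src: last - first for src, (first, last) in first_last.items()}

# Made-up sample reports.
reports = [
    ("192.0.2.10", datetime(2001, 7, 13, 8, 0)),
    ("192.0.2.10", datetime(2001, 7, 13, 12, 30)),
    ("198.51.100.7", datetime(2001, 7, 13, 9, 0)),
]
spans = persistence_by_source(reports)
```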
Interestingly, the distribution can be explained by a statistical fit using the assumption that 99.5% of the systems are taken offline after an average of five hours (the half-life) and the remaining 0.5% will remain scanning for an average of five days. The function used for this fit is a sum of two exponential decays:
A * (r1 * exp(-x * ln(2) / h1) + r2 * exp(-x * ln(2) / h2))
where x is the elapsed time, A is the total number of infected machines at the beginning, r1 and r2 are the fractions of machines that belong to the fast and slow components (r1=0.95, r2=0.05), and h1 and h2 are the half-lives, after which half of the infected machines in each component are fixed (h1=five hours, h2=five days). (Later, we will describe our Fightback program, which attempts to improve this ratio.)
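The fit function can be evaluated directly. The Python sketch below uses the parameter values listed with the formula (r1=0.95, r2=0.05, h1 of five hours, h2 of five days expressed as 120 hours); the fractions are ordinary parameters, so other values can be substituted. The function name is made up for the example.

```python
import math

def remaining_sources(x_hours, total, r1=0.95, r2=0.05, h1=5.0, h2=120.0):
    """Sum of two exponential decays, with half-lives h1 and h2 in hours.

    Returns the expected number of sources still active after `x_hours`.
    """
    fast = r1 * math.exp(-x_hours * math.log(2) / h1)
    slow = r2 * math.exp(-x_hours * math.log(2) / h2)
    return total * (fast + slow)

# After one fast half-life, the fast component has dropped by half while
# the slow component has barely changed.
after_five_hours = remaining_sources(5.0, total=1000)
```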
Proactive Block List
A common use of intrusion detection systems is to assemble a list of blocked
or banned IP addresses. For example, if an IDS monitoring a public
Web server, which cannot block port 80 for incoming traffic globally, detects
a large number of HTTP intrusion attempts from a given network, it may decide
to block future access to its system from this network. However, such a block
can only be implemented after the scan is detected, which is usually too late.
Using a dIDS allows users to learn from attacks detected by others and build a proactive consensus-based block list. This list will include networks that have a recent history of being abused as attack sources. A regularly updated list allows network administrators to maximize the accessibility of their networks. Instead of blocking large IP blocks, they could focus on smaller networks based on evidence collected by others. Widespread implementation of such a block list may also force listed networks to become more proactive in eliminating malicious activity from the networks.
Currently, DShield generates a daily block list. It lists the top 20 attack
sources for the previous three days. Instead of focusing on individual IP addresses,
the list summarizes class C networks. The list also includes a number of reserved
addresses, which are frequently used to spoof sources in a Distributed Denial
of Service (DDoS) attack. The list is available via HTTP and HTTPS at http://feeds.dshield.org/block.txt
or https://secure.dshield.org/feeds/block.txt.
The format is a simple tab-delimited format, which eases parsing by automated
scripts. A PGP signature is provided at http://feeds.dshield.org/block.txt.asc.
(See http://www.dshield.org/block_list_info.html
for current information on using the block list.)
DShield.org manually reviews this block list and notifies the networks that are on the list so that they can preclude this behavior. The primary criterion for inclusion is the number of targets that have reported attacks from a listed network over the previous three days. The total number of different targets (rather than the total number of accesses) is a better indicator of danger because the attacks are attempting to infect or exploit a large number of machines. Even after a network has been added to the block list, attacks will continue. When implemented, this list can prevent users from being affected by the attacks.
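The ranking criterion, distinct targets per class C network rather than raw access counts, can be sketched as follows in Python. The input layout is an assumption, the reports passed in are assumed to be already limited to the previous three days, and this is not DShield's own code.

```python
from collections import defaultdict
import ipaddress

def top_networks(reports, limit=20):
    """Rank /24 networks by the number of distinct targets reporting them.

    `reports` is an iterable of (source_ip, target_id) pairs; the layout
    is hypothetical.
    """
    targets_by_net = defaultdict(set)
    for source_ip, target_id in reports:
        # strict=False masks the host bits, e.g. 192.0.2.10 -> 192.0.2.0/24
        net = ipaddress.ip_network(f"{source_ip}/24", strict=False)
        targets_by_net[net].add(target_id)
    # Most distinct targets first; ties are broken by network address.
    ranked = sorted(
        targets_by_net.items(),
        key=lambda item: (-len(item[1]), item[0]),
    )
    return [(str(net), len(targets)) for net, targets in ranked[:limit]]

# Made-up sample: repeated accesses to the same target count only once.
reports = [
    ("192.0.2.10", "t1"),
    ("192.0.2.99", "t2"),
    ("192.0.2.10", "t1"),
    ("198.51.100.7", "t1"),
    ("203.0.113.5", "t3"),
    ("203.0.113.6", "t3"),
]
top_two = top_networks(reports, limit=2)
```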
As an example, DShield.org includes a script to generate iptables rules using this block list. Writing a script to automatically generate iptables rules from an online-retrieved list like this poses a number of challenges. First, the script requires root privileges in order to run. Second, you must carefully validate the retrieved content to avoid running afoul of altered content that may include wrong information intended to block access for valid users.
While using Perl's taint mode is a minimum requirement, the script also requires the use of digital signatures to validate the content. The sample script below uses the PGP signature provided by DShield. It assumes that the necessary keys are already present in the executing user's keyring. An alternative and simpler method is to utilize HTTPS, but many users do not have an HTTPS-capable version of the Perl LWP module installed, and it is easier to install GNU Privacy Guard (GnuPG).
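The two validation steps, checking the detached signature and refusing any entry that is not a plain address, can be illustrated with a short Python sketch. It is not Listing 1. The column layout (first field is the network's start address) and the use of "#" for comment lines are assumptions about the tab-delimited file, and the function names are made up for the example.

```python
import ipaddress
import subprocess

def signature_is_valid(data_path, sig_path):
    """Check a detached signature with GnuPG.

    The signing key must already be in the calling user's keyring.
    """
    result = subprocess.run(
        ["gpg", "--verify", sig_path, data_path],
        capture_output=True,
    )
    return result.returncode == 0

def parse_block_list(text):
    """Return validated /24 networks from tab-delimited block list text.

    Assumes (hypothetically) that the first field of each data line is the
    network's start address and that '#' begins a comment line.
    """
    networks = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        first_field = line.split("\t")[0]
        try:
            address = ipaddress.IPv4Address(first_field)
        except ipaddress.AddressValueError:
            continue  # reject anything that is not a plain IPv4 address
        networks.append(ipaddress.ip_network(f"{address}/24", strict=False))
    return networks

# Made-up input with one malformed line that must not reach the firewall.
sample = (
    "# comment\n"
    "192.0.2.0\t192.0.2.255\n"
    "not-an-address; echo oops\t1\n"
    "203.0.113.0\t203.0.113.255\n"
)
nets = parse_block_list(sample)
```

Passing only parsed network objects to the rule generator, never the raw text, serves the same purpose as taint mode: nothing from the downloaded file reaches a root shell unchecked.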
The script generates a separate chain called BLOCKLIST. Using a new chain instead of adding the rules to an existing chain will ease maintenance and lessen the probability of interfering with existing rules. The BLOCKLIST chain should be called from the INPUT or FORWARD chains. A possible setup would look like this:
# allow trusted sources, which we never
# want to lock out
iptables -A INPUT -s (...trusted ip...) \
  (..further restrictions, e.g. port..) -j ACCEPT
# call BLOCKLIST
iptables -A INPUT -j BLOCKLIST
# execute remainder of firewall rules
# iptables -A INPUT ....
The same sequence can be used for other chains, such as the FORWARD chain. The Perl
script in Listing 1 will retrieve the block list and add the rules to the BLOCKLIST chain.
The relevant PGP public keys can be found at http://www.dshield.org/dshield_public_key.txt.
You may want to define a small chain to log blocked accesses distinctively.
For example, use a chain like:
$IPTABLES -N LOGBLOCK
$IPTABLES -A LOGBLOCK -j LOG --log-level warning \
  --log-prefix "filter: BLOCKLIST "
$IPTABLES -A LOGBLOCK -j DROP
To use this new custom chain, change the following in Listing 1:
my $blocktarget='DROP'
to read:
my $blocktarget='LOGBLOCK'
Eliminating Attacks
It is important to notify administrators that machines under their control
have been accessing other machines in a hostile manner. Administrators can then
investigate the suspected machine to determine whether the accesses were caused
by a user performing cracking activity or, more likely, by a compromised machine
that is attempting to compromise other machines. The vast increase (compared
to the days when only professional administrators maintained firewalls) in the
number of firewall users causes administrators to be deluged with an amplified
number of abuse reports.
One problem that can occur when individual users send abuse reports is that activity that could be considered hostile might actually have been caused by an innocent mistake, such as mistyping a URL. Therefore, if individual users send abuse reports, there is the danger that administrators will be flooded with abuse reports that are based on innocent mistakes.
A second potential problem caused by individual users submitting abuse reports is that most residential users of personal firewalls are not trained in security. Consequently, they may not know how to differentiate true hostile activity from normal network activity, such as DHCP (Dynamic Host Configuration Protocol) traffic.
A third problem is the lack of standardization for abuse reporting when individuals submit abuse reports. Each one is different, meaning that administrators receiving these reports must spend additional time studying them to determine whether the data is significant. A standard format abuse report allows administrators to quickly scan for relevant information. An even better solution would be to eliminate sending abuse reports by email altogether and replace them with more efficient summary reports tailored to an administrator's needs.
DShield attempts to alleviate these problems by encouraging its users to let DShield send the abuse reports. DShield-generated Fightback abuse reports are only sent after a summary report from the database shows that accesses from a given source IP fit certain criteria. The accesses must:
- Be from a port that we consider indicative of suspicious activity
- Have been logged by a minimum number of separate target machines
- Not have been reported to the administrator for this IP in the past month
- Include at least one report from a submitting user who agreed to have reports forwarded
If, and only if, these criteria are met will DShield send a Fightback abuse report to the administrator of the network that controls the implicated source IP. The abuse report summarizes the suspected hostile access activity, giving log samples that show details of the suspected hostile accesses. A coded link is provided to a custom report describing the incident and showing all accesses linked to this source IP. This report includes accesses submitted to our database after the abuse message was sent, so that a concerned administrator can periodically check the database to see whether this machine has truly ceased the hostile activity.
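The four criteria combine into a single all-or-nothing check, sketched below in Python. The field names, the 30-day reading of "the past month", and the minimum target count are assumptions made for illustration; the source does not give the actual thresholds.

```python
from datetime import datetime, timedelta

def should_send_report(incident, now, suspicious_ports, min_targets):
    """Return True only if all four criteria for an abuse report hold.

    `incident` is a dict with hypothetical keys: 'port', 'targets' (set of
    reporting machines), 'last_notified' (datetime or None), and
    'forwarding_consent' (submitter -> bool).
    """
    port_ok = incident["port"] in suspicious_ports
    targets_ok = len(incident["targets"]) >= min_targets
    last = incident["last_notified"]
    # "Past month" is approximated as 30 days in this sketch.
    not_recent = last is None or now - last > timedelta(days=30)
    consent_ok = any(incident["forwarding_consent"].values())
    return port_ok and targets_ok and not_recent and consent_ok

# Made-up incident for one source IP.
incident = {
    "port": 80,
    "targets": {"t1", "t2", "t3"},
    "last_notified": None,
    "forwarding_consent": {"t1": False, "t2": True, "t3": False},
}
now = datetime(2001, 8, 1)
decision = should_send_report(incident, now, {80, 111}, min_targets=3)
```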
For large networks or ISPs, DShield provides custom bulk abuse reports as an alternative to individual email abuse reports. These are worked out on a case-by-case basis with the network administrators.
Future Plans
With the potential of more users applying firewall rules and disabling unneeded
services, intrusion attempts are more likely to focus on the few remaining critical
business services still commonly exposed to the outside. As is already happening,
more information will be required to distinguish different types of attacks.
In the immediate future, collection of full packet content is planned from some
users. This will shorten our response time, as we will have full packets for
further real-time analysis.
Summary
Host- and network-based intrusion detection should be part of every administrator's
defense of a network against targeted attacks. While individual IDSs are frequently
criticized as being reactive and more useful for forensics than for defense,
joining them with a large-scale dIDS (such as DShield.org) will make them part
of a proactive weapon in the administrator's arsenal.
Acknowledgements
DShield.org is currently supported by the SANS Institute. We would like to
thank the numerous contributors and current as well as past cooperators. In
particular, we'd like to thank Alan Paller, Stephen Northcutt, John Green,
and Matt Fearnow for continued support of our activities.
Johannes Ullrich started DShield.org in November 2000. He joined the SANS Institute as CTO for the SANS Institute's Internet Storm Center in July 2001. Before that, he was employed as Lead Support Engineer by Banta Integrated Media.
Wayne Larmon is a computer consultant with more than 20 years of programming experience. He joined DShield.org shortly after its inception as a volunteer and is now a consultant responsible for client development.