This document describes the design and implementation of a network worm simulation system, named "NWS". It also describes using the NWS system to model aspects of real world worms, and the results of those simulations.
Currently, it seems unfashionable to strictly define "worm" or "virus" or any other perfectly good term. I think that commercial interests drive this - it's simpler and more profitable to threaten non-technical people with a "worm" than it is to address the real problems of a software and hardware monoculture. For purposes of defining the scope of NWS, I wanted to define "network worm".
For the purposes of this document a "worm" consists of a process or set of processes that replicates without human intervention by creating an executing copy of itself on another computer via some form of network communication.
Other similar definitions exist.
The Jargon File entry for worm says[1]: A program that propagates itself over a network, reproducing itself as it goes.
Alexander Bartolich[2] of the ELF virus writing HOW-TO has this definition: A worm is a program that penetrates other running programs. Penetration means to copy the executable code of the worm into the active process image of the host.
The definition seems to describe other forms of "mobile code" like Java applets. One has to realize that a worm spreads itself - a user has to ask to download a Java applet, so applets don't count as worms. Email viruses like Klez or SirCam also require human intervention. Before the virus activates, the user has to at least open the email carrying the virus, and possibly even open the attached document that contains virus code. The typical mass-mailer virus doesn't qualify as a worm by the definition above, despite the "worm" names given to mass-mailing viruses by anti-virus companies and mass media. Cliff Changchun Zou's work on email virus modeling[3] shows the differences between mass-mailer viruses and network worms.
The clause that says "creating an executing copy of itself ... via some form of network communication" allows us to distinguish between a fork bomb[4] and a worm. A fork bomb consists of many replicating processes on a single machine, without recourse to network communication.
Various reasons exist for simulating the spread of network worms:
My own personal motivation consists of puzzling over the "installed base" theory. I usually encounter this theory in discussion forums like Usenet or Slashdot[8]. The theory runs like this:
Microsoft Product X has more worms than open source Product Y because Product X runs on more machines than Product Y.
I originally wrote the NWS simulator to try to test the "installed base" theory. In the course of writing NWS, I've decided that the "installed base" theory doesn't have any meaning, for at least two reasons.
The installed base theory amounts to special pleading for the use of Microsoft products.
Staniford, Paxson and Weaver[10], give sparse details on simulating a "Warhol worm".
Moore, Shannon, Voelker and Savage[11] seem to have modeled the Code Red epidemic with some effects of containment measures. They do not explain their simulator at all.
Zou, Gong and Towsley [12] modeled Code Red propagation, in an attempt to better match observed behavior. Apparently, they did their modeling using numerical solutions to differential equations.
Michael Liljenstam has written SSF.App.Worm[13], a module for the Scalable Simulation Framework[14].
Kephart and White [15] wrote a simulation of email virus spread. The simulator models networks with limited connectivity. They didn't fall prey to referring to such viruses as "worms".
Zou, Towsley and Gong [3] modeled email virus propagation using numerical simulations. They also make a distinction between "network worm" and "email virus".
NWS provides a Perl programmer with a framework in which to write a complete network worm simulation. The programmer writes a Perl program using NWS objects and their methods. Running the program executes the simulation. NWS does require a degree of Perl knowledge, especially to write the worm code. It does not allow a user to eschew all intelligence by limiting him or her to actions or categories of actions I've already thought of.
You can download the NWS simulator and try the simulations described in this document or use NWS to write your own. I developed NWS in a Linux command line environment, so the simulations print to STDOUT. Windows users may have a hard time getting these simulations to work, but they should work unchanged on all modern Unix, Linux or BSD systems that have Perl installed.
The NWS network worm simulator is licensed under the GNU General Public License[16] (GPL).
I wanted to write a system that allowed simulation of many different real-life worms. When I began, I did not understand what this entailed. At a minimum, I believed that I needed flexible configuration and setup, to allow trying different population sizes (including number and "species" of worms) and address space sizes. I also chose to have NWS actually execute "worm code" in each entity a worm infects. Actually executing code allows a worm to perform arbitrary actions just like in real life. I wanted to make any given simulation as believable as possible.
Between setup and configuration and executing code for a worm, any simulator needed to include some kind of interpreted language. I chose Perl, mainly because I had an easy time writing early versions of an SIS epidemic in Perl. I wrote the NWS framework itself in the same language that NWS users write worms.
I also chose to write NWS in a simplistic object oriented fashion. Objects in an NWS program directly correspond to real life entities that comprise a real life network: hosts, software, and messages. This should make simulations much more believable, since a user can directly count uninfected hosts or worm hosts or messages passed.
Petri nets, or finite state machines or numerical simulations could simulate any given worm, but not allow the curious investigator to make simulations of some other type of worm, or more than one strain of worm in a simulation.
Numerical simulations don't easily allow discontinuities like dormant stages in a worm's life cycle, or introductions of counter-worms after worm infections reach epidemic proportions.
One can write a simulator for any given worm without executing "worm code", but that simulator wouldn't have any kind of general purpose capability.
Real life network worms execute arbitrary code once they've replicated themselves on a new host. The only way to model this generality allows simulated worms the same ability: simulated worms must execute some form of code. The NWS framework allows and encourages this.
Simulators that operate without executing worm code will end up corroborating specific cases written for and executed in a general purpose worm simulator.
NWS simulations model fully-connected networks, using "network" in a more graph theoretic sense rather than a TCP/IP sense. Every network-connected entity can send a message to any other entity in the simulation. Each network-connected entity has an address consisting of a single number. Writing a simulation requires an NWS user to set the address space size, the largest number that gets used as an address.
In contrast, mass-mailing Outlook viruses don't operate in a fully-connected network. Many mass-mailing Outlook viruses use Outlook's address book or SMTP client, while a few look through the infected machine's file system to find email addresses. Modeling a mass-mailer virus would involve deciding how many email addresses to allot to each instance of the virus, and deciding on a probability that represents how often a virus recipient clicks on the virus attachment. Not every instance of a mass-mailer virus would send itself to all the email addresses in the simulation, so the network of email addresses would not qualify as fully-connected.
Assuming fully-connected networks seems overly optimistic relative to the real Internet, given work such as Arbor Networks paper on dark address space [17] and the prevalence of corporate intranets connected to the Internet via gateways and firewalls that do network address translation.
Naively, a network links entities called hosts. A host can contain instances of software that the host executes periodically. The network allows hosts to communicate with each other via messages or streams of messages.
The NWS framework demands that its users create objects
that correspond directly
to entities in a real network. NWS users have to create an instance
of class NWS::Network
,
and populate the instance of
NWS::Network
with instances
of class NWS::Host
.
NWS::Host
instances contain
NWS::Software
instances.
Due to CPU architecture or
operating system executing on the host, or even due to particular
versions of software
executing under the operating system, real life worms can only replicate
themselves on certain hosts.
Users of NWS can create instances of class NWS::Host
that
specify CPU architecture or operating system name. Instances of NWS::Host
can contain instances of class NWS::Software
that specifically name the
"software" that the Host executes.
When NWS::Host
instances receive
NWS::Message
instances,
the NWS::Host
delivers NWS::Message
instances to
NWS::Software
instances. NWS::Host
instances have numerical
addresses in a network, analogous to Internet Protocol addresses.
Inside an NWS::Host
, names of NWS::Software
instances
constitute the analog to TCP or UDP port numbers.
An NWS::Message
gets delivered to a host by numerical address,
but the Message
gets
ignored if the Message
instance does not contain an
identifier that matches the identifier of the Host
instance receiving it.
After a match on Host
identifier, the Host
will ignore the
Message
instance
if the Message
does not name a Software
instance the
Host
contains.
This style probably has some detrimental effect on execution speed. I hope that this style promotes more believable simulations.
Users of NWS write simulation programs in Perl, incorporating and using NWS as a set of library modules. Users also write worm code for their simulations in Perl. The same programming language gets used for setup, configuration and for subjects under study in the simulation.
One possible alternative would have the simulator framework (network, network communications, execution scheduling, etc) written in a traditional compiled language like C. The worm code would exist as fragments of a scripting language, probably an easily embeddable scripting language like Tcl[18], Lua[19] or S-lang[20]. Other general simulators, like the ns[21] network simulator, have implemented exactly the "interpreted setup and configuration, compiled back-end" design. I decided against this approach.
Experience developing various worm simulations convinces me that NWS users will modify the simulator framework to suit their purposes. I often modified the simulator framework to serve special purposes like counting message receptions, or testing different message delivery algorithms. Writing NWS with a traditional compiled language "back end", and interpreted user code for setup, configuration and worm code burden one set of users, and handicap another. Some users will want to obtain different information from a simulation than the NWS framework provides by default. These users would have to change the "back end" code to collect the information they want while they simultaneously write the simulation in question. These users would have to write simulations in two vastly different programming languages at the same time - the "back end" and the "front end" languages. This two-language design would unnecessarily limit other users who only understand the interpreted user code.
Many people have experience with Perl: Perl implementations exist for many or all modern systems. Lots of web sites and books offer documentation on Perl and how to effectively write Perl programs. This means that the potential audience of users of NWS ends up vastly greater than an alternative simulator design that requires its users to have familiarity with at least two languages.
Using Perl for both worms and the simulator library code can make debugging difficult. NWS users may find it difficult to distinguish errors in worm code from errors in library code. Perl programs can use substantial amounts of memory, and they can execute slower than one would like. Users of NWS will never have the ability to model the entire Internet, as did Staniford, Paxson and Weaver[10].
Any simulation discards elements of the real system it models. This causes simulations to behave differently than real life systems. People using the NWS simulator should realize that their simulations differ from reality in at least these specific ways:
NWS::Software
instance.I think that this combination of differences from reality prevents one from making any kind of correspondence between time steps in a simulation, and real-world time. It does not prevent using the NWS framework to compare qualitative differences. For example, an NWS user could check relative efficiencies of two different ways to probe the simulated network's address space.
Before trying anything difficult or interesting I attempted to validate NWS code and behavior.
I wrote simple test harnesses to execute as much NWS code as possible, and verify the code's behavior.
The NWS framework code passes all these tests. Users who modify the framework for their own purposes should consider running these test cases (and maybe others) to ensure they haven't broken the framework.
The Susceptible-Infected-Susceptible model[22] constitutes one of the simplest mathematical models of an epidemic. Each entity in the population of the model resides in one of two states:
Entities makes transitions from one state to another according to some probability. Entities participating in biological epidemics can make transitions from susceptible to infected, and possibly from infected to susceptible, depending on the illness. Suffering through the common cold only confers a temporary resistance. Realistic models have to account for this.
If no transitions from Infected to Susceptible states take place, this model approximately matches any number of real life worms: the 1988 DECnet "Father Christmas" worm[23], 1998 ADMw0rm[24], the first arrival of Code Red v2[25], 2002 Slapper worm[26] (also here[27]).
To validate the NWS framework, I need some kind of standard against which to measure it. Fortunately, SIS epidemics have an analytical solution. I'll derive most of the solution below in terms of a network worm and a population of exploitable hosts.
I use the following symbols:
N | Number of entities in the population | |
S | Address space size | |
P | Probability of randomly picking an address that maps to an entity | |
V | Number of infected entities in population at some time | |
V0 | Number of infected entities in population at the start of the simulation |
Assume that the worms pick addresses at random. The probability of picking an address that maps to a host in the population under consideration amounts to:
P = (number of hosts)/(address space size) = N/S
.
The fraction of uninfected entities amounts to:
1 - V/N
At any given instant, V
number of worms try to infect
the remaining uninfected entities. Since the worms have a probability
of infecting
the remaining uninfected proportion of entities, and only so many
uninfected entities remain, the change in
the number of infected hosts amounts to:
P(1-V/N)V
A real set of infected hosts operating in the real Internet would send out probes continuously, so I get to magically wave my hands and convert the change in number of infected hosts to a differential equation:
dV/dt = P(1-V/N)V
This differential equation, known as the Logistic Equation, turns out to have an exact solution. The solution involves some onerous bookkeeping called integration by partial fractions. [28] The exact, analytical solution:
V = V0/[(1 - V0/N)e-Pt + V0/N]
This equation has some interesting properties. It has time starting at zero and going forwards. It works in terms of the total population, the initial number of infected hosts, and the probability of getting a "hit" on a member of the population.
If the Logistic Equation's solution matches reality, an investigator could reasonably work out when a worm-writer released a worm. One would only have to make a few assumptions about population of exploitable hosts, address space size, and how many worms got released initially.
Staniford, Paxson and Weaver's[10]
"Hit-list Scanning" ends up as nothing more than a large value of V0
in this form of the solution to the Logistic Equation.
I used an address space of 65535 (16 bits of address), and I populated
it with seven
worm NWS::Host
instance,
and 9993 possible victim NWS::Host
instances,
for a total of 10000 entities in the population.
All entities reside at randomly assigned addresses in the address space.
The simulation runs for 135 time steps or until only 1% of
the 9993 possible victims remain uninfected.
I will clarify the choice of seven initially infected entities below.
The NWS simulation doesn't match the analytical solution particularly well. I believe the difference arises from stepwise treatment of time, and a discrete population. The NWS simulation takes discrete time steps, while the analytical solution has continuous time. Infections happen in whole numbers of entities per time step. The NWS model can only show 1 or 3 or 97 entities infected at a given time step, not 1.237 or 3.532 or 97.8 entities infected at 10.346 hours. NWS has inefficiencies (it takes longer to get to 99% infection) relative to analytical solution for these reasons.
The analytical solution to the Logistic Equation provides an upper boundary to the NWS SIS simulation. I tried a cheap-and-dirty variant of Euler's Method[29] of solving differential equations to provide a lower boundary. I made the variant round down to the next lowest integer value of the number of entities infected over a given elapsed time.
Starting with the Logistic Equation:dV/dt = P(1-V/N)V
I'll make the differential dV/dt
into
a ratio of finite quantities,ΔV/Δt = P(1-V/N)V
If I use a small enough time step (numerical value of Δt
)
the Euler's method approximation should still remain accurate. Multiply both sides of
the equation by Δt
, and use int()
function to
mean "next lowest integer value":
ΔV = int(P(1-V/N)VΔt)
Euler's method has you calculate a value of ΔV
based on the current value
of V
, P
and Δt
.
The algorithm adds ΔV
to the current value of V
to get
the next time step's value of V
, an iterative solution:
Vi+1 = Vi + int(P(1-Vi/N)ViΔt
This necessarily undershoots the analytical solution for two reason.
First, rounding down to the next lowest integer value of Δt
provides the biggest loss relative to the analytical solution.
Second, the iterative solution uses the slope of the function at the
start of the time step to shoot the value of the function at the end of
the time step. The analytical solution shows us that the curve
only goes up, it never goes down, and the curve is concave
for the first half of the simulation's run time.
Using the slope at the start of the time step under estimates
the value of the function at the end of the time step for the
concave part of the curve.
The Euler's method solution graphed below started with
values of V
= 7, N
= 65535,
P
= 10000/65535 = 0.1525
and Δt
= 1 which match the NWS simulation.
I chose the initial value of V
so that the value of ΔV
comes out to 1 during the first time step.
The values of P
and 1-V/N
during the first time step mean that any smaller initial value of V
will end up with
ΔV
= 0.
This Euler's method simulation breaks down for initial values of V
less than 7.
I ran the SIS simulation 20 times. I reduced the 20 simulation outputs to an arithmetic mean of the number of infected hosts at each time step. The graph below compares the mean of 20 SIS simulation runs with the analytical solution, and the results of running the discrete Euler's method numerical approximation derived above.
The mean of 20 runs of the NWS SIS simulation reaches 99% infection slower than the analytical solution would have it, and falls reasonably close to the discrete Euler's Method numerical simulation. I take this to mean that the NWS SIS model correctly simulates a network worm turned loose in a population of computers without any resistance.
For small numbers of infected hosts, the real Internet shares the "discrete population" property with an NWS simulation. This seems to imply that an investigator can't use the analytical solution to the Logistic Equation to accurately find the time of an initial infection of a real Internet worm.
A worm author can take away two, opposing conclusions. Start as many worms as possible (a large hit list) to eliminate statistical effects on when a worm population gets traction, or start a single worm to randomly confuse the issue of when the epidemic started.
Writing a slow, hard to use simulator that could only model worms that easier, faster methods can adequately model doesn't make sense. This section illustrates a few cases that more or less match real world worms, and don't have an analytical solution available.
Epidemiologists also consider another model of infection: Susceptible-Infected-Recovered[30]. Observation of biological epidemics like chicken pox or small pox, diseases that confer lifetime immunity on the survivors, motivate this model. In an SIR epidemic, members of the population reside in one of three states:
Members of the population make transitions from one state to another according to some probability. The SIR model limits transitions to
In computer terms, one might describe the states as vulnerable, exploited and patched.
During the 2001 Code Red worm outbreak, Markus Kern posted code for a passively spreading worm[31]. I don't have much knowledge about IIS in particular, and Microsoft products in general, but I gather that when a CRclean-infected instance of IIS receives a Code Red URI, CRclean sends a copy of itself to the Code Red-infected-host. The newly-sent-CRclean infects the Code Red-infected-IIS with itself, stops Code Red and patches the previously-infected host to prevent re-infection by Code Red. CRclean then starts watching for receipt of Code Red URIs.
The interaction between IIS, Code Red and CRclean matches an SIR epidemic pretty well. We would consider machines running IIS (but not infected with Code Red or CRclean) as "susceptible". After infection with Code Red, we would consider a machine "infected", and after infection with CRclean, we would consider the machine "recovered".
Despite asserting above that this is an interesting case, one that doesn't have an analytical solution, I will derive differential equations to describe the IIS/Code Red/CRclean interaction. The differential equations that I derive don't have an analytical solution, I can only solve them numerically. Since numerical solutions to systems of differential equations are notoriously poorly behaved, deriving a system of differential equations for this case that may or may not even converge on an answer doesn't render it trivial, suitable only for validating the NWS framework. Differential equations, solvable or not, also give insight into the problems they describe that simulations do not.
I'll use the following symbols to derive differential equations that describe an epidemic of Code Red among a population of IIS hosts, and automatic patching of the infected hosts by CRclean.
N | Number of entities in the population | |
P | Probability of picking an address that maps to an entity | |
S | Number of susceptible (uninfected) entities in population at some time | |
I | Number of infected entities in population at some time | |
R | Number of recovered entities in population at some time |
First, the number of susceptible, infected and recovered entities has to sum up to the total number of entities in the population:
N
= S
+ I
+ R
Next, the differential equation for the rate of change in recovered
entities.
Infected entities (Code Red worms in this simulation)
probe the address space randomly, and they have a probability P
of picking an address mapping to an entity. Since I
infected
entities exist in the simulation at any given time, PI
represents the number of probes that hit any entities in the simulation.
The number of CRclean entities
equates to the number of recovered entities in this simulation.
The fraction of recovered entities in the simulation amounts to
R/N
. So the number of probes from Code Red entities
that hit a CRclean entity is PI(R/N)
.
Each Code Red probe to a CRclean entity results in one less Code Red entity,
PI(R/N)
amounts to the per-time step change in recovered entities, and
-PI(R/N)
amounts to the per-time step change in the number of infected entities.
Since these equations model a continuous system, we can wave a
magic wand over it and convert it to a differential equation for
representing the rate of change in number of recovered entities:
dR/dt = PI(R/N)
The change in number of infected entities has a similarity to the
SIS model: PI
constitutes the number of infectious probes
that hit an entity in the address space of the simulation. Since only
a fraction of the number of entities that get hit are susceptible,
the change in number of infected entities due to conversion of
susceptible entities
amounts to PI(S/N)
This SIR model differs from the SIS model in that some infected
entities change to recovered. That change amounts to
-PI(R/N)
.
We add the two changes in number of infected entities, and once again invoke the magic of continuous time and a large population to get a differential equation:
dI/dt = PI(S/N) - PI(R/N)
That leaves us with three equations for the three unknowns
(S
, I
, R
):
N
= S
+ I
+ R
dR/dt = PI(R/N)
dI/dt = PI(S/N) - PI(R/N)
I don't think that an analytical solution for the three equations
above exists. One can write a numerical simulation for them, however.
Assuming differentials constitute ratios of finite quantities and
that using small (but not infinitesimal) values of dt
approximates the analytical solution, you can write an iterative
solution.
Ri+1 = Ri + (PIiRi/N)Δt
Ii+1 = Ii + (PIiSi/N - PIiRi/N)Δt
Si+1 = N - Ii+1 - Ri+1
The values of S
, I
and R
at a given time step get used to calculate the values for the next time step.
To compare the NWS simulator with an Euler' method solution for an SIR epidemic,
I wrote and ran an NWS simulation of CRclean versus Code Red.
I used an address space of 65535 (16 bits), and I populated it with seven
NWS::Host
instances that run NWS worm code, modeling the start
of the Code Red 1 v2 worm.
I put in 9986 possible victim NWS::Host
instances, modeling
IIS hosts, all of them infectable by Code Red v2.
All worms and victims reside at randomly assigned addresses.
To make this simulation different from the SIS model above, I put in seven
NWS::Host
instances that contain code for a simulated CRclean worm.
This differs from reality. To my knowledge, no one deployed CRclean.
I used seven each of Code Red 1 v2 and CRclean to give the simulated Code Red worms a good probability of hitting a simulated IIS host during early time steps. This should reduce the differences between the Euler's Method numerical simulation and the NWS simulation.
The Euler's method numerical simulation starts with N = 65535
,
S = 9986
, I = 7
and R = 7
.
It sets the probability of picking a host at random to P = 10000/65535
.
It runs for 150 time steps.
The NWS simulation of CRclean matches the Euler's Method numerical solution very closely, far closer that the NWS SIS simulation matches an Euler's method numerical simulation. I believe this results from the fact that a single differential equation can describe an SIS epidemic, while an SIR epidemic requires a system of three differential equations. An NWS simulation of an SIS epidemic "leaks" relative to the Euler's Method solution, in that NWS can't infect fractional numbers of entities. Any time the Euler's Method simulation infects something other than a whole number of entities, at the corresponding time step the NWS simulator only infects the whole number of entities. Any fractional number of infected entities leaks out of the system, since the fractional part gets added to no other quantity. The three equations used to model an SIR epidemic don't "leak" fractional parts of entities out of the system: any fraction of an entity not included in one quantity gets included in another.
It looks like CRclean would have worked against Code Red, despite not doing any scanning on its own. More generally, it looks like a "passive" worm, a worm that merely responds to legitimate requests, instead of actively scanning for vulnerable hosts, would work, provided the population of vulnerable hosts act as clients to the passive worm's service.
Two Code Red worms[25] existed and competed for the same set of vulnerable Windows hosts running IIS. For a period in August and September of 2001, Code Red II out-infected Code Red V2. In this section, I use NWS to explore some possibilities of why one worm might out-compete another worm for the same population of infectable entities.
The case of two or more worms that use the same exploit has happened in the real Internet at least twice. Code Red I v2 and Code Red II in 2001, and the many "Slapper" variants[32] in 2003.
I won't address the different probing strategies used by Code Red I v2 and Code Red II - the NWS framework does not currently allow simulating sub-nets, so simulating Code Red II's preference for probing IP addresses close to its own doesn't make any sense. Some other properties can be addressed.
This set of simulations has an address space of size 65535, in which seven each of two, very similar worms get started. The simulation also contains 10,000 infectable hosts. For verisimilitude, I named the worms "Code Red I v2" and "Code Red II", and I named the susceptible hosts "IIS".
The simulation executes two very similar worms. The worms carry nearly identical code, differing in only two aspects. The "true name" that each worm gives to a newly-infected host differs, one set of worms uses "Code Red I v2", the second set uses "Code Red II". The two worms also differ in that "Code Red II" worms can send an extra probe to a random address some percentage of the time steps in which they execute. The "Code Red II" worm code uses a random number to decide whether to send the extra probe, so the percentage works out only on an aggregate basis - some worm instances may never send the extra probe, some instances may send it every time step. This extra probe simulates a slightly faster worm or a worm that communicates infection slightly faster.
I ran the simulation 20 times each with an extra-probe-percentage of 0, 1, 5 and 10. The following table show the mean number of infected hosts for each of the extra-probe-percentages.
0% | 1% | 5% | 10% | |
---|---|---|---|---|
Code Red I v2 | 5061.9 | 4408.95 | 2726.35 | 1273.5 |
Code Red II | 4951.1 | 5604.05 | 7286.65 | 8739.5 |
Even a small increase in number of probes per time step gives a population of worms a big advantage against a slightly slower population.
In this variant, "Code Red II" worm code marks its host as not vulnerable just after exploiting that host. This action prevents re-infection from other "Code Red II" hosts, and conversion to "Code Red 1 v2".
Worms with the true name "Code Red 1 v2" don't perform the extra
work of marking the newly-infected NWS::Software
instance not vulnerable. Other "Code Red 1 v2" worms can
re-infect it, and "Code Red II" can infect it.
Both worms probe at the same rate, once per time step.
The simulation has an address space of 65535. I had the simulation
place 10000 victim hosts randomly in the address space.
I had the victim hosts contain NWS::Software
instances
using the identifier and true name of "IIS".
At the top
end of the address space (address 65525) I had the simulation place
an NWS::Host
instance containing worm code that upon
infection, "fixes" the vulnerability it used to replicate itself.
I had this worm use the true name of "Code Red II".
At the bottom end of the address space (address 10) I had the
simulation place NWS::Host
instance containing worm code
identical to the first worm code, except that it doesn't fix the
vulnerability. Instances of this worm leave the vulnerability
unpatched.
I had this worm use the true name of "Code Red I v2".
I ran the simulation 20 times, each time executing 100 time steps, producing the data in the table below.
Worm true name | mean hosts infected per run | median hosts infected per run |
---|---|---|
Code Red I v2 | 64.1 | 35 |
Code Red II | 9888.8 | 9918 |
Fixing the vulnerability that initially gave access to the susceptible host confers an enormous advantage on a population of worms competing for a set of susceptible hosts that exhibit a given vulnerability.
For the case of two worms competing for the same population of susceptible hosts, worm authors should strive to write worms as "virulent" as possible.
Worms should patch the vulnerability they used to replicate themselves. It appears that all other things staying the same, having a worm fix the vulnerability it exploited to replicate gives more advantage than a 10% performance boost.
This simulation models the hypothetical Permutation scanning worm described in How to 0wn the Internet in your spare time[10], in a section titled "Permutation Scanning". Staniford, Paxson and Weaver wrote about a simulation of a permutation scanning worm, but I believe they missed some key points, developed below.
Permutation scanning worms don't select addresses to probe randomly: they select addresses algorithmically. Each worm has an index, and an algorithm that maps the index value to an address. By incrementing the index, they calculate different, non-sequential addresses to probe. If the worms use a good quality algorithm to calculate address, the addresses generated from successive indexes are quite different: the series of addresses probed looks very close to random. All worms use the same algorithm to generate addresses, but individual worms pick random initial index values at infection time.
A worm that starts late will likely pick an index in a range that an earlier worm has already probed. The late-created-worm will end up probing a series of addresses that some other worm has already probed. To avoid re-probing, worms that try to re-infect previously infected hosts go inactive in the Staniford/Paxson/Weaver scenario. The worm code in the permutation scanning simulation accomplishes this by replying to any re-infection attempts. Individual re-infecting worm instances go inactive when they receive a reply.
One can spoof worms that use this type of scanning. An immune spoofer host could respond that the worm has already infected it, causing permutation scanning worms that probe the spoofer to go inactive. This would protect all hosts further in the permutation than the spoofing host. Real life worms should require two or more consecutive "already infected" replies to go inactive to prevent spoofing. The simulated worm represents the best case of a permutation scanner, with respect to optimism about reinfection.
The permutation scanning worm of this simulation uses an algorithmic permutation to probe address space. A DES-inspired algorithm combines a 32-bit key with a 16-bit index to determine the address to probe. Worms start with a random index, then generate an address to probe by using the permutation function on the index. Worms increase the index sequentially. Hopefully, the series of addresses calculated by permuting sequential indexes look random. All worms use the same permutation by using the same 32-bit key.
The simulation runs in an address space of size 65535 (16 bits).
It starts with 7 permutation scanning worm and 9993 victims.
The victim NWS::Host
instances reside at randomly assigned addresses, as does the
single initial worm instance.
The simulation runs for 135 time steps or until only 1% of
the 9993 possible victims remain uninfected.
This allows for comparison to the SIS model random probing
worm described above. I ran the permutation scanning worm simulation
20 times, and calculated mean counts of infected
hosts at every fifth time step, shown in the graph below.
Permutation scanning worms and random probing worms take exactly the same number of time steps to reach 100% infection.
Permutation scanning produces fewer messages to get to 100% infection than a population of random probing worms does, and a smaller mean number of probes per host. Only active permutation scanning worms send messages, and the proportion of active permutation scanning worms rises, then drops to nearly zero by 100% infection. To contrast, every member a population of random probing worms produces messages all through the course of a simulation.
One area of difference exists: the distribution of probes over the population of entities in the simulation. When you count the probes per host, the population of permutation scanning worms produces a Zipfian distribution - almost 45% of the hosts in the simulation receive 2 probes, abut 26% receive 3 probes, 13% receive 4 probes and so on. The number of hosts receiving a certain number of probes amounts to almost exactly half the number of hosts that receive one more probe. The typical host receives far fewer scans for the optimistic permutation scanning worm, than for a random scanning worm.
The random probing worms produce a Poisson distribution of probes per host, exactly what you'd expect for a random process.
The property of not probing most hosts very often might make discerning a permutation scanning worm more difficult,
I conclude that permutation scanning doesn't give a worm an overall advantage relative to random probing. A permutation scanning worm doesn't have any speed to 100%-infected advantage. It has a lot of disadvantages. I found it difficult to write an algorithmic permutation of a 16-bit address space that gives a random look to the permuted indexes. Depending on the vulnerability and the communication protocol involved, firewalls or NAT can prevent infected hosts from responding to repeat infection attempts, negating any advantage conferred by lowering traffic - worms that should go inactive instead keep probing. Permutation scanning has trouble with NAT-ed subnets, too. Since a given set of worms all use the same permutation, a worm instance that obtains access to a NAT-ed subnet host will almost certainly skip over most or all of the other hosts on the NAT-ed subnet, at least until an address in the subnet turns up as a permuted index. I conclude that the Internet will probably never see a permutation scanning worm that becomes epidemic.
The 1i0n[33] Linux worm of 2001 prompted someone to write an anti-worm: Cheese[34]. Beyond obvious differences like CRclean attacking IIS and Cheese exploiting a back door TCP server left behind by the 1i0n worm, significant differences exist between them. One significant difference between Cheese and CRclean consists of the method of finding hosts infected with the original worm. Cheese actively sought out back door TCP servers left running by 1i0n worms. CRclean waits for a Code Red URI to arrive, and attempts to infect the sender of the Code Red URI. Does active searching for hosts infected by some other worm make a difference?
The same differential equations as describe the CRclean model
apply here. The derivation of the differential equations ends up
with different semantics for the change in recovered entities equation.
The Cheese worm actively probed for 1i0n back door servers. The probability
of a Cheese worm probing an address
that maps to a host remains P
, but the
number of Cheese worms is I
, and the fraction of the population
infected with 1i0n is I/N
. For Cheese, the
differential equation for change in number of recovered entities is:
dR/dt = PR(I/N)
The PRI/N
term equates algebraically to the PIR/N
in the CRclean system of differential equations.
The same numerical simulation can describe both CRclean/Code Red/IIS
interaction and the Cheese/1i0n/BIND interaction.
The differential equations give the insight that CRclean and Cheese
should observe fundamentally similar limits on spreading.
I used an address space of 65535 (16 bits), and I populated it with seven
worm NWS::Host
instances containing code representing the
initial number of 1i0n worms.
I also populated the address space with seven NWS::Host
instances containing code representing an initial number of Cheese worms.
I put in 9986 possible victim NWS::Host
instances modeling hosts
running BIND, the software exploited by 1i0n worms.
All NWS::Host
instances reside at randomly assigned addresses
in the simulation's address space.
For validation purposes, the simulation runs for 150 time steps and all Cheese worms and 1i0n worms get their start at time step 1.
This Cheese worm simulation matches the Euler's method numerical solution about as well as the CRclean worm simulation did.
The set of Cheese worms send out many more probes than the set of CRclean worms. CRclean worms only respond to Code Red probes, while Cheese worms actively probe for 1i0n back door software. The set of CRclean worms produces fewer and fewer response messages as the number of Code Red worms decreases, while the set of Cheese worms produce more and more probe messages as their population increases.
If you must write an anti-worm, write it in the "passive" style of CRclean, rather than the "active" style of Cheese. Passive worms produce less network traffic of their own.
The msblast[35] worm appeared in August of 2003. It took advantage of a buffer overflow in an RPC system that apparently existed in nearly every Windows NT-based operating system.
The msblast worm had an unusual method[36] for picking addresses to probe for susceptible hosts: each worm picked a random IP address upon infecting a host, then sent probes sequentially up from the random address.
Given that network administrators have allocated addresses inside the Internet Protocol address space in bands[37], perhaps the author of msblast thought that sequential probing would allow faster infection rates. I wrote an NWS simulation to test msblast's probing strategy against victim hosts placed in bands in the address space, and against victim hosts placed randomly throughout the address space.
The real question amounts to "what difference does an address probing strategy make if the addresses of susceptible hosts are arranged in groups rather than scattered randomly?" To test this, I wrote 4 different simulations:
In each of the four simulations,
I used an address space of 65535 (16 bits of address), and I populated
it with seven
worm NWS::Host
instance, and 9993 possible victim
NWS::Host
instances,
for a total of 10000 entities in the population.
In all four simulations, the initial seven worm entities get
randomly selected addresses.
In two of the simulations, the victims also get randomly selected addresses. In the other two simulations, the victims get placed sequentially in nine "bands", all bands one-ninth of the address space apart. Each band contains one-ninth of the victim hosts.
For one simulation each of random and banded placement of victim hosts, I modeled a random address probing worm, like Code Red v2, or Slapper. Each time it gets executed, a random address probing worm picks an address in the simulation's address space at random. It tries to infect that address.
In the second set of two simulations (banded placement of victims and random placement of victims), I modeled the msblast worm. The msblast worm code picks an address at random when it infects a previously susceptible host. Each time the worm code executes after that, it probes the address, then increments the address. Written an alternate way, it probes sequentially up the address space. I had the code wrap around to address 1 if and when it gets to the maximum address in the simulation.
The four simulations run for 135 time steps or until only 1% of the 9993 possible victims remain uninfected.
I ran the simulations 20 times each. I reduced the 20 simulation outputs to an arithmetic mean of the number of infected hosts at each time step. The graph below compares the mean of 20 simulation runs for each of the four simulations
A worm that probes an address space randomly reaches 100% infection just as fast for both victim host layouts, random and banded. The msblast-style probing worm performed differently for the two victim host layouts.
For a set of victims with randomly-allocated addresses, the random probing worm dramatically outperforms msblast-style sequential probing.
For a set of victims in nine bands, the random probing worm performs identically to the random probing worm with a set of randomly positioned victims. The msblast-style sequential probing performed almost as well.
I hypothesize that the author of the real msblast had access to a network with sequentially-allocated addresses for testing. The sequential probing msblast worm worked fairly well on the network used for testing, maybe even better than random probing, so the msblast author chose an address selection algorithm that didn't work very well in the real Internet.
I made the choice of nine bands fairly arbitrarily. Some possibility exists that more or fewer bands, or even a more densely-populated address space make the msblast address selection algorithm more efficient than random probing.
I assume that users of NWS will make modifications to the library modules that implement the NWS framework. It seems useless to install NWS as an official Perl module when the users will almost certainly have customized copies of the NWS modules in the directories of specific simulations.
To install NWS, you create a directory into which you unpack the
distribution with a command
similar to gunzip -c nws.tar.gz | tar xf -
.
All the example simulations from this document unpack as well.
Every example simulation gains the use of the NWS framework
by including four Perl "use" directives:
use NWS::Network;
use NWS::Message;
use NWS::Host;
use NWS::Software;
If you create your own simulation, include the same four Perl "use" statements.
This boils down to having a directory named NWS
located in the directory in which simulation code resides.
The NWS
subdirectory must contain the four Perl modules
(Network.pm
, Message.pm
, Host.pm
, Software.pm
)
that implement the NWS framework.
An example of a very simple worm will help the reader understand the NWS simulator. The following 28 lines of Perl illustrate how to write a simulator for a simple Susceptible-Infected-Susceptible model.
1 #!/usr/bin/perl
2 use NWS::Network;
3 use NWS::Message;
4 use NWS::Host;
5 use NWS::Software;
6
7 my $worm_code = q{
8 my ($host, $software, $phase) = @_;
9 $host->SendMsg(
10 Message->new(
11 $host->RandAddress,
12 'OS-CPU',
13 'Victim',
14 $software->{code})
15 ) if $phase == $Host::Run;
16 $software->{true_name} = 'Worm' if $phase == $Host::Init2;
17 };
18
19 my $network = Network->new(65535, 0);
20 $network->AddHost(Host->new('OS-CPU', 0, Software->new('Worm', $worm_code, 0)));
21 $network->AddHost(Host->new('OS-CPU', 0, Software->new('Victim', '', 1)))
22 for (1..10000);
23
24 for (my $i = 1; $i < 135; $i += 5) {
25 $network->Run($i, 5)->PrintCounts;
26 }
27
28 exit 0;
$Host::Run
code,
and setting the NWS::Software's "true name" to "Worm"
if called with the $Host::Init2
code.
NWS::Host
and
NWS::Software
that provide a context for the code to
execute in, and a numeric indication of what "phase" the worm
got called during.
NWS::Message
object created on lines 10-14. The
Host::RandAddress
method
returns a random address in the address space of the
NWS::Network
containing the Host
instance.
Message
object's
identifier
instance variable to the value "OS-CPU".
Host
instances will ignore Message
objects
sent to them that do not possess an identifier
that
matches their own identifier
.
Message
object's
software
instance variable to "Victim".
When a Host
instance accepts a Message
based on matching identifier
instance variable
values, the Host
tries to find an instance of
Software
based on the value of the Message's
software
instance variable.
NWS::Software
instance to "Aworm".
Every instance of class NWS::Software
has an identifier
and a true name. This allows the worm to infect
Software
instances based on
identifier, and it allows a simulation to report on infections by true name.
It also illustrates a difficult real world problem: how does one name
what Code Red does to an IIS process? A Code-Red-infected-IIS still
contains (if not executes) IIS code.
NWS::Network
object.
The NWS::Network
object
connects all the instances of class NWS::Host
that take part in the simulation, and does
the work of passing messages between NWS::Host
instances. This simulation gives this Network object a 16-bit address space,
and lets the NWS::Network
code choose a seed value for the random number generator.
NWS::Network
object
with NWS::Host
objects. The 9999 "victim" Host
instances each contain
an instance of class Software
with a name that the worm code knows.
The "victim" software
gets marked vulnerable.
NWS::Software
instances contained in
the simulation's collection of NWS::Host
instances every 5 time steps.
The simulation calls NWS::Network::PrintCounts
method to do the output.
When invoked from a Linux command line prompt, the simple worm simulator produces output that looks like this:
4:16PM 642 % ./simple.pl
#step, msg count, Victim, Worm
5 9 9999 2
#step, msg count, Victim, Worm
10 10 9999 2
#step, msg count, Victim, Worm
15 15 9998 3
#step, msg count, Victim, Worm
20 23 9994 7
#step, msg count, Victim, Worm
25 52 9989 12
The NWS::Network::PrintCounts
method produces two lines of output each time
that a program calls it.
The first line begins with the '#' (pound sign) character, and lists the
last time step number, the number of messages sent since the last call
to NWS::Network::PrintCounts
,
followed by true names of the NWS::Software
instances in the simulation.
The second line contains a tab-separated list of numbers, counts
of the same true names of NWS::Software
instances that appear on
the first line.
This output format served me well, allowing interactive and batch
graphing using Gnuplot.
It may not serve others as well. NWS::Network::PrintCounts
does
not use any information other than that contained in NWS::Network
instance variables, so any NWS user can re-write PrintCounts
to
suit him or her self.
Instances of NWS::Network
model all the wiring and routers and gateways
that compose a single real network.
A program using the NWS simulation framework should only bother to create a single instance
of class NWS::Network
. The instance of Network contains and connects instances
of class Host. The Network instance calls NWS::Host::RunProcess
method on each
contained instance of NWS::Host
, which can cause a NWS::Host
to send a NWS::Message
instance to
some other NWS::Host
contained by the NWS::Network
instance. NWS::Network
instances
deliver NWS::Message
instances around the simulated network.
new NWS::Network $address_space $seed
$address_space
argument should amount
to the largest
address in the simulation. If you don't pass in a seed for the random
number generator, The NWS::Network
constructor will choose
one for you. Passing in a particular seed allows you to re-run a simulation
to see if you get identical results. my $network = new NWS::Network 10000;
The above Perl fragment creates an NWS::Network
instance
that allows a maximum address of 10000, and seeds the simulation's random
number generator with a value of its own making.
GetSeed
: returns the number used as the seed of the random number generator.
Example use: print "Simulation's PRNG seed: ", $network->GetSeed, "\n";
AddHost
: add an NWS::Host
entity to the network.
Gives the Host
an address if it doesn't already have one
(NWS::Host::address
evaluates to 0 or undefined). Will not put
a Host
at address zero. Will not give two Host
instances
the same address if it picks the addresses. By way of an example:
my $host = new Host 'Solaris-SPARC', 0;
$host->AddSoftware(new Software, 'snmpXdmid', '', 1);
my $address = $network->AddHost($host);
print "Host added at address ", $address, "\n";
The example creates an NWS::Host
instance with
vulnerable software, and an address of zero. Inserting
$host
into the NWS::Network
will cause
$network
to select a random address in its address
space for $host
.
AddHost
returns the numerical address at which it
placed the Host
. This allows the NWS user to place victims
randomly in a simulation's address space, and still build up a "hit list"
for the initial worm instance.
Run
: execute the simulation. Call this on an NWS::Network
instance after you've added hosts and performed any other setup. Run
takes two scalar arguments: the time step it should start counting from,
and a number of time steps to execute before returning.
The internal state of any give Network
instance stays constant
between calls to Run
unless a user's simulation code changes it.
Numbering time steps is just a convenience to the user.
Returns a reference
to the NWS::Network
instance on which you call it.
for (my $i = 1; $i < 100; $i += 5 ) {
$network->Run($i, 5)->PrintCounts;
}
The above code fragment executes 100 time steps on the contents of
$network
, 5 steps at a time. After each 5 time steps,
the example executes PrintCounts
on $network
.
If you only care about end results of a simulation, you can have
all time steps executed without interruption:
$network->Run(1, 100);
PrintCounts
:
Each call to this method prints two newline-terminated lines
of text to STDOUT.
The first line begins with an octothorpe ('#
')
character, and contains tab-separated column names.
The second line contains the numeric values of the
columns named in the first line. Each second line contains
time step number and a count of all Message
objects
sent since the last call to PrintCounts
.
Each second line also contains a count of the number
of Software
instances present in the
Network
when PrintCounts
executes.
The Software
instance counts go by the value
of the true_name
instance variable.
Letting the AddHost
method prefer to place entities at
random addresses in a simulation's address space may constitute a flaw.
This preference may lead NWS users to assume that the real Internet,
or even a corporate intranet, has a randomly-addressed population.
Simulations that don't match reality closely may result.
hosts
: a hash of NWS::Host
instances, keyed by
numeric address.
foreach my $h (values %{$network->{hosts}}) {
print "Host at", $h->{address}, " received ",
$h->{msg_cnt}->{rcvd}, " total probes\n";
}
The above example enumerates all NWS::Host
instances
contained in $network
.
total_msg_count
: a scalar count of how many NWS::Message
objects have been sent so far.
total_hit_count
: a scalar count of how many NWS::Message
objects have been delivered to NWS::Host
instances.
Use Host::msg_cnt
instance variable
to obtain a per-host
count of messages that got delivered and learn how the host accepted them.
The last two instance variables mainly have use in determining if
a given simulation correctly sends and receives enough messages.
For example, over a large enough number of time steps, randomly
addressed messages should hit enough hosts to make
the ratio of total_hit_count
to total_msg_count
equal the ratio of
number of hosts to address space size.
Instances of class NWS::Host
represent computers attached to some
form of a network.
Each NWS::Host
object has a numerical address associated with it
by the NWS::Network
object in a simulation,
and can contain zero or more NWS::Software
instances.
Host
requires two arguments:
a string identifying the "type" of the host, like "Linux-x86" or
"Solaris-SPARC", and a number, the desired address for the newly
constructed Host
when you place it in a Network
via NWS::Network::AddHost
.
my $susceptible = new Host "OS-CPU", 0;
$network->AddHost($susceptible);
Since the above code fragment constructs Host
instances
with an address of zero, the Network
object places the
Host
instances randomly in its address space.
AddSoftware
: put NWS::Software
into a Host
instance. Used in setup of simulations,
but also useful in worm code, to install "back door" software as
so many real life worms do.
my $wormhost = new Host "Linux-x86", 0; # network assigns address
my $sw = new Software "ADMworm", $worm_code, 0;
$wormhost->AddSoftware($sw); # host infected at simulation start
NWS::Host
only allows one Software
instance
of a particular identifier. Since the identifier
value
gets used like TCP or UDP port numbers, this limitation shouldn't
cause too much difficulty.
RemoveSoftware
: remove a specific instance of NWS::Software
from a Host
instance.
# Remove the offending exploited software
$host->RemoveSoftware('bind');
Note that the string passed to RemoveSoftware
removes
based on value of identifier
instance variable of the
Software
, not on the true_name
.
RandAddress
: returns a scalar, a number somewhere in
the simulation's address space. Useful to random probing worms:
$host->SendMsg(
new Message
$host->RandAddress,
"Windows",
"IIS",
$software->{function}
);
The NWS::Host::RandAddress
method actually calls a method
NWS::Network::RandAddress
, because Network
objects actually know the size of the address space. You may wish
to override NWS::Host::RandAddress
with a different
programmable random number generator
(PRNG) if you disbelieve in Perl's built in PRNG, or you want to
simulate a worm that uses a buggy or poorly-designed PRNG, like
Code Red I v1.
TimeStep
: returns a scalar, the numerical value of the
current time step. Might have use to when simulating something like Code Red
going inactive the 20th of every month.
SendMsg
: initiate delivery of an NWS::Message
object to an address in the simulation. Message
instances
sent to an empty address get silently discarded. No notion of successful
or unsuccessful delivery gets returned to the sender.
$host->SendMsg(
Message->new(
67, 'OS-CPU', 'Exploited',
$software->{function}
)
);
The above code fragment sends a Message
to address 67.
Should NWS find a Host
at address 67, NWS follows the
process described below
in RecvMsg
to either ignore the Message
,
pass the Message
to a Perl function (worm code),
or exploit a Software
instance in the host at address 67.
RecvMsg
: method called by Network
instance
that contains the Host
. Causes the Host
to
decide whether to accept the message (message identifier
matches the host identifier
), deliver the message to some specific
Software
instance (value of message's software
instance variable matches the name of some Software
instance
in the host), exploit the Software
(matching Software
instance has a true value in its exploitable
instance variable)
or merely deliver the message to the matching Software
.
RunProcesses
: called by the NWS::Network
instance
that contains the Host
. Causing all executable functions
contained by Software
instances in the Host
to
get run once. Called from NWS::Network::Run
once each
time step.
identifier
: a string that allows simulations to distinguish
between varieties of Host
instances. The value of identifier
determines which messages (NWS::Message
instances) the
Host
will accept. NWS::Message
instances contain
an identifier that must match the value of Host
instance
variable identifier
before any further processing gets done.
The simulations in this document used a single identifier, but simulations
of heterogeneous networks would use several.
address
: a scalar, the numerical value used to locate
Host
instances in a network. Set by the NWS::Host
constructor, but a zero value from construction lets
the NWS::Network
instance
that a Host
gets assigned to pick the address at random.
internet
: a reference to the NWS::Network
instance
containing the Host
. Allows worm code to get access to
any member functions or instance variables of NWS::Network
.
msg_cnt
hash containing four keys:
rcvd
: the number of NWS::Message
instances
delivered to this host.
matched
: the number of NWS::Message
instances
that matched this host's identifier
value.
Less than or equal to the value of rcvd
accepted
: the number of NWS::Message
instances
delivered to this host that got delivered to an instance of
NWS::Software
contained in the host.
Less than or equal to the value of matched
.
exploited
: the number of NWS::Message
instances
delivered to this host that exploited a vulnerability in the
NWS::Software
that accepted the messages.
Less than or equal to the value of accepted
.
msg_seq_no
:
A count of how many Message
objects this Host
has sent out. msg_seq_no
gets incremented and used as
the value of a Message
object's
sequence
instance variable.
Using msg_cnt
correctly can allow you to keep track
of the efficiency of a given address probing scheme.
Instances of class NWS::Host
can contain instances of class
NWS::Software
.
NWS::Software
instances have a name, so that an NWS::Host
can contain
more than one instance of NWS::Software
. This situation models real
computers, which often run hundreds of different processes at any given instant.
Instances of NWS::Software
can contain actual code -
fragments of Perl code
that get actually compiled and executed when the simulation's NWS::Network
tells
the NWS::Host
objects to run their processes.
NWS::Software
only has a constructor method.
Creating a new instance of Software
requires three arguments: a string naming the Software
,
a string containing a fragment of Perl code, and a number 0 or 1.
Example NWS::Software
constructor use:
my $annunciator_code = qw{
my ($host, $software, $phase, $msg) = @_;
print "Host at ", $host->{address}, " called during phase ",
$phase, "\n";
};
my $sw = Software->new(
"Call phase Annunciator",
$annunciator_code,
0
);
The string naming the Software
gets used to decide whether or
not to accept a given NWS::Message
object sent to a Host
.
The fragment of Perl code actually gets compiled (via an eval BLOCK;
construct) and runs during simulation execution.
The number indicates whether the Software
is vulnerable
(1) or not (0).
The constructor method compiles any Perl fragment passed
as an argument,
and it sets identifier
and true_name
instance
variables to the value of the first argument of the constructor.
After that, the constructor calls the compiled Perl fragment
with the $Host::Init
phase argument.
Worm code should change the true_name
value
at this point.
identifier
: a scalar, a string used by Host
instances to keep track of the Software
instances they contain.
Also used to discriminate between Message
instances received
by a Host
: a Message
has an instance variable
that has to lexically equate to a Software
instance's
identifier
. In this respect, the Software
instance's
identifier
acts something like a TCP or UDP port number. It
serves to direct message traffic to the intended process on the addressed host.
All the examples in this document use a phrase like "Victim" or "IIS",
but simulations could use digit-strings
for identifiers, too, increasing the verisimilitude of the simulation.
function
: a reference to a closure created at construction
from the Perl fragment passed as a string argument to the constructor.
This is the "worm code" that this document frequently mentions. Called
as a function at specific times during
a simulation's execution.
exploitable
: a scalar, used as a boolean to decide whether
or not a particular Software
instance should be exploited
by a just-received Message
, or whether the Message
should be passed to the Software
instance's function.
That is, Software
instances have to be marked not exploitable
before they receive messages with a phase of $Host::Recv
.
true_name
: a string. This string doesn't get used any
where except in the NWS::Network::PrintCounts
method, and it
has no real life analog.
The lack of real life analogy points out a real life difficulty:
exactly how does one tell what a running process is?
Infection of SQLServer by the "slammer" worm didn't change the process
ID of the infected process. true_name
exists so that
an NWS user can write worms that by convention set true_name
to indicate infection. Without a distinguishing characteristic like
true_name
, NWS users would have to look at counts of
messages flowing through the system to decide what a worm population
was doing, just as real life network administrators do.
Instances of class NWS::Message
model frames or packets, and constitute the
hunks of information that NWS::Host
objects can pass between each other.
The NWS simulations act as if the Internet mainly communicated using UDP/IP instead of TCP/IP.
NWS::Message
only has a constructor. Arguments to the constructor
set contents of all the instance variables that a user is responsible for:
my $probe = new Message
$host->RandAddress, # Destination address
"Windows-x86", # Host identifier
"IIS", # Software identifier
$software->{function}; # Worm code
destination
: a number, the address to which some
entity sent the Message
.
Equates numerically to the recipient Host's
address.
Set by the user via the constructor method.
source
: a number, the address of the entity that sent the Message
. Set in NWS::Host::SendMsg
Useful to send Message's
back to an attacking worm.
identifier
: a string.
Has to lexically equate to a Host's
identifier before the
Host
will accept the message.
Allows simulating heterogenous networks.
Set by the user via the constructor method.
software
: a string. The name of the Software
for which the Message
is intended.
Analogous to TCP or UDP port number. All examples in this document use names
like "IIS" or "BIND", but "80" or "53" or "135" would work just as well.
Set by the user via the constructor method.
code
: function pointer. Used to transport "worm code" from
Host
to Host
.
Set by the user via the constructor method.sequence
: a number, increasing from 1 for each Message
a given Host
sends. Set in NWS::Host::SendMsg
An entity sending an NWS::Message
only has to fill in
destination
, identifier
and software
for the Message
to pass through the delivery algorithm.
The code
instance variable is the means by which a
worm passes executable content to another entity in the simulation.
Usually, a programmer uses the NWS::Software::function
contents to fill in code
, but this isn't enforced.
Users of the NWS simulator see worm code as text, more specifically
as fragments of the Perl programming language.
At runtime, an NWS simulation creates instances of class NWS::Software
.
If a programmer passes a Perl code fragment to the constructor
of NWS::Software
, the code fragment actually gets compiled via
an eval
statement. The instance of NWS::Software
so
constructed carries around a reference to the compiled code fragment,
analogous to a function pointer in other programming languages.
The following 26-line fragment of Perl constitutes a fairly sophisticated worm. When executed in an NWS simulation, it sends infectious messages to random addresses. If it receives an infectious message from another worm instance, it replies back. If it receives a reply to an infectious message, it marks itself inactive, and sends no further messages.
1 my $worm_code = q{
2 my ($h, $sw, $c, $m) = @_;
3 if ($c == $Host::Run and $sw->{active}) {
4 $h->SendMsg(
5 new Message
6 $h->RandAddress,
7 "OS-CPU",
8 "Victim",
9 $sw->{function}
10 );
11 } elsif ($c == $Host::Recv) {
12 if (!$m->{already_infected}) {
13 $m->{destination} = $m->{source};
14 $m->{already_infected} = $h->{address};
15 $m->{code} = 0;
16 $h->SendMsg($m);
17 } else {
18 $sw->{true_name} = 'Worm - inactive';
19 $sw->{active} = 0;
20 }
21 } elsif ($c == $Host::Init2 or $c == $Host::Init) {
22 $sw->{true_name} = 'Worm - active';
23 $sw->{exploitable} = 0;
24 $sw->{active} = 1;
25 }
26 };
$sw->{function}
.
Message
:
already_infected
to the NWS::Message
instance before sending it back.
Message
referenced by $m
has an instance
variable named already_infected
that has a value.
Line 19 sets the active
instance variable to 0,
which means that the test on line 3 fails any subsequent times
the worm code gets called with a $c
value of
$Host::Run
. After executing line 19, the worm no
longer probes for susceptible hosts.
Software
instance construction,
and upon exploiting a vulnerable host.
This piece of code sets the true_name
instance variable,
marks the Software
as not vulnerable, and takes advantage
of the Perl feature of dynamically adding instance variables to
create and set the active
instance variable.
If there is a trick to writing worm code, it's to realize that you have to write code that's not only thread-safe, but also re-entrant.
The code contained by instances of class NWS::Software
gets executed in
four different situations.
NWS::Host
adds the software, in method NWS::Host::AddSoftware
NWS::Network
instance calls NWS::Host::RunProcess
on each NWS::Host
instance it contains at every time step.NWS::Message
object makes its way from one NWS::Host
to another. If the receiving NWS::Host
finds
the message appropriate to its "operating system" and "CPU" it
passes the incoming message to the appropriately named NWS::Software
instance. The NWS::Software's
code gets executed, and the NWS::Message
instance gets passed
to the code.NWS::Software instance
.
NWS::Software
instances execute the code once on construction
to allow the
code to initialize run-time constants rather than recalculating the constants
at every execution. Similarly, an NWS::Host
causes the execution of any code an
exploiting NWS::Message
might contain just after the exploitation. This allows
worm code to set up and initialize. Running code in NWS::Software
instances
that receive a NWS::Message
allows NWS::Hosts
to cooperate, or it allows
writing worms like CRclean, that style "strike backs"
when they receive a particular NWS::Message
.
Code for a worm doesn't have to execute anything in any of the four situations. In the simple worm example section, line 15 constitutes the test for execution of the worm code during normal network operation. Line 16 tests for execution just after exploitation. The simple worm code in the example gets executed in the two other situations (on message reception and at installation), but does nothing.
When an NWS simulation executes the NWS::Network::Run
method,
any compiled code fragments get called. The arguments to the compiled
code fragment look like this:
NWS::Host
instance currently causing the code fragment to executeNWS::Software
instance that contains the code fragment$Host::Init
: indicates that the code runs just after
NWS::Software
compiled it with an eval
.$Host::Init2
: indicates that the code runs just
after it has infected a new host.
$Host::Run
: indicates that the code runs as its
normal, once-per-time-step turn.$Host::Recv
: indicates that the code has received
a Message
object from some other entity in the simulation.
Host::Recv
and Host::Init2
, the relevant NWS::Message
object
A Message
object doesn't always appear in the
worm code's argument list. When a Message
appears, the
phase of execution determines what the Message
contains.
If the phase argument equates to $Host::Init2
, the
Message
is the infectious communication that caused
a new replicated worm instance to start. If the phase argument
equates to $Host::Recv
, the Message
argument came from a worm instance executing in some other
Host
in the network.
I believe that recounting what I did to arrive at the current
arrangement of worm code and its calling conventions would help
the reader to understand how to write better worm code.
Initially, I had every NWS::Software
instance carrying
around worm code as a string. Every time a NWS::Software
instance
got called upon to execute its code, an eval
of the string
took place.
I did this because I didn't have a good idea of what information
(including what objects) worm code would need access to in order
to act realistically. This technique worked, but it did cost a
lot of execution time to eval
a string in every worm during
every time step.
My second effort involved having NWS::Software
instances
carry around worm code as a string, and additionally carry the
worm code around as a pointer to a closure (a dynamically
compiled anonymous function). Well-known names for information
important to a worm's execution ($host
, $software
)
existed when the code-as-string got compiled to a closure.
Worms would send the code-as-string to potential targets.
Upon exploitation, targets compiled the code-as-string to their own
private closures. This provided a fair performance increase, since
eval
didn't happen for every worm for every time step,
and it also narrowed the information available to any given worm
to well-known variable names present when strings got evaled to
closures.
To allow for simulation of different worm behavior, I had
added arguments to calls to the worm code: a number indicating
the "phase" of execution, and the message just received.
I noticed that only a few well-know variables got used in
even the most elaborate simulations: references to the NWS::Host
and NWS::Software
that contained the worm code.
I also realized that the simulations spent large amounts of time compiling the
worm code to a closure ever time a worm exploited a vulnerability.
This caused me to change from compiling a closure for every infected
NWS::Software
instance to compiling a single closure.
Instead of passing worm-code-as-text, the NWS::Message
instances now contain a pointer to the appropriate closure.
Instead of having well-known variables available to the closure
at compile time, I changed the worm code calling convention to include
exactly those well-known variables as arguments. This brought down
execution time and overall memory usage of any given simulation.
Simulate Network Address Translation gateways and firewalls by
including code in class NWS::Network
that allows one
instance of NWS::Network
to contain a second instance of NWS::Network
.
The second instance should
pass messages appropriately by translating addresses,
and contain a second set of
NWS::Host
instances.
One could use such an enhanced NWS::Network
class to
simulate Code Red II's
spread behind firewalls and gateways.
The current NWS simulation framework includes nothing with a performance problem. Real life networks have devices like firewalls or routers or specific software that suffers during a network worm epidemic. Rate limited devices or software end up getting bogged down by worm-related traffic, and either take a long time to deal with legitimate traffic, or drop legitimate traffic entirely.
Adding rate-limiting bottlenecks to a network simulation would allow the use of NWS as more than a worm design tool. Users could develop a worm and a network simulation without bottlenecks, than add or turn on bottleneck objects to see where the worm causes network failures.
Rate limiting could take the form of a limit on how many
NWS::Message
instances can pass in a time step,
whether the remaining
messages get queued or discard, or it could take the form of
shutting down if the number of Messages
to pass gets too large.
A combination of rate limited devices and gatewayed networks could help to confirm Cliff C Zou's two-parameter Code Red model [12].
Eric S. Raymond
http://www.catb.org/~esr/jargon/html/W/worm.html
Alexander Bartolich, 2003-02-15
http://www.linuxsecurity.com/resource_files/documentation/virus-writing-HOWTO/_html/intro.html#what.exactly.is.a.virus
Cliff C. Zou, Down Towsley, Weibo Gong
University of Massachusetts, Amherst, Technical Report TR-CSE-03-04
http://tennis.ecs.umass.edu/~czou/research/emailvirus-techreport.pdf
Eric S. Raymond
http://www.jargon.net/jargonfile/f/forkbomb.html
Mark Ward, Tuesday, 22 May, 2001, 12:04 GMT 13:04 UK
http://news.bbc.co.uk/1/hi/sci/tech/1344344.stm
Greg Moorer
23rd National Information Systems Security Conference Proceeding
October 16-19, 2000
http://csrc.ncsl.nist.gov/nissc/2000/proceedings/papers/601.pdf
Kevin Poulsen
The Register, http://www.theregister.co.uk/
22/05/2001 at 19:01 GMT
http://www.theregister.co.uk/content/archive/19132.html
As of August 2003, the Apache web server ran on 63.98% of sites surveyed. Microsoft's IIS ran on 23.75%
http://news.netcraft.com/archives/web_server_survey.html
Stuart Staniford, Vern Paxson, Nicholas Weaver
Proceedings of the 11th USENIX Security Symposium (Security '02)
http://www.usenix.org/publications/library/proceedings/sec02/
August 5-9, 2002
http://www.icir.org/vern/papers/cdc-usenix-sec02/
David Moore, Colleen Shannon, Geoffrey M. Voelker, Sefan Savage
Proceedings of the 2003 IEEE Infocom Conference
April, 2003
http://www.cs.ucsd.edu/users/savage/papers/Infocom03.pdf
Cliff Changchun Zou, Weibo Gong, Don Towsley
9th ACM Conference on Computer and Communication Security (CCS'02)
November 18-22, 2002
http://tennis.ecs.umass.edu/~czou/research/codered.pdf
Michael Liljenstam
http://www.cs.dartmouth.edu/~mili/research/ssf/worm/
Jeffrey O. Kephart and Steve R. White
Proceedings of the 1991 IEEE Computer Society Symposium on Research in Security and Privacy
May 20-22, 1991
http://www.research.ibm.com/antivirus/SciPapers/Kephart/VIRIEEE/virieee.gopher.html
June 1991
http://www.gnu.org/licenses/gpl.txt
Craig Labovitz, Abha Ahuja, Michael Bailey
November 13, 2001
http://arbornetworks.com/downloads/dark_address_space.pdf
Luiz Henrique de Figueiredo, Roberto Ierusalimschy, and Waldemar Celes
Dr. Dobb's Journal 21 #12 (Dec 1996) 26-33.
http://www.lua.org/ddj.html
Ben Bolker
14 Nov 2000
http://www.zoo.ufl.edu/bolker/eep-2000/si1.html
Cliff Stoll
RISKS-FORUM Digest Wednesday 4 January 1989 Volume 8 : Issue 2
27 Dec 88
ftp://ftp.sri.com/risks/8/risks-8.02
Max Vision
circa May 1998
http://www.whitehats.com/library/worms/adm/index.html
Johannes Ullrich
2002-09-16
http://isc.incidents.org/analysis.html?id=167
F-Secure Corporation
http://www.europe.f-secure.com/slapper/
University of British Columbia Department of Mathematics
http://www.ugrad.math.ubc.ca/coursedoc/math101/notes/moreApps/logistic.html
Joel Feldman
http://www.math.ubc.ca/~feldman/math/odesolvers.pdf
David Smith and Lang Moore, Duke University
http://www.math.duke.edu/education/ccp/materials/postcalc/sir/contents.html
Markus Kern
Sept 1 2001
VULN-DEV email list
http://www.securityfocus.com/archive/82/211462
David Goldsmith
2002-10-02
http://isc.incidents.org/analysis.html?id=177
Max Vision
circa April, 2001
http://www.whitehats.com/library/worms/lion/index.html
Kevin Houle
Thursday, May 17, 2001
http://www.cert.org/incident_notes/IN-2001-05.html
Chad Dougherty, Jeffrey Havrilla, Shawn Hernan, and Marty Lindner
August 14, 2003
http://www.cert.org/advisories/CA-2003-20.html
Rolf Rolles
August, 2003
VULN-DEV email list
http://lists.insecure.org/lists/vuln-dev/2003/Aug/att-0029/RPC_DCOM_recode_and_analysis.TXT
Sean McCreary and kc claffy
8/27/98
http://www.caida.org/outreach/resources/learn/ipv4space/
I did not use large numbers of tools, or an expensive hardware setup to develop NWS.
The computer running SuSE 7.3 linux had a 1 GHz AMD Duron processor on a Micro ATX MS-6340M motherboard, with 256 Mb of memory. This computer ran 20, 135-time step NWS simulations in about 15 minutes wall clock time.