Virus Writing Guide Part 4.

───────────────────────────────────────────
DISCLAIMER: This file is 100% guaranteed to
exist. The author makes no claims to the
existence or nonexistence of the reader.
───────────────────────────────────────────
This space intentionally left blank.
───────────────────────────────────────────
GREETS: Welcome home, Hellraiser! Hello to
the rest of the PHALCON/SKISM crew: Count
Zero, Demogorgon, Garbageheap, as well as
everyone else I failed to mention.
───────────────────────────────────────────

Dark Angel's Clumpy Virus Writing Guide
──── ─────── ────── ───── ─────── ─────
"It's the cheesiest" - Kraft

──────────────────────────────────────
INSTALLMENT IV: RESIDENT VIRII, PART I
──────────────────────────────────────

Now that the topic of nonresident virii has been addressed, this series
turns to memory resident virii. This installment covers the theory behind
this type of virus, although no code will be presented. With this
knowledge in hand, you can boldly write memory resident virii confident
that you are not fucking up too badly.

──────────
INTERRUPTS
──────────
DOS kindly provides us with a powerful method of enhancing itself, namely
memory resident programs. Memory resident programs allow for the extension
and alteration of the normal functioning of DOS. To understand how memory
resident programs work, it is necessary to delve into the intricacies of
the interrupt table. The interrupt table is located from memory location
0000:0000 to 0000:0400h (or 0040:0000), just below the BIOS information
area. It consists of 256 double words, each representing a segment:offset
pair. When an interrupt call is issued via an INT instruction, two things
occur, in this order:

1) The flags are pushed onto the stack.
2) A far call is issued to the segment:offset located in the interrupt
table.

To return from an interrupt, an iret instruction is used. The iret
instruction reverses the order of the int call. It performs a retf
followed by a popf. This call/return procedure has an interesting
side effect when considering interrupt handlers which return values in the
flags register. Such handlers must directly manipulate the flags register
saved in the stack rather than simply directly manipulating the register.
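
To make the flags side effect concrete, here is a minimal Python model of
the stack during an INT/IRET round trip. The names (Cpu, int_call, iret,
CARRY) are invented for this sketch, and only the flags and CS:IP are
modelled. Treat it as an illustration, not an emulator.

```python
CARRY = 0x0001  # bit 0 of the 8086 flags register

class Cpu:
    """Toy model: only flags, CS:IP and a stack (hypothetical names)."""
    def __init__(self):
        self.flags = 0x0000
        self.cs, self.ip = 0x1000, 0x0100
        self.stack = []

    def int_call(self, handler):
        # INT pushes flags, then CS, then IP (a pushf plus a far call).
        self.stack.extend([self.flags, self.cs, self.ip])
        handler(self)
        self.iret()

    def iret(self):
        # IRET pops IP, CS, then flags: a retf followed by a popf.
        self.ip = self.stack.pop()
        self.cs = self.stack.pop()
        self.flags = self.stack.pop()

def sets_live_flag(cpu):
    cpu.flags |= CARRY       # lost: IRET overwrites the live register

def sets_saved_flag(cpu):
    cpu.stack[-3] |= CARRY   # survives: edits the copy IRET will pop

def carry_after(handler):
    cpu = Cpu()
    cpu.int_call(handler)
    return bool(cpu.flags & CARRY)

print(carry_after(sets_live_flag))   # False
print(carry_after(sets_saved_flag))  # True
```

Real handlers often sidestep the problem by returning with "retf 2", which
discards the saved flags and keeps the live ones.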

The processor searches the interrupt table for the location to call. For
example, when an interrupt 21h is called, the processor searches the
interrupt table to find the address of the interrupt 21h handler. The
segment of this pointer is 0000h and the offset is 21h*4, or 84h. In other
words, the interrupt table is simply a consecutive chain of 256 pointers to
interrupts, ranging from interrupt 0 to interrupt 255. To find a specific
interrupt handler, load in a double word segment:offset pair from segment
0, offset (interrupt number)*4. The interrupt table is stored in standard
Intel reverse double word format, i.e. the offset is stored first, followed
by the segment.
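
The lookup arithmetic can be sketched in Python against a simulated 1 KB
table. The function names and the handler address stored for interrupt 21h
are made up for the example; nothing here touches real memory.

```python
import struct

def vector_address(int_no):
    """Offset within segment 0 of the table entry for int_no (0..255)."""
    if not 0 <= int_no <= 0xFF:
        raise ValueError("interrupt number must be 0..255")
    return int_no * 4

def read_vector(table, int_no):
    """Return (segment, offset) stored for int_no in a 1024-byte table."""
    # Little-endian layout: the offset word comes first, then the segment.
    offset, segment = struct.unpack_from("<HH", table, vector_address(int_no))
    return segment, offset

# Simulated table holding a made-up handler address for interrupt 21h.
table = bytearray(1024)
struct.pack_into("<HH", table, vector_address(0x21), 0x40EB, 0x0019)

print(hex(vector_address(0x21)))               # 0x84
print("%04X:%04X" % read_vector(table, 0x21))  # 0019:40EB
```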

For a program to "capture" an interrupt, that is, redirect the interrupt,
it must change the data in the interrupt table. This can be accomplished
either by direct manipulation of the table or by a call to the appropriate
DOS function. If the program manipulates the table directly, it should put
this code between a CLI/STI pair, since an interrupt arriving while the
table is half-altered could have dire consequences. Generally,
direct manipulation is the preferable alternative, since some primitive
programs such as FluShot+ trap the interrupt 21h call used to set the
interrupt and will warn the user if any "unauthorised" programs try to
change the handler.

An interrupt handler is a piece of code which is executed when an interrupt
is requested. The interrupt may either be requested by a program or may be
requested by the processor. Interrupt 21h is an example of the former,
while interrupt 8h is an example of the latter. The system BIOS supplies a
portion of the interrupt handlers, with DOS and other programs supplying
the rest. Generally, BIOS interrupts range from 0h to 1Fh, DOS interrupts
range from 20h to 2Fh, and the rest is available for use by programs.

When a program wishes to install its own code, it must consider several
factors. First of all, is it supplanting or overlaying existing code, that
is to say, is there already an interrupt handler present? Secondly, does
the program wish to preserve the functioning of the old interrupt handler?
For example, a program which "hooks" into the BIOS clock tick interrupt
would definitely wish to preserve the old interrupt handler. Ignoring the
presence of the old interrupt handler could lead to disastrous results,
especially if previously-loaded resident programs captured the interrupt.

A technique used in many interrupt handlers is called "chaining." With
chaining, both the new and the old interrupt handlers are executed. There
are two primary methods for chaining: preexecution and postexecution. With
preexecution chaining, the old interrupt handler is called before the new
one. This is accomplished via a pseudo-INT call consisting of a pushf
followed by a call far ptr. The new interrupt handler is passed control
when the old one terminates. Preexecution chaining is used when the new
interrupt handler wishes to use the results of the old interrupt handler in
deciding the appropriate action to take. Postexecution chaining is more
straightforward, simply consisting of a jmp far ptr instruction. This
method doesn't even require an iret instruction to be located in the new
interrupt handler! When the jmp is executed, the new interrupt handler has
completed its actions and control is passed to the old interrupt handler.
This method is used primarily when a program wishes to intercept the
interrupt call before DOS or BIOS gets a chance to process it.
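
The difference in ordering between the two styles can be modelled with
plain Python functions standing in for handlers. All names and the value
placed in ax are hypothetical; the sketch shows only the order of events,
not real interrupt mechanics.

```python
log = []

def old_handler(regs):
    """Stand-in for the previously installed handler."""
    log.append("old")
    regs["ax"] = 0x1234      # pretend result
    return regs

def pre_chain_handler(regs):
    # Preexecution: run the old handler first (pushf + call far ptr),
    # then act on its results before returning to the caller.
    regs = old_handler(regs)
    log.append("new saw ax=%04X" % regs["ax"])
    return regs

def post_chain_handler(regs):
    # Postexecution: do the new work first, then hand over for good
    # (jmp far ptr); the old handler returns straight to the caller.
    log.append("new")
    return old_handler(regs)

pre_chain_handler({"ax": 0})
post_chain_handler({"ax": 0})
print(log)  # ['old', 'new saw ax=1234', 'new', 'old']
```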

────────────────────────────────────────
AN INTRODUCTION TO DOS MEMORY ALLOCATION
────────────────────────────────────────
Memory allocation is perhaps one of the most difficult concepts, certainly
the hardest to implement, in DOS. The problem lies in the lack of official
documentation by both Microsoft and IBM. Unfortunately, knowledge of the
DOS memory manager is crucial in writing memory-resident virii.

When a program asks DOS for more memory, the operating system carves out a
chunk of memory from the pool of unallocated memory. Although this concept
is simple enough to understand, it is necessary to delve deeper in order to
have sufficient knowledge to write effective memory-resident virii. DOS
creates memory control blocks (MCBs) to help itself keep track of these
chunks of memory. MCBs are paragraph-sized areas of memory which are each
devoted to keeping track of one particular area of allocated memory. When
a program requests memory, one paragraph for the MCB is allocated in
addition to the memory requested by the program. The MCB lies just in
front of the memory it controls. Visually, an MCB and its memory look
like:

┌───────┬─────────────────────────────────────┐
│ MCB 1 │ Chunk o' memory controlled by MCB 1 │
└───────┴─────────────────────────────────────┘

When a second section of memory is requested, another MCB is created just
above the memory last allocated. Visually:

┌───────┬─────────┬───────┬─────────┐
│ MCB 1 │ Chunk 1 │ MCB 2 │ Chunk 2 │
└───────┴─────────┴───────┴─────────┘

In other words, the MCBs are "stacked" one on top of the other. It is
wasteful to deallocate MCB 1 before MCB 2, as holes in memory develop. The
structure for the MCB is as follows:

Offset  Size     Meaning
──────  ───────  ───────
0       BYTE     'M' or 'Z'
1       WORD     Process ID (PSP of block's owner)
3       WORD     Size in paragraphs
5       3 BYTES  Reserved (Unused)
8       8 BYTES  DOS 4+ uses this. Yay.

If the byte at offset 0 is 'M', then the MCB is not the end of the chain.
The 'Z' denotes the end of the MCB chain. There can be more than one MCB
chain present in memory at once and this "feature" is used by virii to go
resident in high memory. The word at offset 1 is normally equal to the PSP
of the MCB's owner. If it is 0, it means that the block is free and is
available for use by programs. A value of 0008h in this field denotes DOS
as the owner of the block. The value at offset 3 does NOT include the
paragraph allocated for the MCB. It reflects the value passed to the DOS
allocation functions. All fields located after the block size are pretty
useless so you might as well ignore them.
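
As an illustration of the layout, the following Python sketch builds three
MCBs inside a simulated buffer and walks the chain the way a memory-map
utility would. The helper names, owner values and block sizes are invented
for the example, and how the first MCB is located on a real system is
outside its scope.

```python
import struct

PARA = 16  # bytes per paragraph

def make_mcb(kind, owner, size_paras):
    """Build a 16-byte MCB: type byte, owner word, size word, 11 filler."""
    return struct.pack("<cHH11x", kind, owner, size_paras)

def walk_chain(memory, first_mcb_seg):
    """Yield (mcb_segment, type, owner, size_in_paragraphs) for one chain."""
    seg = first_mcb_seg
    while True:
        kind, owner, size = struct.unpack_from("<cHH", memory, seg * PARA)
        if kind not in (b"M", b"Z"):
            raise ValueError("corrupt MCB at segment %04X" % seg)
        yield seg, kind.decode(), owner, size
        if kind == b"Z":
            break
        seg += 1 + size      # skip the MCB paragraph plus its block

# Simulated memory: three blocks starting at segment 0 of the buffer.
memory = bytearray(64 * PARA)
layout = [(b"M", 0x0008, 4), (b"M", 0x0000, 2), (b"Z", 0x0009, 10)]
seg = 0
for kind, owner, size in layout:
    memory[seg * PARA:(seg + 1) * PARA] = make_mcb(kind, owner, size)
    seg += 1 + size

for seg, kind, owner, size in walk_chain(memory, 0):
    state = "free" if owner == 0 else "owner %04X" % owner
    print("%04X %s %3d paragraphs  %s" % (seg, kind, size, state))
```

Note that each step advances by the size plus one, since the size field
does not count the MCB's own paragraph.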

When a COM file is loaded, all available memory is allocated to it by DOS.
When an EXE file is loaded, the amount of memory specified in the EXE
header is allocated. There is both a minimum and maximum value in the
header. Usually, the linker will set the maximum value to FFFFh
paragraphs. If the program wishes to allocate memory, it must first shrink
the main chunk of memory owned by the program to the minimum required.
Otherwise, the pathetic attempt at memory allocation will fail miserably.

Since programs normally are not supposed to manipulate MCBs directly, the
DOS memory manager calls (48h - 4Ah) all return and accept values of the
first program-usable memory paragraph, that is, the paragraph of memory
immediately after the MCB. It is important to keep this in mind when
writing MCB-manipulating code.
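
The segment arithmetic involved is small enough to sketch in Python. The
function names and the sample segment value are assumptions made for the
example.

```python
PARA = 16  # bytes per paragraph

def mcb_segment(block_segment):
    """The MCB paragraph sits immediately below the block DOS hands out."""
    return block_segment - 1

def linear(segment, offset=0):
    """Real-mode linear address of segment:offset."""
    return segment * PARA + offset

def paragraphs_needed(nbytes):
    """Round a byte count up to whole paragraphs."""
    return (nbytes + PARA - 1) // PARA

block = 0x1A2B                      # made-up segment returned by function 48h
print("%04X" % mcb_segment(block))  # 1A2A
print(hex(linear(block)))           # 0x1a2b0
print(paragraphs_needed(1000))      # 63
```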

─────────────────────────
METHODS OF GOING RESIDENT
─────────────────────────
There are a variety of memory resident strategies. The first is the use of
the traditional DOS interrupt TSR routines, either INT 27h or INT
21h/Function 31h. These routines are undesirable when writing virii,
because they do not return control back to the program after execution.
Additionally, they show up on "memory walkers" such as PMAP and MAPMEM.
Even a doorknob can spot such a blatant viral presence.

The traditional viral alternative to using the standard DOS interrupt is,
of course, writing a new residency routine. Almost every modern virus uses
a routine to "load high," that is, to load itself into the highest possible
memory location. For example, in a 640K system, the virus would load
itself just under the 640K but above the area reserved by DOS for program
use. Although this is technically not the high memory area, it shall be
referred to as such in the remainder of this file in order to add confusion
and general chaos into this otherwise well-behaved file. Loading high can
be easily accomplished through a series of interrupt calls for reallocation
and allocation. The general method is:

1. Find the memory size
2. Shrink the program's memory to the total memory size - virus size
3. Allocate memory for the virus (this will be in the high memory area)
4. Change the program's MCB to the end of the chain (Mark it with 'Z')
5. Copy the virus to high memory
6. Save the old interrupt vectors if the virus wishes to chain vectors
7. Set the interrupt vectors to the appropriate locations in high memory

When calculating memory sizes, remember that all sizes are in paragraphs.
The MCB must also be considered, as it takes up one paragraph of memory.
The advantage of this method is that it does not, as a rule, show up on
memory walkers. However, the total system memory as shown by such programs
as CHKDSK will decrease.

A third alternative is no allocation at all. Some virii copy themselves to
the memory just under 640K, but fail to allocate the memory. This can have
disastrous consequences, as any program loaded by DOS can possibly use this
memory. If it is corrupted, unpredictable results can occur. Although no
memory loss is shown by CHKDSK, the possible chaos resulting from this
method is clearly unacceptable. Some virii use memory known to be free.
For example, the top of the interrupt table or parts of video memory all
may be used with some assurance that the memory will not be corrupted.
Once again, this technique is undesirable as it is extremely unstable.

These techniques are by no means the only methods of residency. I have
seen such bizarre methods as going resident in the DOS internal disk
buffers. Where there's memory, there's a way.

It is often desirable to know if the virus is already resident. The
simplest method of doing this is to write a checking function in the
interrupt handler code. For example, a call to interrupt 21h with the ax
register set to 7823h might return a 4323h value in ax, signifying
residency. When using this check, it is important to ensure that no
possible conflicts with either other programs or DOS itself will occur.
Another method, albeit a costly process in terms of both time and code
length, is to check each segment in memory for the code indicating the
presence of the virus. This method is, of course, undesirable, since it is
far, far simpler to code a simple check via the interrupt handler. By
using any type of check, the virus need not fear going resident twice,
which would simply be a waste of memory.

─────────────
WHY RESIDENT?
─────────────
Memory resident virii have several distinct advantages over runtime virii.
o Size
Memory resident virii are often smaller than their runtime brethren as
they do not need to include code to search for files to infect.
o Effectiveness
They are often more virulent, since even the DIR command can be
"infected." Generally, the standard technique is to infect each file
that is executed while the virus is resident.
o Speed
Runtime virii infect before a file is executed. A poorly written or
large runtime virus will cause a noticeable delay before execution
easily spotted by users. Additionally, it causes inordinate disk
activity which is detrimental to the lifespan of the virus.
o Stealth
The manipulation of interrupts allows for the implementation of
stealth techniques, such as the hiding of changes in file lengths in
directory listings and on-the-fly disinfection. Thus it is harder for
the average user to detect the virus. Additionally, the crafty virus
may even hide from CRC checks, thereby obliterating yet another anti-
virus detection technique.

───────────────────────────────
STRUCTURE OF THE RESIDENT VIRUS
───────────────────────────────
With the preliminary information out of the way, the discussion can now
shift to more virus-related, certainly more interesting topics. The
structure of the memory resident virus is radically different from that of
the runtime virus. It simply consists of a short stub used to determine if
the virus is already resident. If it is not already in memory, the stub
loads it into memory through whichever method. Finally, the stub restores
control to the host program. The rest of the code of the resident virus
consists of interrupt handlers where the bulk of the work is done.

The stub is the only portion of the virus which needs to have delta offset
calculations. The interrupt handler ideally will exist at a location which
will not require such mundane fixups. Once loaded, there should be no
further use of the delta offset, as the location of the variables is
preset. Since the resident virus code should originate at offset 0 of the
memory block, originate the source code at offset 0. Do not include a jmp
to the virus code in the original carrier file. When moving the virus to
memory, simply move starting from [bp+startvirus] and the offsets should
work out as they are in the source file. This simplifies (and shortens)
the coding of the interrupt handlers.

Several things must be considered in writing the interrupt handlers for a
virus. First, the virus must preserve the registers. If the virus uses
preexecution chaining, it must save the registers after the call to the
original handler. If the virus uses postexecution chaining, it must
restore the original registers of the interrupt call before the call to the
original handler. Second, it is more difficult, though not impossible, to
implement encryption with memory resident virii. The problem is that if
the interrupt handler is encrypted, that interrupt handler cannot be called
before the decryption function. This can be a major pain in the ass. The
cheesy way out is to simply not include encryption. I prefer the cheesy
way. The noncheesy readers out there might wish to have the memory
simultaneously hold two copies of the virus, encrypt the unused copy, and
use the encrypted copy as the write buffer. Of course, the virus would
then take twice the amount of memory it would normally require. The use of
encryption is a matter of personal choice and cheesiness. A sidebar to
preservation of interrupt handlers: As noted earlier, the flags register is
restored from the stack. It is important in preexecution chaining to save
the new flags register onto the stack where the old flags register was
stored.

Another important factor to consider when writing interrupt handlers,
especially those of BIOS interrupts, is DOS's lack of reentrancy. This
means that DOS functions cannot be executed while DOS is in the midst of
processing an interrupt request. This is because DOS sets up the same
stack pointer each time it is called, and calling the second DOS interrupt
will cause the processing of one to overwrite the stack of the other,
causing unpredictable, but often terminal, results. This applies
regardless of which DOS interrupts are called, but it is especially true
for interrupt 21h, since it is often tempting to use it from within an
interrupt handler. Unless it is certain that DOS is not processing a
previous request, do NOT use a DOS function in the interrupt handler. It
is possible to use the "lower" interrupt 21h functions without fear of
corrupting the stack, but they are basically the useless ones, performing
functions easily handled by BIOS calls or direct hardware access. This
entire discussion only applies to hooking non-DOS interrupts. With hooking
DOS interrupts comes the assurance that DOS is not executing elsewhere,
since it would then be corrupting its own stack, which would be a most
unfortunate occurrence indeed.

The most common interrupt to hook is, naturally, interrupt 21h. Interrupt
21h is called by just about every DOS program. The usual strategy is for a
virus to find potential files to infect by intercepting certain DOS calls.
The primary functions to hook include the find first, find next, open, and
execute commands. By cleverly using pre and postexecution chaining, a
virus can easily find the file which was found, opened, or executed and
infect it. The trick is simply finding the appropriate method to isolate
the filename. Once that is done, the rest is essentially identical to the
runtime virus.

When calling interrupts hooked by the virus from the virus interrupt code,
make sure that the virus does not trap this particular call, lest an
infinite loop result. For example, if the execute function is trapped and
the virus wishes, for some reason, to execute a particular file using this
function, it should NOT use a simple "int 21h" to do the job. In cases
such as this where the problem is unavoidable, simply simulate the
interrupt call with a pushf/call combination.

The basic structure of the interrupt handler is quite simple. The handler
first screens the registers for either an identification call or for a
trapped function such as execute. If it is not one of the above, the
handler throws control back to the original interrupt handler. If it is an
identification request, the handler simply sets the appropriate registers
and returns to the calling program. Otherwise, the virus must decide if
the request calls for pre or postexecution chaining. Regardless of which
it uses, the virus must find the filename and use that information to
infect. The filename may be found either through the use of registers as
pointers or by searching through certain data structures, such as FCBs.
The infection routine is the same as that of nonresident virii, with the
exception of the guidelines outlined in the previous few paragraphs.

──────────────
WHAT'S TO COME
──────────────
I apologise for the somewhat cryptic sentences used in the guide, but I'm a
programmer, not a writer. My only suggestion is to read everything over
until it makes sense. I decided to pack this issue of the guide with
theory rather than code. In the next installment, I will present all the
code necessary to write a memory-resident virus, along with some techniques
which may be used. However, all the information needed to write a resident
virus has been included in this installment; it is merely a matter of
implementation. Have buckets o' fun!
