

Detailed view

Let's go from general to particular.

VirtualChain programs have to change their own code.
So compiled languages cannot be used to write them, because compilers generate and optimize 'unpredictable' raw code.
Also, the programs would have to be able to recompile themselves.
Sadly for me, a newbie, I found that assembler is the best option.
I suspect it can also be done with scripting languages, but I found less information about them.

Bad news: some tricky pieces of code may not work after being virtualchained.
Avoid them.
New viruses have to be written with these limitations in mind.

Old viruses have to be read line by line, and maybe not all of them can be translated.
One can try to automate most of the conversion by writing a virtualchainer engine.
But those few tricky pieces will always f$%& things up, and keep the conversion a human task.

Obtaining the delta offset is usually one of these tricky pieces.
Maybe the easiest way to deal with it is to do it in the very first link, and keep it fixed.
Another easy alternative could be to store the new position of the "call" link at permutation time.
If this doesn't make sense to you, forget it until you read the source code.

In principle, links can have different sizes.
Variable size links can save space and make programs hard to disassemble.
Fixed size links make programs simpler, faster, and easier to write & debug.

This decision is significant.
Fixed size links allow one to permute the whole chain using only simple transpositions (link swaps).
With variable size links, one has to do it another way. For example, start with nothing, and chain one new link at a time.
Nothing new: fixed size allows random access, while variable size only allows sequential access.

Virtualchain wastes space, and fixed size links waste more space.
Not much can be done about it, but it's not a big problem.

Yes, some next-link-pointers are useless.
(ex. link := {ret, jmp next_link})
No problem about that.

Remember there are two 'jmp' versions: near and far.
Now think of our chain changing again and again.
One has to force the compiler to always produce far jumps.
Also avoid "loop".

Let's study what has to be recalculated every time a link moves.

Remember there are two types of pointers:
- absolute: mov fixed_place,value / ...
- relative: jx displacement, jmp displacement, call displacement
Each one requires its own type of recalculation.

Code can refer to itself mainly for these reasons:
- inline data: mov [place],value / lea reg,[ebp+place+10] / ...
- label to jump, or proc to call: jx label / jmp label / call proc
- next-link-jumps: jmp next_link

So, every time one link moves, in principle, one has to recalculate:
- pointers out: always the nlj, sometimes other pointers in the content part
- pointers in: always the nlj of the preceding link, plus a search of the whole chain for anything pointing to the content part

It doesn't always have to be done one by one through a searching process; better algorithms exist.
See the example virus; it uses a good mixing function that is not random, so one knows exactly where links go.
Another good alternative could be the use of a table of moves.

"Search for pointers" can be done with some kind of disassembler | parser at run time.
This can be difficult, especially if there is inline data or trash inside the chain.
It's also very inefficient, since one really knows exactly where the pointers are at write | compile time.
A better alternative could be to store the places to be changed in a special field inside the link, and call it the reloc field.
(ex1: jz $+6+012345678h = 0F 84 78 56 34 12, has reloc value 2)



Inline data is also problematic when its size exceeds the content size of one link.
One cannot pass a large string (ex: file path) directly to an API call, for example.
The use of a 'heap' (common one, not chained) is highly encouraged for 'global' variables.
Local variables can be put in a temporary hole made in the stack.

So, the virus will have two parts:
- the code, chained
- the heap, initially with random values

And, thinking of a simple PE infector, the infection process will be:
- copy the code part, permute the chain links, and rearrange pointers
- leave space for the heap

The heap is not chained, so initialized data can remain constant between generations, allowing virus recognition by some marks | patterns in it.
That's why the heap cannot be passed between generations.
Another good alternative could be a heap strongly encrypted with a different key in each generation.

One can put all the 'variables' in the heap, and leave inline only the information one generation has to pass to the next (ex: host entry point).
Initialized data can be turned into code with the trick "mov [ebp+variable_in_heap],data".


Collateral aspects

VirtualChain can provide a solid base to make viruses highly undetectable.
But it can't go very far on its own.

Avoid infection marks.
Use target file size, target file creation time, etc. as alternatives to avoid reinfection.
Also, a few reinfections are not a big problem.

Virtualchaining is only based on permutation; so virus copies have the same checksum.
Include some nonexecutable trash, inside the chain, at infection time, to avoid this.

Including nonexecutable trash in each link, especially at the end, before the next physical link content, can make the code hard to disassemble.

Don't put anything in the heap until needed. Erase it immediately after use.
Memory dumpers could capture that.

Unfortunately, virtualchained viruses are too large to hide inside a file without increasing its size.
File size monitors can easily discover them, unless sophisticated techniques are used.
Avoid infecting key programs, (ex. win.com, antivir.exe).

Also, the end of the last section of a file is the easiest place to put a virus, and the easiest place to discover it.

Study hard, and use techniques against emulation, like structured exception handling, and trash executable code generation.

Be aware that virtualchained viruses could be recognized in the future by their high "jmp" ratio.