All x86 CPUs can exchange two values with the xchg
instruction. This instruction is not always the best choice. It is the smallest but not the fastest: it is not pairable and executes in 3 cycles. On the Pentium the following combination is faster:

    push ax
    push bx
    pop ax
    pop bx

Now AX and BX are swapped. This gem uses 2 cycles instead of 3 but is bigger. There is yet another way. It is as fast as xchg on a 486 and may be faster on a Pentium if your code is interleaved properly. It is bigger than the above example:

    xor ax,bx
    xor bx,ax
    xor ax,bx

Now AX and BX are once again swapped.
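
To see why both sequences really exchange the two registers, here is a minimal sketch in C that models them on plain 16-bit values. It is only an illustration of the logic, not of the timing: the `Regs` struct, the two-slot array standing in for the stack, and the function names are all assumptions made for this example and do not touch real registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the two 16-bit registers; not real register access. */
typedef struct {
    uint16_t ax;
    uint16_t bx;
} Regs;

/* Models push ax / push bx / pop ax / pop bx with a tiny two-slot stack. */
static Regs swap_via_stack(Regs r)
{
    uint16_t stack[2];
    int sp = 0;

    stack[sp++] = r.ax;   /* push ax */
    stack[sp++] = r.bx;   /* push bx */
    r.ax = stack[--sp];   /* pop ax: receives the old BX */
    r.bx = stack[--sp];   /* pop bx: receives the old AX */
    return r;
}

/* Models xor ax,bx / xor bx,ax / xor ax,bx. */
static Regs swap_via_xor(Regs r)
{
    r.ax ^= r.bx;   /* ax = a ^ b           */
    r.bx ^= r.ax;   /* bx = b ^ (a ^ b) = a */
    r.ax ^= r.bx;   /* ax = (a ^ b) ^ a = b */
    return r;
}

/* Self-check: both sequences must exchange the two values. */
static void check_swaps(void)
{
    Regs in = { 0x1234, 0xABCD };
    Regs s = swap_via_stack(in);
    Regs x = swap_via_xor(in);

    assert(s.ax == 0xABCD && s.bx == 0x1234);
    assert(x.ax == 0xABCD && x.bx == 0x1234);
}
```

Calling `check_swaps()` from a test or from `main` exercises both models. The comments in `swap_via_xor` trace the algebra: the second XOR cancels the old BX out of the mix and leaves the old AX, and the third does the same for the other register.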