Fast Float to Int Assembler/Pentium+FPU

Converting a floating point number to integer is normally done like this:

        fistp   [dword ptr temp]
        mov     eax,[temp]

An alternative method is:

        DATASEG
        ALIGN   8
temp    dq      ?
magic   dd      59c00000h       ; f.p. representation of 2^51 + 2^52

        CODESEG
        fadd    [magic]
        fstp    [qword ptr temp]
        mov     eax,[dword ptr temp]

Adding the 'magic number' of 2^51 + 2^52 has the effect that any integer between -2^31 and +2^31 will be aligned in the lower 32 bits when the sum is stored as a double precision floating point number. The result is the same as you get with FISTP for all rounding modes except truncation towards zero: because the sum is always positive, truncation rounds negative numbers down instead of towards zero. The result also differs from FISTP in case of overflow. You may need a WAIT instruction for compatibility with other processors.
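The alignment described above can be tried out in C without an assembler. The following is only a sketch, not the gem writers' code: the function name magic_float_to_int is made up for illustration, and it assumes IEEE 754 doubles, the default round-to-nearest mode and an input with magnitude below 2^31.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: round x to an integer with the magic-number trick.
   Assumes IEEE 754 doubles, round-to-nearest mode and |x| < 2^31. */
static int32_t magic_float_to_int(double x)
{
    const double magic = 6755399441055744.0;   /* 2^52 + 2^51 */
    volatile double sum = x + magic;           /* volatile forces a 64-bit store, like FSTP qword */
    double stored = sum;
    uint64_t bits;
    uint32_t low;
    int32_t result;

    memcpy(&bits, &stored, sizeof bits);       /* raw bit pattern of the double */
    low = (uint32_t)bits;                      /* keep the lower 32 bits */
    memcpy(&result, &low, sizeof result);      /* reinterpret as two's complement */
    return result;
}
```

With round-to-nearest, magic_float_to_int(2.5) gives 2 and magic_float_to_int(3.5) gives 4, since ties go to the even integer just as with FISTP.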
This method is not faster than using FISTP, but it gives better scheduling opportunities because there is a 3-clock gap between FADD and FSTP which may be filled with other instructions. You may multiply or divide the number by a power of 2 in the same operation by applying the opposite operation to the magic number. You may also add a constant by adding it to the magic number, which then has to be stored in double precision.
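Both variations can be sketched in C. The names low32_after_add, to_fixed_16_16 and to_int_plus_100 are hypothetical, and the sketch assumes IEEE 754 doubles and round-to-nearest. Dividing the magic number by 2^16 multiplies the result by 2^16 (giving 16.16 fixed point), and adding 100 to the magic number adds 100 to the result.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: add a magic constant to x and return the
   lower 32 bits of the sum as stored in a 64-bit double. */
static int32_t low32_after_add(double x, double magic)
{
    volatile double sum = x + magic;   /* force rounding to double precision */
    double stored = sum;
    uint64_t bits;
    uint32_t low;
    int32_t result;

    memcpy(&bits, &stored, sizeof bits);
    low = (uint32_t)bits;
    memcpy(&result, &low, sizeof result);
    return result;
}

/* x * 2^16 rounded to an integer. The magic number is
   (2^52 + 2^51) / 2^16 = 2^36 + 2^35. Assumes |x| < 2^15. */
static int32_t to_fixed_16_16(double x)
{
    return low32_after_add(x, 103079215104.0);
}

/* round(x) + 100. The constant is folded into the magic number,
   which no longer fits in single precision. */
static int32_t to_int_plus_100(double x)
{
    return low32_after_add(x, 6755399441055744.0 + 100.0);
}
```

For instance, to_fixed_16_16(1.5) returns 98304 (18000h) and to_int_plus_100(2.0) returns 102.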

The second thing is the inability of the FP unit to round a floating point value to an integer internally at any reasonable speed (FRNDINT apparently takes 19 cycles). In some situations you could use:


        fistp   [qword mem]     ; 6 (estimated clock cycle count)
        fild    [qword mem]     ; 7-9

However, accuracy will be sacrificed, since a 64-bit integer cannot represent values as large as floating point numbers can. You can also use the "magic" trick here, but you would get even less accuracy.
If you want to use the "magic" trick here, just add a similar "magic" number and then subtract it again. Because of the limited precision, the fractional bits are lost in the addition, so the floating point value comes back rounded. This method has two drawbacks:

you must know the exact precision/accuracy of the FPU; if you don't know it, the method will fail.
depending on the rounding mode, you might get different results than with FRNDINT.

Another FRNDINT replacement:

magic   dd      59c00000h       ; f.p. representation of 2^51 + 2^52

        fadd    [magic]         ; 1-3
        fsub    [magic]         ; 4-6

The throughput of the above code is only 2 clocks. Unfortunately, there are situations where the result will not be the same as with FRNDINT. The lowest bit of the result may be wasted.
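The add-then-subtract idea can be sketched in C as well. The function name magic_round is made up for illustration. The sketch assumes IEEE 754 doubles, round-to-nearest and an input with magnitude below 2^51. The volatile store is there to force the intermediate sum down to 53 bits of precision, which is the "known precision" the method depends on.

```c
/* Hypothetical helper: round x to an integer while staying in floating point.
   Assumes IEEE 754 doubles, round-to-nearest mode and |x| < 2^51. */
static double magic_round(double x)
{
    const double magic = 6755399441055744.0;   /* 2^52 + 2^51 */
    volatile double sum = x + magic;           /* fractional bits are rounded away here */
    return sum - magic;                        /* exact subtraction restores the magnitude */
}
```

Under these assumptions magic_round(2.5) returns 2.0 and magic_round(-1.3) returns -1.0. If the addition were carried out with a 64-bit mantissa and never stored, the same constant would leave fractional bits in place, which illustrates the first drawback listed above.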

Gem writers: Vesa Karvonen
Agner Fog
last updated: 1998-03-16