On the 486, there is an anomaly in the instruction latencies. Note the following:
    SHR/SAR/SHL reg,1     (opcodes D0, D1) have latency 3
    SHR/SAR/SHL reg,imm8  (opcodes C0, C1) have latency 2

"imm8" can hold shift counts of 0 through 255, but the CPU uses only the low 5 bits of the count. So we can code a shift count of 33 to get an effective shift count of 1, while still using the faster C0/C1 encoding. To make this a little more readable to the casual code reader, who might not realize right away that we are really shifting by 1, we might do something like this:

    FASTSHIFT EQU 32

    SHR reg, 1+FASTSHIFT
    SHL reg, 1+FASTSHIFT
    SAR reg, 1+FASTSHIFT

Side note:
SHL reg,1 should be replaced by the faster ADD reg,reg in all cases. For right shifts, however, this gem is indeed useful.
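
The trick rests on the shift count being reduced to its low 5 bits before the shift happens. As a rough illustration only, here is a small Python model of that arithmetic for a 32-bit register. It is not assembly and says nothing about timing; the helper names (effective_count, shr32, sar32) are made up for this sketch, and FASTSHIFT mirrors the EQU above.

```python
MASK32 = 0xFFFFFFFF
FASTSHIFT = 32  # mirrors "FASTSHIFT EQU 32" from the text


def effective_count(imm8):
    """Count the CPU actually uses: only the low 5 bits of imm8."""
    return imm8 & 31


def shr32(value, imm8):
    """Model of SHR on a 32-bit register (logical right shift)."""
    return (value & MASK32) >> effective_count(imm8)


def sar32(value, imm8):
    """Model of SAR on a 32-bit register (arithmetic right shift)."""
    value &= MASK32
    if value & 0x80000000:
        value -= 1 << 32  # reinterpret as a negative two's-complement number
    return (value >> effective_count(imm8)) & MASK32
```

In this model, shr32(x, 1+FASTSHIFT) and shr32(x, 1) give the same result for any x, which is why the longer encoding can stand in for the shift by 1.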