Matt Taylor

Home * People * Matt Taylor

Matt Taylor was involved in forum discussions on low level optimization and x86 assembly language issues. Beside other things Matt was busy to optimize a minimal perfect hashing scheme for bitscan purposes.

Based on ideas of Walter Faxon and De Bruijn Multiplication, Matt came up with a genius folding trick as a quintessence [1] [2]:

For 64-bits, I do things a little differently simply because a 64-bit multiply is slow. I start out with x ^= (x - 1) just like normal which generates a key equal to 2^n - 1 (where n is the index of the LSB, 1 is index 0). Now fold the 64-bit key into 32-bits by xoring the high 32-bits with the low 32-bits.

| ls1b | bb ^ (bb-1) | folded

63
0xffffffffffffffff0x00000000
62
0x7fffffffffffffff0x80000000
59
0x0fffffffffffffff0xf0000000
32
0x00000001ffffffff0xfffffffe
31
0x00000000ffffffff0xffffffff
30
0x000000007fffffff0x7fffffff
0
0x00000000000000010x00000001

Even if this folded “LS1B” contains multiple consecutive one-bits, the multiplication is De Bruijn like. There are only two magic 32-bit constants with the combined property of 32- and 64-bit De Bruijn Sequences to apply this minimal perfect hashing:


const int lsb_64_table[64] =
{
   63, 30,  3, 32, 59, 14, 11, 33,
   60, 24, 50,  9, 55, 19, 21, 34,
   61, 29,  2, 53, 51, 23, 41, 18,
   56, 28,  1, 43, 46, 27,  0, 35,
   62, 31, 58,  4,  5, 49, 54,  6,
   15, 52, 12, 40,  7, 42, 45, 16,
   25, 57, 48, 13, 10, 39,  8, 44,
   20, 47, 38, 22, 17, 37, 36, 26
};

/**
 * bitScanForward
 * @author Matt Taylor (2003)
 * @param bb bitboard to scan
 * @precondition bb != 0
 * @return index (0..63) of least significant one bit
 */
int bitScanForward(U64 bb) {
   unsigned int folded;
   assert (bb != 0);
   bb ^= bb - 1;
   folded = (int) bb ^ (bb >> 32);
   return lsb_64_table[folded * 0x78291ACF >> 26];
}

Forum Posts

Re: Bit magic by Matt Taylor, comp.lang.asm.x86, June 29, 2003

References

  1. Bit magic by Matt Taylor, comp.lang.asm.x86, June 26, 2003
  2. Re: Bit magic by Matt Taylor, comp.lang.asm.x86, June 29, 2003

Up one level