Find the closest integer with same weight

1 February 2016

Introduction

The number of ones in the binary representation of an integer is called the weight of that number. The following algorithm finds the closest integer with the same weight. For example, for 123 (0111 1011)₂ the closest such integer is 125 (0111 1101)₂.

Algorithm

We iterate through the bits to find the first pair of consecutive bits that differ. Swapping these two bits gives us the closest integer with the same weight, either greater or lower than the argument.

To choose between the lower and the greater closest integer we only need to check whether the first bit of that pair is 0 or 1.

The direction of the swap (when the bits differ) gives us:

  • (01)₂ -> (10)₂ - the closest greater integer, because a '1' moves toward the MSB positions,
  • (10)₂ -> (01)₂ - the closest lower integer, because a '1' moves toward the LSB positions.

Computing functions

The following function will do the work:

// find_closest_integer returns the closest integer with the same weight as N.
func find_closest_integer(N uint64) uint64 {

    var pos uint
    for n := N; n > 0; n >>= 1 {
        // First pair of consecutive bits that differ: toggle both to swap them.
        if n&1 != n>>1&1 {
            return N ^ ((1 << pos) | (1 << (pos + 1)))
        }
        pos++
    }

    // Only reached when N == 0 (no other integer has weight 0).
    return N
}
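
For the example from the introduction, the first differing pair of bits of 123 is at positions 1 and 2, so the function returns 123 ^ (2|4) = 125. A quick check (the wrapper below is ours, assuming find_closest_integer lives in the same package):

package main

import "fmt"

func main() {
    // 123 = (0111 1011)₂ -> swap bits 1 and 2 -> 125 = (0111 1101)₂
    fmt.Println(find_closest_integer(123)) // prints 125
}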

Similarly, changing the comparison to the following,

if n&1 < n>>1&1 {
...
}

gives the closest lower integer, while the opposite comparison,

if n&1 > n>>1&1 {
...
}

gives the closest greater integer.
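
Put together, the two variants benchmarked below might look like this (the names findClosestLower and findClosestGreater are ours; the original samples may name them differently):

// findClosestLower returns the closest integer lower than N with the same weight.
func findClosestLower(N uint64) uint64 {
    var pos uint
    for n := N; n > 0; n >>= 1 {
        if n&1 < n>>1&1 {
            return N ^ ((1 << pos) | (1 << (pos + 1)))
        }
        pos++
    }
    return N
}

// findClosestGreater returns the closest integer greater than N with the same weight.
func findClosestGreater(N uint64) uint64 {
    var pos uint
    for n := N; n > 0; n >>= 1 {
        if n&1 > n>>1&1 {
            return N ^ ((1 << pos) | (1 << (pos + 1)))
        }
        pos++
    }
    return N
}

Both differ from find_closest_integer only in the comparison operator.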

Efficiency

The above algorithm is of O(n) complexity, where n is the number of bits we iterate over. Let's make sure of that by benchmarking the computation of the closest greater and the closest lower integer for the arguments 0xFFFE, 0xFFFFFE, and 0xFFFFFFFE. The arguments are chosen on purpose to demonstrate the O(n) behaviour.

go test -bench=.|benchgraph 
? PASS
√ BenchmarkFindGreaterThan_FFFE-4        100000000            15.7 ns/op
√ BenchmarkFindGreaterThan_FFFFFE-4      50000000            27.2 ns/op
√ BenchmarkFindGreaterThan_FFFFFFFE-4    50000000            32.9 ns/op
√ BenchmarkFindLowerThan_FFFE-4          1000000000             2.76 ns/op
√ BenchmarkFindLowerThan_FFFFFE-4        500000000             3.03 ns/op
√ BenchmarkFindLowerThan_FFFFFFFE-4      1000000000             2.79 ns/op
? ok      golang-interview/5-primitive-types/5-4-find-closest-integer-same-weight    12.620s

The resulting graph is available at http://benchgraph.codingberg.com/6.

And now let's visualize it (see the benchgraph link above):

When computing the closest greater integer we have to iterate through many more bits than when computing the closest lower integer, which here needs only a single iteration: for an argument like 0xFFFE (1111 1111 1111 1110)₂ the lower variant finds a differing pair already at bit 0, while the greater variant has to scan all the way up to the most significant set bit. With an increasing number of bits (and therefore iterations), the computation takes proportionally longer, which confirms the O(n) complexity.

Find code samples here: github.com/CodingBerg/golang-interview.

By one coder

Reverse bits of 64-bit unsigned integer

7 September 2015

I have tested a few algorithms for reversing the bits of an unsigned integer. The fastest one uses a lookup table on small segments of the 64-bit number, and it runs in O(1) time.

For the sake of testing we used two more algorithms: the classic one based on the remainder after division by 2, and a more brute-force one based on string manipulation, which ought to be the slowest.
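
The function divreverse isn't listed in the post; a minimal sketch of the classic remainder-after-division-by-2 approach, assuming it reverses only the significant bits of the number (so that its output matches lookupreverse below):

// divreverse reverses the significant bits of n by repeatedly taking
// the remainder after division by 2 and appending it to the result.
func divreverse(n uint64) uint64 {
    var ret uint64
    for n > 0 {
        ret = ret<<1 | n&1 // append the lowest bit of n
        n >>= 1
    }
    return ret
}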

Look at the benchmarks:

BenchmarkDivReverse-4       100000000            17.8 ns/op
BenchmarkBruteReverse-4      3000000           554 ns/op
BenchmarkLookupReverse-4    300000000             5.55 ns/op

Function LookupReverse uses lookup table:

// lookupreverse reverses the significant bits of n, one byte at a time,
// using a table of precomputed byte reversals (lTable).
func lookupreverse(n uint64) uint64 {
    var ret uint64

    for n > 0 {
        bits := n & 0xff // take the lowest byte ...
        n >>= 8

        ret <<= 8
        ret |= uint64(lTable[bits]) // ... and append its reversed value
    }

    // Drop trailing zeros so that only the significant bits of the original
    // number are reversed (the ret > 0 guard avoids an endless loop for n == 0).
    for ret > 0 && ret&1 == 0 {
        ret >>= 1
    }

    return ret
}

The lookup table holds the reversed values of all 8-bit numbers. The function above might even suit numbers wider than 64 bits, because it doesn't rely on there being exactly eight 8-bit segments inside a 64-bit number (which could be coded without a loop). Instead, the loop runs as many iterations as needed until no 8-bit segment remains in the number. For a 64-bit number that is at most eight iterations, so the function is still of O(1) complexity.
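
The post doesn't show how lTable is built; a minimal sketch, assuming each entry holds the bit-reversed value of its index byte:

// lTable[b] holds the byte b with its eight bits reversed.
var lTable [256]uint8

func init() {
    for i := 0; i < 256; i++ {
        b := uint8(i)
        var r uint8
        for bit := 0; bit < 8; bit++ {
            r = r<<1 | b&1 // append the lowest bit of b
            b >>= 1
        }
        lTable[i] = r
    }
}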

The other two algorithms were useful in the test functions. First, we initialize the test table using one algorithm, and later we compare its results against the other algorithms. It is unlikely that independently implemented algorithms would all agree on a wrong result, so the reliability of the tests comes from different algorithms producing the same results.

// reverseTest pairs an input number with its expected reversed value.
type reverseTest struct {
    n, expected uint64
}

var reverseTests []reverseTest

func init() {
    // Build the test table with divreverse; the other algorithms are checked against it.
    for i := 0; i < 8196; i++ {
        number := uint64(rand.Int63())
        reverseTests = append(reverseTests, reverseTest{number, divreverse(number)})
    }
}

func TestBruteReverse(t *testing.T) {
    for _, tt := range reverseTests {
        actual := brutereverse(tt.n)
        if actual != tt.expected {
            t.Errorf("BruteReverese(%x) expected: %x actual: %x ", tt.n, tt.expected, actual)
        }
    }
}

func TestLookupReverse(t *testing.T) {
    for _, tt := range reverseTests {
        actual := lookupreverse(tt.n)
        if actual != tt.expected {
            t.Errorf("LookupReverese(%x) expected: %x actual: %x ", tt.n, tt.expected, actual)
        }
    }
}
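
The benchmark functions aren't shown in the post; a minimal sketch of one of them, assuming it just calls lookupreverse on an arbitrary fixed argument (the sink variable prevents the call from being optimized away):

var sink uint64

func BenchmarkLookupReverse(b *testing.B) {
    for i := 0; i < b.N; i++ {
        sink = lookupreverse(0xFEDCBA9876543210)
    }
}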

Find code samples here: github.com/CodingBerg/golang-interview.

By one coder

Swap bits of 64-bit unsigned integer

12 August 2015

In order to swap two bits at given positions (i, j) we have to perform two operations:

  • Check whether the two bits at indices (i, j) differ,
  • Toggle the values at indices (i, j) if they differ.

The above algorithm is of O(1) complexity.

The following function will do the work:

// swapbits swaps the bits of n at positions i and j.
func swapbits(n uint64, i uint, j uint) uint64 {
    if n>>i&1 != n>>j&1 {
        n ^= (1 << i) | (1 << j)
    }
    return n
}
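
A quick check of the function (the wrapper below is ours, not part of the original samples):

package main

import "fmt"

func main() {
    // 8 = (1000)₂; swapping bits 3 and 0 gives (0001)₂ = 1.
    fmt.Println(swapbits(8, 3, 0)) // prints 1
    // Bits that are equal are left untouched.
    fmt.Println(swapbits(9, 3, 0)) // prints 9
}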

Find code samples here: github.com/CodingBerg/golang-interview.

By one coder

Compute parity of 64-bit unsigned integer

21 July 2015

Introduction

I'll go through different ways to compute the parity of a 64-bit unsigned integer. To compute the parity we first need to find out whether the count of ones in the binary representation of the number is even or odd. To analyze algorithm efficiency we'll use four different functions. The first function is the easiest to write, but it's not very efficient. The last one takes longer to code, but it provides the best efficiency. Let's compare some benchmarks.

Computing functions

1) Function f1 converts the number to its binary string representation and counts the '1' characters in that string.

func f1(n uint64) int {
    bs := strconv.FormatUint(n, 2)
    return strings.Count(bs, "1")
}

2) Function f2 checks whether the last bit of the number is one and shifts the number right on every iteration until it becomes zero (no more ones), i.e. until the highest set bit has been shifted out.

func f2(n uint64) int {
    cnt := 0
    for n > 0 {
        if n&1 == 1 {
            cnt++
        }
        n = n >> 1
    }
    return cnt
}

3) Function f3 ANDs the number with itself minus one on each iteration, which clears the lowest set bit; the loop runs until the number becomes zero. For example, 12 (1100)₂ & 11 (1011)₂ = 8 (1000)₂. There are as many iterations as there are ones in the number.

func f3(n uint64) int {
    cnt := 0
    for n > 0 {
        n &= (n - 1)
        cnt++
    }
    return cnt
}

4) Function f4 uses a lookup table for all 256 byte values (0-255). Each table entry contains the count of ones of the corresponding number. The idea is to split the 64-bit integer into 8 bytes and to compute the final count as the sum of the ones found in each byte. The procedure always performs the same steps, regardless of the argument.

var bitCounts = []int8{
    0, 1, 1, 2, 1, 2, 2, 3,
    1, 2, 2, 3, 2, 3, 3, 4,
    1, 2, 2, 3, 2, 3, 3, 4,
    2, 3, 3, 4, 3, 4, 4, 5,
    1, 2, 2, 3, 2, 3, 3, 4,
    2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5,
    3, 4, 4, 5, 4, 5, 5, 6,
    1, 2, 2, 3, 2, 3, 3, 4,
    2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5,
    3, 4, 4, 5, 4, 5, 5, 6,
    2, 3, 3, 4, 3, 4, 4, 5,
    3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6,
    4, 5, 5, 6, 5, 6, 6, 7,
    1, 2, 2, 3, 2, 3, 3, 4,
    2, 3, 3, 4, 3, 4, 4, 5,
    2, 3, 3, 4, 3, 4, 4, 5,
    3, 4, 4, 5, 4, 5, 5, 6,
    2, 3, 3, 4, 3, 4, 4, 5,
    3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6,
    4, 5, 5, 6, 5, 6, 6, 7,
    2, 3, 3, 4, 3, 4, 4, 5,
    3, 4, 4, 5, 4, 5, 5, 6,
    3, 4, 4, 5, 4, 5, 5, 6,
    4, 5, 5, 6, 5, 6, 6, 7,
    3, 4, 4, 5, 4, 5, 5, 6,
    4, 5, 5, 6, 5, 6, 6, 7,
    4, 5, 5, 6, 5, 6, 6, 7,
    5, 6, 6, 7, 6, 7, 7, 8,
}

func f4(n uint64) int {
    var cnt int8
    cnt += bitCounts[n&255]
    cnt += bitCounts[n>>8&255]
    cnt += bitCounts[n>>16&255]
    cnt += bitCounts[n>>24&255]
    cnt += bitCounts[n>>32&255]
    cnt += bitCounts[n>>40&255]
    cnt += bitCounts[n>>48&255]
    cnt += bitCounts[n>>56&255]
    return int(cnt)
} // end-of-f4
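
All four functions return the count of ones; the parity itself is simply that count modulo 2. A minimal wrapper (ours, not part of the original samples):

// parity returns 0 if n has an even number of set bits, and 1 otherwise.
func parity(n uint64) int {
    return f4(n) & 1
}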

Benchmark

We run functions f1, f2, f3 and f4 with integer arguments of different sizes (0xF, 0xFF, 0xFFF, 0xFFFF, 0xFFFFF, 0xFFFFFF, 0xFFFFFFF, 0xFFFFFFFF). The number of ones in these arguments varies from 4 to 32. We are interested in how the different algorithm designs affect execution time as we vary the arguments (the number of ones to count).
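
The benchmark functions themselves aren't listed here; a minimal sketch of one of them, assuming each benchmark fixes one function and one argument (the result is stored in a package-level variable so the call is not optimized away):

var result int

func BenchmarkF4_FFFF(b *testing.B) {
    for i := 0; i < b.N; i++ {
        result = f4(0xFFFF)
    }
}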

After benchmarking we get these results:

go test -bench .|benchgraph -title="Graph1: Benchmark F(x) in ns/op" -obn="F2,F3,F4" -oba="F,FF,FFF,FFFF,FFFFF,FFFFFF,FFFFFFF,FFFFFFFF"

√ BenchmarkF1_F-4           20000000            54.1 ns/op
√ BenchmarkF1_FF-4          20000000            61.0 ns/op
√ BenchmarkF1_FFF-4         20000000            70.0 ns/op
√ BenchmarkF1_FFFF-4        20000000            78.9 ns/op
√ BenchmarkF1_FFFFF-4       20000000            90.6 ns/op
√ BenchmarkF1_FFFFFF-4      20000000            99.1 ns/op
√ BenchmarkF1_FFFFFFF-4     20000000           111 ns/op
√ BenchmarkF1_FFFFFFFF-4    10000000           137 ns/op
√ BenchmarkF2_F-4           500000000             3.57 ns/op
√ BenchmarkF2_FF-4          300000000             5.69 ns/op
√ BenchmarkF2_FFF-4         200000000             8.54 ns/op
√ BenchmarkF2_FFFF-4        100000000            10.5 ns/op
√ BenchmarkF2_FFFFF-4       100000000            12.9 ns/op
√ BenchmarkF2_FFFFFF-4      100000000            14.7 ns/op
√ BenchmarkF2_FFFFFFF-4     100000000            16.4 ns/op
√ BenchmarkF2_FFFFFFFF-4    100000000            18.4 ns/op
√ BenchmarkF3_F-4           500000000             3.47 ns/op
√ BenchmarkF3_FF-4          300000000             5.55 ns/op
√ BenchmarkF3_FFF-4         200000000             7.67 ns/op
√ BenchmarkF3_FFFF-4        200000000             9.86 ns/op
√ BenchmarkF3_FFFFF-4       100000000            12.8 ns/op
√ BenchmarkF3_FFFFFF-4      100000000            14.2 ns/op
√ BenchmarkF3_FFFFFFF-4     100000000            16.2 ns/op
√ BenchmarkF3_FFFFFFFF-4    100000000            18.3 ns/op
√ BenchmarkF4_F-4           300000000             5.54 ns/op
√ BenchmarkF4_FF-4          300000000             5.59 ns/op
√ BenchmarkF4_FFF-4         300000000             5.55 ns/op
√ BenchmarkF4_FFFF-4        300000000             5.54 ns/op
√ BenchmarkF4_FFFFF-4       300000000             5.54 ns/op
√ BenchmarkF4_FFFFFF-4      300000000             5.59 ns/op
√ BenchmarkF4_FFFFFFF-4     300000000             5.55 ns/op
√ BenchmarkF4_FFFFFFFF-4    300000000             5.53 ns/op

And for another set of arguments (0xF, 0xF0, 0xF00, 0xF000, 0xF0000, 0xF00000, 0xF000000, 0xF0000000), all having the same number of ones set at different positions:

go test -bench .|benchgraph -title="Graph2: Benchmark F(x) in ns/op" -obn="F2,F3,F4" -oba="0F,F0,F00,F000,F0000,F00000,F000000,F0000000"

√ BenchmarkF1_0F-4          30000000            53.6 ns/op
√ BenchmarkF1_F0-4          20000000            61.8 ns/op
√ BenchmarkF1_F00-4         20000000            70.4 ns/op
√ BenchmarkF1_F000-4        20000000            79.2 ns/op
√ BenchmarkF1_F0000-4       20000000            93.2 ns/op
√ BenchmarkF1_F00000-4      20000000           101 ns/op
√ BenchmarkF1_F000000-4     20000000           117 ns/op
√ BenchmarkF1_F0000000-4    10000000           129 ns/op
√ BenchmarkF2_0F-4          500000000             3.50 ns/op
√ BenchmarkF2_F0-4          200000000             7.91 ns/op
√ BenchmarkF2_F00-4         100000000            10.8 ns/op
√ BenchmarkF2_F000-4        100000000            14.1 ns/op
√ BenchmarkF2_F0000-4       100000000            17.2 ns/op
√ BenchmarkF2_F00000-4      50000000            24.2 ns/op
√ BenchmarkF2_F000000-4     50000000            26.5 ns/op
√ BenchmarkF2_F0000000-4    50000000            28.9 ns/op
√ BenchmarkF3_0F-4          500000000             3.48 ns/op
√ BenchmarkF3_F0-4          500000000             3.47 ns/op
√ BenchmarkF3_F00-4         500000000             3.50 ns/op
√ BenchmarkF3_F000-4        500000000             3.50 ns/op
√ BenchmarkF3_F0000-4       500000000             3.46 ns/op
√ BenchmarkF3_F00000-4      500000000             3.43 ns/op
√ BenchmarkF3_F000000-4     500000000             3.45 ns/op
√ BenchmarkF3_F0000000-4    500000000             3.52 ns/op
√ BenchmarkF4_0F-4          300000000             5.67 ns/op
√ BenchmarkF4_F0-4          300000000             5.62 ns/op
√ BenchmarkF4_F00-4         300000000             5.55 ns/op
√ BenchmarkF4_F000-4        300000000             5.61 ns/op
√ BenchmarkF4_F0000-4       300000000             5.57 ns/op
√ BenchmarkF4_F00000-4      300000000             5.54 ns/op
√ BenchmarkF4_F000000-4     300000000             5.54 ns/op
√ BenchmarkF4_F0000000-4    300000000             5.58 ns/op

We will ignore the f1 benchmarks because its results are significantly worse than those of f2, f3, and f4. Function f1 was just lazily written code.

Let's look at the benchgraph visualization of the benchmark results (Graph1 and Graph2):

Conclusion

From Graph1 above we conclude that functions f2 and f3 require more CPU time as the size of the integer argument (the number of ones) increases. Function f4 implements an algorithm of O(1) complexity: no matter the size of the argument, it takes approximately the same time to complete. That's a dream for algorithm designers.

The interesting part is what happens when we take the same number of ones at different positions. On Graph2 we can see that f2 depends on the positions of the ones, while f3 and f4 don't; both execute in constant time here, since f3's running time depends only on how many bits are set, not on where they are. In some cases f3 is even faster than f4, but in general it still implements the less efficient O(n) algorithm.

Find code samples here: github.com/CodingBerg/golang-interview.

For benchmark visualization we have used github.com/CodingBerg/benchgraph.

By one coder
