Wednesday, 23 July 2014

Dynamic programming - Heavy Numbers (Part II)

In my last blog entry, Dynamic programming - Heavy Numbers (Part I), a solution is presented to find how many heavy number given a range via dynamic programming. The key idea is to formulate the sub-problem and then solve it recursively. And a few optimization techniques are employed as well, for instance caching the result and using the pre-calculated results. Here I would like discuss a more direct way to tackle this problem. And this servers my last discussion on heavy number.

1. Analysis
Let's see what the heavy numbers share in common. Here is the list of all heavy numbers within [10, 99] (Sum of digits is 15, 16, 17, 18)
    - 69, 78, 79, 87, 88, 89, 96, 97, 98, 99
The let's see the combinations of heavy numbers.
    - 69, 96: {6, 9}
    - 78, 87: {7, 8}
    - 79, 97: {7, 9}
    - 88:       {8, 8}
    - 89, 98: {8, 9}
    - 99        {9, 9}
Seems to me that we can direct compute how many permutations of numbers if we can find all the possible combinations that meet the requirement of heavy number. For instance find out all the possible combination of {6, 9}, {7, 8}, {7, 9}, {8, 8}, {8, 9} and {9, 9}, then the permutations of the possible combinations are the answer and can be easily computed.

Let's re-arrange the possible combinations and take a close look at them.
    - 99      : {9, 9}
    - 98, 89: {9, 8}
    - 97, 79: {9, 7}
    - 96, 69: {9, 6}
    - 88      : {8, 8}
    - 87, 78: {8, 7}
Here are the all the heavy number should share:
    - At least one of digits should be > 7, namely 8, 9
    - A number consisting of only digits [0, 7] is not a heavy number
        * 0~7 is not heavy number
        * 66, 77 are not either
        * 666, 777 are not either
        ......
    - Narrow the range to search the combinations given n digits
        * 1 digit: [8, 9]
        * 2 digits: [87, 99]
        * 3 digits: [877, 999]
        * 4 digits: [8777, 9999]
        ......
    - Continue to narrow: each digit should be <= it direct left digit, except the left most digit.
        * 1 digit: [8, 9]
        * 2 digits: [87, 88], [90, 99]
        * 3 digits: [990, 999], [980, 988], [970, 977], [960, 966], [950, 955], [940, 944], ..., [900, 900]
                        [880, 888], [870, 877], ..., [810, 811], [800, 800]
         ......

2. Formulation of sub-problems
Based on the analysis on last section let's take a look at the sub-problem as I introduced in Dynamic programming - Heavy Numbers (Part I). Comparing with Section 2 on  Dynamic programming - Heavy Numbers (Part I) the only difference resides on "S sides", which is to reflect the narrowing space we are going to search.
    - S sides:
        * The top digit: only two sides - 9 and 8
        * The rest of digits: given its left digits value x (0 <= x <=9), it has (x+1) sides and they are  are 0, 1, 2, ..., x (not 0~9 any more)

It is implemented in function GetDigitsCombinatonsGivenSumOfNdigits. Please see Section 4. And of course sub-problem in this approach is to find out the possible combinations in the search space however in my last blog,  Dynamic programming - Heavy Numbers (Part I), it is to return how many heavy numbers. Another difference is that in this approach it is no way to  catch the previous result, because the narrowing search space is to guarantee that each combination will be unique.  

3. Calculate the permutation given a combination
Let's take a look at how to calculate how many  heavy numbers given a particular combination. The calculation will be a bit different if a combination has digit "0" within it. Because a number can not start with the digit "0".

Given a combination that none of its digits is "0".
    permutations = factorial(nDigits)/(factorial(repeatOfOne) * factorial(repeatOfTwo)* ...                                                                                     *factorial(repeatOfNine))
    where:
        - factorial(n) = n * (n - 1) * (n - 2) * ... * 2 * 1
        - "repeatOfX": occurrence of digit x, [1, 9] in this combination, can be 0, 1, ..., nDigits
        - Define factorial(0) = 1 for our computation

Given a combination that at least one of elements is "0".
The difference is that the total possible permutation will reduce. Assume that there are repeatOfZero "0". Then there are (nDigits - repeatOfZero) digits > 0.
Hence the total possible permutation will reduce
    from factorial(nDigits) to (nDigits - repeatOfZero) * factorial(nDigits - 1).
Therefore the formula can be rewritten as
    permutations = (nDigits - repeatOfZero) * factorial(nDigits - 1) /
                          (factorial(repeatOfZero) * factorial(repeatOfOne) * ... * factorial(repeatOfNine))

4. Implementation
The base class implementation of HeavyNumberImpPolynomial, please refer to my last blog, Dynamic programming - Heavy Numbers (Part I). HeavyNumberImpPermutation has a virtual implementation of function GetNumOfHeavyNumbersAgainstNdigits and shares the rest the API with its base class.

//********************************************************************************
// HeavyNumberImpPermutation.h
#ifndef HEAVY_NUMBER_IMP_PERMUTATION_H
#define HEAVY_NUMBER_IMP_PERMUTATION_H

#pragma once

#include "HeavyNumberImpPolynomial.h"

#include <deque>

class HeavyNumberImpPermutation : public HeavyNumberImpPolynomial
{
public:
HeavyNumberImpPermutation();
~HeavyNumberImpPermutation();

size_t GetNumOfHeavyNumbersAgainstNdigits(long nDigits);

private:
void GetDigitsCombinatonsGivenSumOfNdigits(long nDigits,
  long nSides,
  long sum,
  std::deque<int>& digits,
  std::vector<std::vector<int>>& digitsCombination);
size_t GetNumOfHeavyNumberGivenDigitsCombinations(long nDigits,
const std::vector<std::vector<int>>& digitsCombination);
double Factorial(int val);
};

#endif

// HeavyNumberImpPermutation.cxx
#include "stdafx.h"
#include "HeavyNumberImpPermutation.h"


HeavyNumberImpPermutation::HeavyNumberImpPermutation()
{
}


HeavyNumberImpPermutation::~HeavyNumberImpPermutation()
{
}


size_t HeavyNumberImpPermutation::GetNumOfHeavyNumbersAgainstNdigits(long nDigits)
{
if (nDigits < 0) {
return 0;
}

const size_t PRE_COMPUTED_NUM_OF_HV[9] = { 0, 2, 10, 56, 330, 2001, 12313, 76470,                                                                                                      478302 };
if (nDigits <= 8) {
return PRE_COMPUTED_NUM_OF_HV[nDigits];
}

const size_t heaviness = nDigits * AVERAGE_VALUE_OF_HEAVY_NUM + 1;
const long nSidesUpperBound = 9;
const size_t nSideLowerBound = 7;
std::deque<int> digits;
std::vector<std::vector<int>> digitsCombinations(10);
for (size_t topDigits = nSidesUpperBound; topDigits > nSideLowerBound; --topDigits) {
digits.clear();
digits.push_back(topDigits);
GetDigitsCombinatonsGivenSumOfNdigits(nDigits - 1, topDigits, heaviness - topDigits,
 digits, digitsCombinations);
}

return GetNumOfHeavyNumberGivenDigitsCombinations(nDigits, digitsCombinations);
}

void HeavyNumberImpPermutation::GetDigitsCombinatonsGivenSumOfNdigits(
long nDigits,
long nSides,
long sum,
std::deque<int>& digits,
std::vector<std::vector<int>>& digitsCombination)
{
if (sum > nSides * nDigits) {
return;
}

if (sum <= 0 && nDigits == 0) {
for (size_t index = 0; index < digitsCombination.size(); ++index) {
digitsCombination[index].push_back(0);
}
size_t size = digitsCombination[0].size();
for (size_t val, index = 0; index < digits.size(); ++index) {
val = digits[index];
++digitsCombination[val].at(size - 1);
}
return;
}

size_t count = 0;
for (long val = nSides; val >= 0; --val) {
digits.push_back(val);
GetDigitsCombinatonsGivenSumOfNdigits(nDigits - 1, val, sum - val, 
                               digits, digitsCombination);
digits.pop_back();
    }
}

size_t HeavyNumberImpPermutation::GetNumOfHeavyNumberGivenDigitsCombinations(
long nDigits,
const std::vector<std::vector<int>>& digitsCombination)
{
double factorailNDigits = Factorial(nDigits);
size_t numOfHM = 0;
double permutation;
for (size_t index = 0; index < digitsCombination[0].size(); ++index) {
permutation = digitsCombination[0].at(index) == 0?
factorailNDigits : (nDigits - digitsCombination[0].at(index)) * Factorial(nDigits - 1);
for (size_t digit = 0; digit < 10; ++digit) {
permutation /= Factorial(digitsCombination[digit].at(index));
}
numOfHM += permutation;
}

return numOfHM;
}

double HeavyNumberImpPermutation::Factorial(int val)
{
if (val <= 1) {
return 1.0;
}

double factorial = 1.0;
while (val) {
factorial *= val;
--val;
}

return factorial;
}

// Test
void Test() {
size_t numOfHN;
HeavyNumberImpPermutation hnipe;
numOfHN = hnipe.GetNumOfHeavyNumbersAgainstNdigits(1);
numOfHN = hnipe.GetNumOfHeavyNumbersAgainstNdigits(2);
numOfHN = hnipe.GetNumOfHeavyNumbersAgainstNdigits(3);
numOfHN = hnipe.GetNumOfHeavyNumbersAgainstNdigits(4);
numOfHN = hnipe.GetNumOfHeavyNumbersAgainstNdigits(5);
numOfHN = hnipe.GetNumOfHeavyNumbersAgainstNdigits(6);
numOfHN = hnipe.GetNumOfHeavyNumbersAgainstNdigits(7);
numOfHN = hnipe.GetNumOfHeavyNumbersAgainstNdigits(8);
numOfHN = hnipe.GetNumOfHeavyNumbersAgainstNdigits(9);
numOfHN = hnipe.GetNumOfHeavyNumbersAgainstNdigits(10);
}
//********************************************************************************

5. Summary
I did some very simple performance comparison between the two solutions in this blog and the last, on function GetNumOfHeavyNumbersAgainstNdigits. Lets's say the solution on last blog as Solution 1 and this blog as Solution 2. Here comes the result.
    Digits                     Solution 1 (millisecond)                Solution 2 (millisecond)
       9                                 0.000                                          0.001
      10                                0.002                                          0.002
      11                                0.006                                          0.003
      12                                0.011                                          0.005
      13                                0.025                                          0.009
      14                                0.042                                          0.011
      15                                0.069                                          0.017
      16                                0.111                                          0.025
(Environment: Microsoft Visual Studio Express 2013, Intel Core i3-3227U @1.9G, 6G RAM)

Solution 2 starts to outperform Solution 1 at digits 11. And when the number of digits increases, Solution 2 performs even better than Solution 1. I think the real advantage of Solution 2 is the narrowed search space. Even though the Solution 1 has caching facility, it impact might be balanced with it overhead as well.

These two solutions are both proven much much faster the simplest way - search the whole range, get sum of digits, calculate the average and decide if it is a heavy number. This proves very expensive,  and it takes 25.022 seconds and 311.951 seconds for 9 and 10 digits number respectively.

Bibliography:
[1] Donald E. Knuth "The Art of Computer Programming", Volume 1, Fundamental Algorithms, Third Edition
[2] Donald E. Knuth "The Art of Computer Programming", Volume 2, Seminumerical Algorithms, Third Edition

No comments:

Post a Comment