Wednesday 25 February 2015

C++11 Features - Object Creation with "()", "=" and "{}"

1. Functions called underneath
Parentheses "()" - call a constructor
Equals sign "=" - what it calls depends on the left-hand side
    - Left side is a declaration, then it calls a constructor/copy constructor
        class A; // a class with a constructor A(int)
        A a = 1; // A has a constructor that takes an integer, so this calls that constructor
        A b = a; // copy constructor
    - Left side is an existing variable, then it calls the assignment operator
        A a = 1;
        A b = 2;
        b = a; // assignment operator
Braces "{}" - call a constructor
    - If an initializer-list constructor is defined and matches the call, then the initializer-list constructor is called
    - If an initializer-list constructor is defined but does not match the call, then a normal constructor is called
    - If no initializer-list constructor is defined, then a normal constructor is called
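
A minimal sketch of how the compiler chooses, loosely following the Widget example in Effective Modern C++ (the class names here are purely illustrative):
#include <initializer_list>
#include <string>

class Widget {
public:
    Widget(int i, bool b) {}                     // normal constructor
    Widget(std::initializer_list<double> il) {}  // initializer-list constructor
};

class Gadget {
public:
    Gadget(int i, bool b) {}                          // normal constructor
    Gadget(std::initializer_list<std::string> il) {}  // initializer-list constructor
};

Widget w1(10, true); // "()": calls the normal constructor
Widget w2{10, true}; // "{}": 10 and true convert to double, so the initializer-list constructor wins
Gadget g1{10, true}; // "{}": int/bool cannot convert to std::string, so it falls back to the normal constructor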

2. Three categories of initialization
Default value of class member variables
Read this for more, C++11 features - Improvement on object construction.
class Foo {
    int mX = 0; //  Ok
    double mY{0.0}; //Ok
    char mZ('z'); // Not ok
};
In the initialization of default values of class member variables, both "=" and "{}" work but "()" does not. Even though initializing member variables this way is not recommended when the class has a few other constructors (because of the maintenance problem), it is good to use for classes that do not have a user-defined constructor. One advantage of this technique is that developers do not need to worry about the order in which members appear in the member-initializer list of a constructor. Normally that order has to comply with the order of the member variable declarations, otherwise a warning is generated and some subtle errors can occur if a dependency exists between members.
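
A hedged sketch of the ordering pitfall just mentioned (the Buffer class is made up for illustration). Members are initialized in declaration order, not in the order they appear in the member-initializer list:
class Buffer {
    int* mData;   // declared first, so initialized first
    int  mSize;   // declared second, so initialized second
public:
    // The initializer list is written "mSize first", but mData(new int[mSize]) actually
    // runs first and reads an uninitialized mSize - exactly the kind of subtle error
    // (plus the compiler warning) mentioned above.
    Buffer(int size) : mSize(size), mData(new int[mSize]) {}
    ~Buffer() { delete[] mData; }
};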

Initialization at declaration
When declaring variables, initialize them with default values.
void Foo()
{
    int mX = 0; //  Ok
    double mY{0.0}; //Ok
    char mZ('z'); // Ok
}
All three work fine in this case. Just be careful using "{}" together with "auto": in C++11 the deduced type is "std::initializer_list<T>".
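
A short sketch of that "auto" pitfall:
#include <initializer_list>

void Bar()
{
    auto a = 0;   // int
    auto b(0);    // int
    auto c = {0}; // std::initializer_list<int> in C++11, not int
    auto d{0};    // std::initializer_list<int> in C++11 as well
}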

Initialization of non-copyable types
Non-copyable types:
    - atomic variables in C++11
    - Any user-defined type that has a private/deleted copy constructor/assignment operator. Read this for more, C++11 features - Explicitly defaulted/deleted member functions
The following example is from Scott Meyers' book - Effective Modern C++
std::atomic<int> a{0}; // Ok
std::atomic<int> b(0); // Ok
std::atomic<int> c = 0; // Not ok
This is a reasonable and logical consequence of what we discussed in the first section: "=" calls either the copy constructor or the assignment operator, which are exactly the operations suppressed in non-copyable classes.
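
The same applies to any user-defined non-copyable type, for example (a minimal sketch):
class NonCopyable {
public:
    NonCopyable(int) {}
    NonCopyable(const NonCopyable&) = delete;
    NonCopyable& operator=(const NonCopyable&) = delete;
};

NonCopyable x{0};  // Ok
NonCopyable y(0);  // Ok
NonCopyable z = 0; // Not ok: "=" needs the (deleted) copy constructor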

Only "{}" work for all 3 initialization scenarios. Probably why it is called uniform initialization.

3. Pitfalls of braces "{}"
Braces "{}" is the only one that works for the 3 cases described in the second section. Besides braces "{}" fixes quite a few other things in C++98, (read C++11 - Initializer-list (Part I) for more)
    - Default value initialization of STL, like std::vector
    - Most vexing parsing problem

But it is not perfect either. It has overload-resolution problems with other constructors. If a constructor whose first argument is an initializer-list exists, the compiler will always try to match it first. As long as a conversion/promotion can take place and match the initializer-list constructor, it will be chosen over any other, even though there might be better choices. Another issue is that braces do not allow narrowing conversions in variable initialization. Read C++11 - Initializer-list (Part II).

Watch out for classes with an initializer-list constructor: using "()" or "{}" to initialize variables may call two different constructors. For instance std::vector: std::vector<int> a(10, 10) and std::vector<int> b{10, 10} are very different.
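
A small sketch to make the difference concrete:
#include <cassert>
#include <vector>

void VectorBraceVsParen()
{
    std::vector<int> a(10, 10); // non-initializer-list constructor: 10 elements, each equal to 10
    std::vector<int> b{10, 10}; // initializer-list constructor: 2 elements, 10 and 10
    assert(a.size() == 10);
    assert(b.size() == 2);
}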

Bibliography:
[1] Scott Meyers, "Effective Modern C++", O'Reilly, 2014 (First Release)
[2] Bjarne Stroustrup, "The C++ Programming Language", 4th Edition (C++11)
[3] C++ Reading Club @ IG Group

Tuesday 24 February 2015

Data Structure and Algorithm - Max the sum of elements times indices (Yahoo Interview Question)

1. Problem Description
This is an interview question from Yahoo for Software Engineer on careercup. Here is the original thread,



2. Analysis
Just as the owner of the thread described, the interviewer hinted that there is a better solution than his initial one. A sub-problem can be derived, which can then be used in a recursion or a loop.

Assume given an array S = {S1, S2, S3, ..., Sn}
SUM = S1 + S2 + ... + Sn
F(1) = 1*S1 + 2*S2 + ... + n*Sn
// shift left
F(2) = n*S1 + 1*S2 + ... + (n-1)*Sn

F(2) - F(1) = (n-1)*S1 - S2 - ... - Sn
            = (n-1)*S1 - (S2 + S3 + ... + Sn)
            = n*S1 - (S1 + S2 + ... + Sn)
            = n*S1 - SUM
F(3) = (n-1)*S1 + n*S2 + 1*S3 + ... + (n-2)*Sn
F(3) - F(2) = (n-1)*S2 - (S1 + S3 + ... + Sn)
            = n*S2 - (S1 + S2 + S3 + ... + Sn)
            = nS2 - SUM

By induction, F(k+1) - F(k) = n*S(k) - SUM for k within [1, n-1], and in particular
F(n) - F(n-1) = n*S(n-1) - SUM

And this solution has computation complexity of O(N) and space complexity of O(1).

3. C++ Implementation
Non-recursive implementation
// ********************************************************************************
// IMPLEMENTATION
// ********************************************************************************
#include <vector>

size_t MaxSumOfElementsTimeIndicesViaRotation(const std::vector<size_t>& input)
{
    // F(1): the sum of each element times its 1-based index, and SUM: the sum of all elements.
    size_t fPrev = 0;
    size_t sum = 0;
    size_t index = 1;
    for (auto x : input) {
        fPrev += x*index;
        sum += x;
        ++index;
    }

    // Apply the recurrence F(k+1) = F(k) + n*S(k) - SUM for every rotation and keep the maximum.
    size_t maxVal = fPrev;
    const size_t SizeOfInput = input.size();
    size_t fCur;
    for (index = 1; index < SizeOfInput; ++index) {
        fCur = fPrev + input[index - 1] * SizeOfInput - sum;
        if (fCur > maxVal) {
            maxVal = fCur;
        }
        fPrev = fCur;
    }

    return maxVal;
}
// ********************************************************************************
// TEST
// ********************************************************************************
#include <cassert>
void TestCases()
{
    {
        std::vector<size_t> input = { 1, 2, 3, 4, 5, 6, 7 };
        assert(MaxSumOfElementsTimeIndicesViaRotation(input) ==
            (1 * 1 + 2 * 2 + 3 * 3 + 4 * 4 + 5 * 5 + 6 * 6 + 7 * 7));
    }

    {
        std::vector<size_t> input = { 2, 3, 4, 5, 6, 7, 1 };
        assert(MaxSumOfElementsTimeIndicesViaRotation(input) ==
            (1 * 1 + 2 * 2 + 3 * 3 + 4 * 4 + 5 * 5 + 6 * 6 + 7 * 7));
    }

    {
        std::vector<size_t> input = { 3, 4, 5, 6, 7, 1, 2 };
        assert(MaxSumOfElementsTimeIndicesViaRotation(input) ==
            (1 * 1 + 2 * 2 + 3 * 3 + 4 * 4 + 5 * 5 + 6 * 6 + 7 * 7));
    }

    {
        std::vector<size_t> input = { 4, 5, 6, 7, 1, 2, 3 };
        assert(MaxSumOfElementsTimeIndicesViaRotation(input) ==
            (1 * 1 + 2 * 2 + 3 * 3 + 4 * 4 + 5 * 5 + 6 * 6 + 7 * 7));
    }

    {
        std::vector<size_t> input = { 5, 6, 7, 1, 2, 3, 4 };
        assert(MaxSumOfElementsTimeIndicesViaRotation(input) ==
            (1 * 1 + 2 * 2 + 3 * 3 + 4 * 4 + 5 * 5 + 6 * 6 + 7 * 7));
    }

    {
        std::vector<size_t> input = { 6, 7, 1, 2, 3, 4, 5 };
        assert(MaxSumOfElementsTimeIndicesViaRotation(input) ==
            (1 * 1 + 2 * 2 + 3 * 3 + 4 * 4 + 5 * 5 + 6 * 6 + 7 * 7));
    }

    {
        std::vector<size_t> input = { 7, 1, 2, 3, 4, 5, 6, };
        assert(MaxSumOfElementsTimeIndicesViaRotation(input) ==
            (1 * 1 + 2 * 2 + 3 * 3 + 4 * 4 + 5 * 5 + 6 * 6 + 7 * 7));
    }
}
// ********************************************************************************

Wednesday 18 February 2015

C++11 Features - noexcept

Continuing with my last two discussions about exceptions, Catching exception and C++11 Features - Exception: function-try-block, I would like to give exceptions another section, because they are so important and troublesome. This topic is more or less a summary of Item 14 of Scott Meyers' book - Effective Modern C++, plus some good practice from my experience.

1. Syntax in C++98 and C++11
In C++98 "throw(...)" is the syntax
    - throw(...), for instance throw(std::exception): throws only std::exception plus any derived exception
    - throw(): do not throw
    - function without any notation as "void foo();": function foo() can throw anything or not throwing
The exception specification is part of the function declaration, and compilers vigorously check the compatibility of exception specifications between caller and callee, and between a base virtual function and its override in a derived class. This matters because
    - Throwing an exception from a function declared non-throwing in C++98 causes the program to terminate
    - Throwing an exception that is not listed in the exception specification causes the program to terminate
    - An override whose exception specification is less restrictive than that of the base virtual function does not compile

All these drawbacks and hassles make the C++98 exception specification rarely practical, and most function declarations end up without any exception specification, like void foo(), which can throw anything or not throw at all. This causes another issue - there is rarely a good convention telling other developers whether a function throws just by reading the header file. I can imagine how many times exceptions are not caught and crash the program during QA.

In C++11, noexcept is introduced to replace throw(...).
In recognition that the C++98 exception specification is neither useful nor practical, C++11 introduced noexcept to express that a function either may throw or does not throw at all. There is no "specific exception" that can be listed as with throw(). In C++11 there are only two choices - may throw, or never throws. The syntax follows,
    - void foo() noexcept;
    - void bar() noexcept(false);
    - plain noexcept means noexcept(true) by default
    - noexcept can take a bool expression that can be evaluated at compile time. See the examples in Effective Modern C++.
    - a function without any exception specification has the same meaning as in C++98: it may throw anything or not throw at all.
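
A small sketch of the C++11 forms (MySwap is only an illustration of the conditional form, not a standard facility):
#include <type_traits>
#include <utility>

void foo() noexcept;        // promises not to throw, same as noexcept(true)
void bar() noexcept(false); // may throw
void baz();                 // no specification: may throw or not, as in C++98

// Conditional noexcept: non-throwing exactly when moving T cannot throw.
template <typename T>
void MySwap(T& a, T& b) noexcept(std::is_nothrow_move_constructible<T>::value &&
                                 std::is_nothrow_move_assignable<T>::value)
{
    T tmp(std::move(a));
    a = std::move(b);
    b = std::move(tmp);
}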

As C++11 still upholds that the exception specification is part of the function declaration, the compatibility between caller and callee, and between base and derived virtual functions, is still checked by compilers. But in C++11 the exception specification has only two choices, hence there are only 4 combinations after crossing them over. Compatibility is checked when narrowing from callee to caller and from derived virtual to base virtual:
    - Narrowing from noexcept(false) to noexcept(true) is not allowed (for instance, an override of a noexcept base virtual function cannot be noexcept(false))
    - The remaining combinations are allowed

By the way, the C++98 syntax is still allowed in C++11 for backward compatibility, although it is deprecated.
 
2. Optimization
As Scott Meyers states in his book, C++98 guarantees that when an exception specification is violated - for instance an exception leaves a function with a no-throw exception specification - the callee's stack is unwound to its caller and the program then terminates. C++11, however, does not guarantee that the stack unwinding will definitely happen - it only "possibly" happens. This means C++98 compilers must generate code that keeps the runtime stack in an unwindable state and makes sure the objects on the stack are destroyed in reverse order, whereas C++11 does not. Therefore C++11 does not need to generate that code for noexcept/noexcept(true) functions and can possibly generate less and more efficient code.

Another potential performance boost comes from move semantics, for instance in the standard containers and some standard algorithms like std::swap. They will only use an object's move operations when the relevant constructors (and the destructor) are known not to throw, i.e. declared noexcept.
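
A hedged sketch of the std::vector case (the Widget class is made up for illustration). std::vector only moves a copyable element type during reallocation when its move constructor is noexcept (via std::move_if_noexcept); otherwise it falls back to copying to preserve the strong exception guarantee:
#include <utility>
#include <vector>

class Widget {
public:
    Widget() = default;
    Widget(const Widget&) = default;
    Widget& operator=(const Widget&) = default;
    // Declaring the move constructor noexcept is what lets std::vector move
    // instead of copy the existing elements when it reallocates.
    Widget(Widget&& other) noexcept : mData(std::move(other.mData)) {}

private:
    std::vector<int> mData;
};

void Grow()
{
    std::vector<Widget> v(1000);
    v.push_back(Widget{}); // may reallocate; existing elements are moved, not copied
}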

3. Destructor
In C++98 it is regarded as common/good practice not to throw exceptions from a destructor, because encountering another exception during the stack unwinding caused by a first exception terminates the program. The destructor is where an object releases itself and returns resources back to the system, and all member variables' destructors are called in the reverse order of their construction. If any two of them throw exceptions and let them leave their destructors, the disaster described above happens - the program is brought down.

In C++11 this common practice has been promoted to a language rule - memory deallocation functions and destructors are by default implicitly declared "noexcept", whether user-defined or compiler-generated. When all member variables have noexcept destructors, the class implicitly has a noexcept destructor as well. If not all member variables have noexcept destructors, then the class's destructor is implicitly not noexcept either, because the exception specification cannot narrow from noexcept(false) to noexcept(true). And the behavior is undefined if objects with a noexcept(false) destructor are used with the standard containers and algorithms.

At the same time C++11 still allows a destructor to be declared as throwing with the exception specification "noexcept(false)", just as C++98 allowed destructors to throw. (I have never done this. One occasion reminded me of it, when I was developing a database-related application: the wrapper class of a DB connection could throw from its destructor if it failed to terminate the connection with the DB server. But I soon abandoned that idea in favour of a Close() member function that terminates the DB connection and throws an exception to notify the program, leaving the destructor non-throwing.)
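
A hedged sketch of that Close() pattern (DbConnection and its internals are purely illustrative):
#include <stdexcept>

class DbConnection
{
public:
    // May throw: callers decide how to react to a failed disconnect.
    void Close()
    {
        if (!Disconnect()) {
            throw std::runtime_error("failed to terminate the DB connection");
        }
        mClosed = true;
    }

    // The destructor stays non-throwing: it tries to close but never lets an exception escape.
    ~DbConnection()
    {
        if (!mClosed) {
            try { Close(); } catch (...) { /* log and swallow */ }
        }
    }

private:
    bool Disconnect() { return true; } // placeholder for the real API call
    bool mClosed = false;
};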

4. STL
A lot of C++11 features were introduced to improve performance, and many of them actually target the performance of the STL. One of the key issues is that many STL algorithms and functions involve copying/moving objects. Features like move semantics and rvalue references were introduced to solve this issue, and with them the STL employs move semantics rather than copy semantics where possible. When the STL decides between copy and move, one of the principles is to check whether the object's destructor and the relevant constructors are exception free. For more details please refer to Chapter 5 of Effective Modern C++.

5. Practical Use
Use noexcept where you can but do not overuse it. Just as Scott Meyers suggests in his book, declaring an API function noexcept is a strong commitment. Declare functions noexcept only if you are absolutely sure and committed.

Rarely declare a destructor as noexcept(false), because it could crash the program. And keep in mind not to use such objects with STL containers or algorithms.

One good example is to declare the functions of a POD class (refer to Plain Old Data (POD) on C++11) as noexcept as much as possible, and to define move operations as well if POD objects are used with the STL, in order to engage the potential performance boost.

Use noexcept(...) on functions that call API functions that have an exception specification. Keep the exception specifications compatible and notify the users that the function may throw.

6. Things Remaining Unsolved
In C++98 one of the issues I struggled with most is how to write a function declaration that tells whether the function throws exceptions. As we discussed above, the exception specification is rarely practical in C++98. In the last few years I have discussed with my colleagues on many occasions how to clearly express whether a function throws, but we never reached an agreement. We tried
    - Using documentation/comments: the drawback is maintainability
    - Using a different naming convention (a suffix like _throw) for throwing functions: the drawback is that the suffix has to be added to functions across the entire call graph, which is simply too much effort.

This issue remains in C++11, because functions without any exception specification can either throw or not throw. This uncertainty will eventually catch us out and the program will crash because of uncaught exceptions. Mistakes are easily made, especially when there is no agreed convention in function declarations to distinguish throwing from non-throwing functions. In my personal experience I have been caught quite a few times, especially in a large code base or an application involving 3rd-party software.

The Google C++ Style Guide recommends not using exceptions at all, mainly because of its legacy code. I don't agree with this. What I am suggesting follows,
    - Agree on rules for function declarations that tell throwing from non-throwing functions
    - Use exceptions only for user-action related errors
    - Use error code for the rest
    - All user-defined exceptions derived from std::exception

Bibliography:
[1] Scott Meyers, "Effective Modern C++", O'Reilly, 2014 (First Release)
[2] Bjarne Stroustrup, "The C++ Programming Language", 4th Edition (C++11)
[3] C++ Reading Club @ IG Group

Wednesday 11 February 2015

Dynamic Programming - Minimize the Number of Square Numbers Sum to N (Google Interview Question)

1. Problem Description
This is a Google interview question for Software Developer/Engineer on careercup. Here is the original thread.

"

You are given a positive integer number N. Return the minimum number K, such that N can be represented as K integer squares.
Examples:
9 --> 1 (9 = 3^2)
8 --> 2 (8 = 2^2 + 2^2)
15 --> 4 (15 = 3^2 + 2^2 + 1^2 + 1^2)
First reach a solution without any assumptions, then you can improve it using this mathematical lemma: For any positive integer N, there is at least one representation of N as 4 or less squares.
"
2. Analysis
Obviously this is a minimization problem and its objective is to find the minimum K such that a given number can be written as the sum of K square numbers. Let's consider the best case and the worst case of K.

Given a number N, the best case of K is 1, when N is a square number, which means there exists an integer x that meets x*x = N.

Then what is the worst case of K? The easiest one popping up in my mind is K equal to N, where all the square numbers are "1": N = 1*1 + 1*1 + ... + 1*1. This is of course the theoretical worst case. Can we think of anything better, given that this is a minimization problem? The better we can narrow down the worst case, the faster the search can go.
Here is one procedure whose result can be regarded as the worst case:
    - K = 0
    - x = floor(sqrt(N)), N' = N - x*x and K increments to 1
    - If N' < 4, K = K + N'
        * If N' is equal to 3, 2, 1 or 0, then its own contribution is itself
        * 3 = 1+1+1; 2 = 1+1; 1 = 1; 0 - N is a square number
        * return K (we call this value K')
    - Else set N = N' and repeat the last 2 steps
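
A minimal non-recursive sketch of this procedure (it computes the same bound as the FindTheSearchDepth_DP function in the implementation below):
#include <cmath>
#include <cstddef>

size_t GreedyUpperBound(size_t n)
{
    size_t k = 0;
    while (n > 3) {
        // Take out the largest square not exceeding n and repeat on the remainder.
        const size_t root = static_cast<size_t>(std::sqrt(static_cast<double>(n)));
        n -= root * root;
        ++k;
    }
    return k + n; // 3 = 1+1+1, 2 = 1+1, 1 = 1, 0 means the last value was itself a square
}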

Now let's think how bad this K' can be. Every time we split N into two parts and repeat on one of them until it is less than or equal to 3. Does this process ring a bell? To me it is reminiscent of binary search, which keeps dividing into two halves until fewer than 2 remain. But actually we are doing better than binary search's O(log2(N)), because each time we pick up the smaller half.
Assume N is not a square number; then there must exist an integer i that meets i*i < N < (i+1)*(i+1). Each time we pick up (N - i*i), which is less than (i+1)*(i+1) - i*i, because
    - N - i*i < (i+1)*(i+1) - i*i = 2*i + 1 ==> N - i*i < 2*i + 1
    - In the procedure above we always pick N - i*i as the half on which we repeat the steps
    - Let's compare the two halves: one is i*i and the other is N - i*i < 2*i + 1
    - Compare the first half i*i with the even bigger value 2*i + 1 (> N - i*i)
    - When is i*i less than or equal to 2*i + 1? ==> i*i <= 2*i + 1
    - i*i - 2*i + 1 <= 1 + 1 ==> (i-1)^2 <= 2
    - In the positive range only when i <= 1 + sqrt(2) < 3, i.e. i < 3, do we have i*i <= 2*i + 1
    - So as long as i >= 3, i.e. N > 9, i*i is more than 2*i + 1
    - Therefore as long as N > 9, i*i is always larger than N - i*i
    - It means that the smaller half is always picked for the next round
    - Now we can conclude that K' is no worse than ~log2(N)

Here we have successfully narrowed down the worst case from N to K'. What is this telling us, and how does it help the computation complexity? It means we have narrowed the search depth from [1, N] down to [1, K']: any search of K that goes beyond (K' - 1) can be ignored. Besides, we can regard this as a recursive process: as soon as a better K (< K') is found, we update K' with the best value found so far and stop any search that goes deeper than (K' - 1).

Then what does this K' mean in the search space? It bounds the search depth (I assume everyone is familiar with search breadth and search depth). For instance, let's illustrate the worst-case path for N,
    - K = 0
    - j = 1, N' = N - j*j = N - 1, K increments to 1, search depth = 1
    - j = 1 again, N' = N - 1 - j*j = N - 2, K increments to 2, search depth = 2
    ......
    - j = 1 again, until N' = N - (K' - 1), K has incremented to K' - 1 and search depth = K' - 1
    - Now we can stop this search because
        * If N' is a square number, the total would be exactly K', so K' is not improved
        * If N' is not a square number, no matter what happens K will be no better than K'

Of course here we considered the worst-case path; the other search paths are similar. K is incremented as the search depth increments. Stop searching when the depth reaches K' - 1, and update K' whenever a leaf node is reached at a smaller depth. At the same time the search breadth can be optimized as well. When considering j within [1, i], do we really need to consider j = 1? Not really. The search over the first half, j within [1, i/2), just repeats the search paths of j within [i/2, i].

3. Complexity
Given a number N, it takes constant time if it is a square number. This is the best case: O(1) computation with a search depth of 1. Now consider a non-square number. There exists an integer i that meets i*i < N < (i+1)*(i+1), where
i is equal to floor(sqrt(N)).

Based on the above analysis,
    - depth = 0; j within [i/2, i]; the worst case of
        N' = N - (i/2)*(i/2)
           < (i+1)*(i+1) - (i*i)/4
           = (3/4)*(i*i) + 2*i + 1
           = (3/4)*(i*i + 2*i + 1) + i/2 + 1/4
           = (3/4)*(i+1)*(i+1) + i/2 + 1/4
      Therefore the worst case of N' meets (3/4)*(i*i) < N' < (3/4)*(i+1)*(i+1) + i/2 + 1/4;
      the worst case N' is a constant fraction of N.
    - depth = 1: the worst case N' is again a constant fraction of N, and the search breadth is still ~N^(1/2)
    - The search depth is limited to K', therefore the computation complexity is O(N^((K'-1)/2)),
      which should theoretically be better than O(N^(log2(N)/2)).
In the case of the lemma, K' = 4, so the computation complexity is O(N^(3/2)) and the space complexity is O(1).


4. C++ Implementation
// ********************************************************************************
// IMPLEMENTATION
// ********************************************************************************
// header file
#pragma once

#include <cstddef> // for size_t

class MinNumOfSquareOp
{
public:
    MinNumOfSquareOp();
    ~MinNumOfSquareOp();

    size_t operator()(const size_t x) const;

private:
    size_t FindTheSearchDepth_DP(const size_t x) const;
    void FindSolutionNotWorseThanK_DP(const size_t x, size_t depth, size_t& k) const;
};

// cpp file
#include "MinNumOfSquareOp.h"

#include <cmath>

MinNumOfSquareOp::MinNumOfSquareOp()
{
}


MinNumOfSquareOp::~MinNumOfSquareOp()
{
}

size_t MinNumOfSquareOp::FindTheSearchDepth_DP(size_t x) const
{
    if (x <= 3) {
        return x;
    }

    // floor(sqrt(x)): the largest integer whose square does not exceed x
    const size_t root = static_cast<size_t>(std::sqrt(static_cast<double>(x)));
    return 1 + FindTheSearchDepth_DP(x - root*root);
}

void MinNumOfSquareOp::FindSolutionNotWorseThanK_DP(const size_t x,
                                                    size_t depth,
                                                    size_t& k) const
{
    // Prune: this path cannot beat the best solution found so far (k).
    if (depth >= k) {
        return;
    }

    // Leaf: x <= 3 needs exactly x ones (3 = 1+1+1, 2 = 1+1, 1 = 1, 0 means a square was hit).
    if (x <= 3) {
        if ((depth + x) < k) {
            k = depth + x;
        }
        return;
    }

    // Search breadth: only roots in [floor(sqrt(x))/2, floor(sqrt(x))] need to be tried.
    const size_t root = static_cast<size_t>(std::sqrt(static_cast<double>(x)));
    const size_t halfOfRoot = root >> 1;
    for (size_t val = root; val >= halfOfRoot; --val) {
        FindSolutionNotWorseThanK_DP(x - val*val, depth + 1, k);
        if (k == 2) {
            // A result of 2 cannot be improved by smaller roots (1 is only possible
            // when x itself is a square, which is tried first), so stop early.
            return;
        }
    }
}

size_t MinNumOfSquareOp::operator()(size_t x) const
{
    // Lagrange's four-square theorem (the lemma in the question): K' = 4 always suffices.
    // Alternatively, the greedy upper bound could be used: size_t k = FindTheSearchDepth_DP(x);
    size_t k = 4;
    if (k <= 2) {
        return k;
    }

    FindSolutionNotWorseThanK_DP(x, 0, k);
    return k;
}

// ********************************************************************************
// TEST
// ********************************************************************************
#include "MinNumOfSquareOp.h"

#include <cassert>
void TestCases()
{
    {
        MinNumOfSquareOp mns;
        assert(mns(0) == 0);
        assert(mns(1) == 1);
        assert(mns(2) == 2);
        assert(mns(3) == 3);
        assert(mns(4) == 1);
        assert(mns(5) == 2);
        assert(mns(6) == 3);
        assert(mns(7) == 4);
        assert(mns(8) == 2);
        assert(mns(9) == 1);
        assert(mns(10) == 2);
        assert(mns(11) == 3);
        assert(mns(12) == 3);
        assert(mns(13) == 2);
        assert(mns(14) == 3);
        assert(mns(15) == 4);
        assert(mns(16) == 1);
        assert(mns(17) == 2);
        assert(mns(18) == 2);
        assert(mns(19) == 3);
        assert(mns(20) == 2);
        assert(mns(21) == 3);
        assert(mns(22) == 3);
        assert(mns(23) == 4);
        assert(mns(24) == 3);
        assert(mns(25) == 1);
        assert(mns(103) == 4);
    }
}
// ********************************************************************************

Monday 2 February 2015

Dynamic Programming - List All Sub-sets with Sum Equal to Zero II (Amazon Interview Question)

1. Motivation
As I discussed in my last blog, Dynamic Programming - List All Sub-sets with Sum Equal to Zero (Amazon Interview Question), the solution has exponential computation complexity and linear space complexity in the size of the set. The space complexity is absolutely fine, but the computation complexity is still not acceptable: once the size of the set goes over 30, there is simply too much to compute.

This blog examines whether there are any heuristic rules to guide the search of the solution space. For instance, there is no need to go through all the exponential combinations if all the elements of the set are either negative or positive.

2. Problem Analysis
Let's say the given set is S = {X1, X2, ..., XM}. The problem is to find all the sub-sets of S whose elements sum to zero.
Problem statement:
    lambda1*X1 + lambda2*X2 + ... + lambda(M)*XM = 0, where
        - X1, X2, ..., XM are known from the given set S
        - lambda1, lambda2, ..., lambda(M) are unknown; each has two options, either 0 or 1
           lambda(i) = 0 or 1, i within [1, M]

Because nothing is known about the elements of the given set, the solution in Dynamic Programming - List All Sub-sets with Sum Equal to Zero (Amazon Interview Question) can't predict whether the sum of the sub-set built so far will go higher or lower when the next element is taken in. (The next element could be negative, zero or positive, so taking it into the sub-set could make the sum smaller, unchanged or larger.) Therefore we have to go through all the remaining possible combinations of sub-sets to find out whether a solution sub-set exists.

Here I would like to preprocess the given set so that we can predict which direction the sum of the sub-set will go when the next element is taken in; this could be potentially valuable to guide the search. What I am going to do is sort the given set and split the ordered set into 3 portions,
    - Set Sn, with all negative elements, sorted: {X1, X2, ..., Xn}. The size of this set is N.
    - Set Sz, with all zero elements: {Z1, Z2, ..., Zo}. The size of this set is O,
      and Zi = 0, i within [1, O]
    - Set Sp, with all positive elements, sorted: {Y1, Y2, ..., Yp}. The size of this set is P.
    - N, O and P are within [0, M] and have to meet N + O + P = M

Now let's take another look at the problem statement. For elements in Sz we just don't care about them: they can either appear or not appear in a sub-set, and their appearance does not affect the sum. So let's focus on the non-zero elements, namely the negative and positive sets Sn and Sp. In order for the sum of a sub-set's elements to equal zero, the sub-set must contain both negative and positive elements. Now we can restate the problem.
Revised problem statement:
    lambda1*X1 + lambda2*X2 + ... + lambda(n)*Xn = theta1*Y1 + theta2*Y2 + ... + theta(p)*Yp
        - where lambda(i) = 0 or -1 and Sigma(lambda(i)) != 0, with i within [1, N]
        - where theta(j) = 0 or 1 and Sigma(theta(j)) != 0, with j within [1, P]

What we do now is take a sub-set of Sn (i.e. a certain combination of the lambdas) with sum SUM, and then try to find the sub-sets of Sp whose sum equals the absolute value of SUM. Bear in mind that Sn and Sp are both sorted. While searching Sp for a given SUM, if the sub-set of Sp we are currently holding already has a sum greater than |SUM|, we can stop searching the remaining combinations along that path, because Sp holds sorted positive elements and taking any extra element would only make the sum larger. Rules like these could be valuable for finding the solutions. Here is the list of rules that I can think of,
Heuristic rules:
    - If N or P is equal to zero, there is no need to do the search.
    - If O is more than 0, the zero elements simply multiply the number of sub-sets
        * If no sub-set of non-zero elements is found, the output will be the sub-sets
           made up of only zero elements. The total will be 2^O - 1 sub-sets.
        * If solutions made of Sn and Sp are found and the number of such solutions is NUM,
           then any selection of zero elements can be added to each of them as well. The total
           will be NUM*(2^O) sub-sets.
    - Stop searching as soon as the current sum of the sub-set of Sp exceeds the absolute value of SUM.

3. Solution Complexity
The heuristic rules will certainly help speed up the search, but they do not help the worst case. The worst case will be
    - O = 0
    - Both N and P are > 0 and < M
    - N + P = M
Therefore the total number of non-empty sub-sets of Sn will be (2^N - 1) and the total number of non-empty sub-sets of Sp will be (2^P - 1). All possible combinations amount to
    - (2^N - 1)*(2^P - 1) = 2^(N+P) - 2^N - 2^P + 1 = 2^M - 2^N - 2^P + 1
which is better than (2^M - 1) but not enough to swing the big-O notation: it is still exponential in the size of set S. And the space complexity stays unchanged - linear in the size of set S.
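
To put numbers on it (assuming, just for illustration, M = 30 with N = P = 15): 2^M - 1 is 1,073,741,823 combinations, while 2^M - 2^N - 2^P + 1 is 1,073,741,824 - 32,768 - 32,768 + 1 = 1,073,676,289, a saving of only 65,534 combinations - which is why the big-O stays exponential.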

4. C++ Implementation
- The implementation only prints out the sub-sets of non-zero elements.
- Any selection of zero elements can be added to each of these non-zero sub-sets to form further solutions.
- The count returned includes the sub-sets with and without zero elements.

// ********************************************************************************
// IMPLEMENTATION
// ********************************************************************************
// Header Set.h
#pragma once

#include <cstddef> // for size_t
#include <vector>

class Set
{
public:
    using SET = std::vector<int>;
    using INDICE = std::vector<size_t>;

    Set(const SET& input);
    ~Set();

    size_t PrintSubSetOfSummOfAllElementsEqualZero_Heuristic();
private:
    void SplitInputData();
    size_t PrintSubSetOfSummOfAllElementsEqualZero_Heuristic_DP(
               const size_t curNegativeIndex, bool alreadyCounted);
    size_t FindSubsetWithPostitiveElements_Heuristic();
    size_t FindSubsetWithPostitiveElements_Heuristic_DP(const size_t curPositiveIndex);

    void PrintSubset();

    const SET& mInputRef;
    SET mSortedZeroElements;
    SET mSortedNegativeElements;
    SET mSortedPositiveElements;

    int mSumOfAllPositiveElements;
    size_t SIZE_OF_POSITIVE;
    size_t SIZE_OF_NEGATIVE;

    INDICE mNegativeSubsetIndices;
    int mNegativeSum;
    size_t mSizeOfNegativeSubsetIndices;
    INDICE mPositiveSubsetIndices;
    int mPositiveSum;
    size_t mSizeOfPositiveSubsetIndices;
};

// Cpp file Set.cpp
#include "Set.h"

#include <algorithm>
#include <iostream>
#include <numeric>

#include <cstdlib>

Set::Set(const std::vector<int>& input)
: mInputRef(input)
{
    SplitInputData();
}

Set::~Set()
{
}

size_t Set::PrintSubSetOfSummOfAllElementsEqualZero_Heuristic()
{
    SIZE_OF_NEGATIVE = mSortedNegativeElements.size();
    size_t count = 0;

    if (SIZE_OF_NEGATIVE > 0) {
        SIZE_OF_POSITIVE = mSortedPositiveElements.size();
        if (SIZE_OF_POSITIVE > 0) {
            const int sumOfNegative = std::accumulate(mSortedNegativeElements.begin(),
                        mSortedNegativeElements.end(), 0);
            // Search only if all the negatives together can at least cover the smallest positive element.
            if (abs(sumOfNegative) >= *mSortedPositiveElements.begin())
            {
                mSumOfAllPositiveElements = std::accumulate(mSortedPositiveElements.begin(),
                        mSortedPositiveElements.end(), 0);
                // Search only if all the positives together can at least cover the smallest |negative| element.
                if (mSumOfAllPositiveElements >= abs(*mSortedNegativeElements.rbegin()))
                {
                    mNegativeSubsetIndices.assign(SIZE_OF_NEGATIVE, 0);
                    mPositiveSubsetIndices.assign(SIZE_OF_POSITIVE, 0);

                    for (size_t index = 0; index < SIZE_OF_NEGATIVE; ++index) {
                        mNegativeSubsetIndices[0] = index;
                        mNegativeSum = mSortedNegativeElements[index];
                        mSizeOfNegativeSubsetIndices = 1;
                        count += PrintSubSetOfSummOfAllElementsEqualZero_Heuristic_DP(
                                                    index + 1, false);
                    }
                }
            }
        }
    }
    const size_t SIZE_OF_ZERO = mSortedZeroElements.size();
    if (count > 0) {
        // Every non-zero solution can be extended by any selection of the zero elements: count * 2^O.
        return count << SIZE_OF_ZERO;
    }

    if (SIZE_OF_ZERO > 0) {
        // No non-zero solution: every non-empty selection of the zero elements sums to zero: 2^O - 1.
        return (1 << SIZE_OF_ZERO) - 1;
    }

    return 0;
}

size_t Set::PrintSubSetOfSummOfAllElementsEqualZero_Heuristic_DP(
                     const size_t curNegativeIndex, bool alreadyCounted)
{
    size_t count = 0;
    if (!alreadyCounted) {
        count += FindSubsetWithPostitiveElements_Heuristic();
        alreadyCounted = true;
    }

    if (curNegativeIndex < SIZE_OF_NEGATIVE) {
        count += PrintSubSetOfSummOfAllElementsEqualZero_Heuristic_DP(
                        curNegativeIndex + 1, alreadyCounted);

        mNegativeSubsetIndices[mSizeOfNegativeSubsetIndices] = curNegativeIndex;
        mNegativeSum += mSortedNegativeElements[curNegativeIndex];
        ++mSizeOfNegativeSubsetIndices;
        count += PrintSubSetOfSummOfAllElementsEqualZero_Heuristic_DP(
                        curNegativeIndex + 1, false);
        mNegativeSum -= mSortedNegativeElements[curNegativeIndex];
        --mSizeOfNegativeSubsetIndices;
    }
    return count;
}

size_t Set::FindSubsetWithPostitiveElements_Heuristic()
{
    size_t count = 0;
    for (size_t index = 0; index < SIZE_OF_POSITIVE; ++index) {
        mPositiveSubsetIndices[0] = index;
        mPositiveSum = mSortedPositiveElements[index];
        mSizeOfPositiveSubsetIndices = 1;
        count += FindSubsetWithPostitiveElements_Heuristic_DP(index + 1);
    }

    return count;
}

size_t Set::FindSubsetWithPostitiveElements_Heuristic_DP(const size_t curPositiveIndex)
{
    if ((mNegativeSum + mSumOfAllPositiveElements) < 0) {
        return 0;
    }
   
    int result = mNegativeSum + mPositiveSum;
    if (result == 0) {
        PrintSubset();
        return 1;
    }

    if (result > 0) {
        return 0;
    }

    size_t count = 0;
    if (curPositiveIndex < SIZE_OF_POSITIVE) {
        count += FindSubsetWithPostitiveElements_Heuristic_DP(curPositiveIndex + 1);

        mPositiveSubsetIndices[mSizeOfPositiveSubsetIndices] = curPositiveIndex;
        mPositiveSum += mSortedPositiveElements[curPositiveIndex];
        ++mSizeOfPositiveSubsetIndices;
        count += FindSubsetWithPostitiveElements_Heuristic_DP(curPositiveIndex + 1);
        mPositiveSum -= mSortedPositiveElements[curPositiveIndex];
        --mSizeOfPositiveSubsetIndices;
    }

    return count;
}

void Set::SplitInputData()
{
    SET inputCopy(mInputRef);
    const auto SIZE_TO_RESERVE = inputCopy.size();
    mSortedNegativeElements.reserve(SIZE_TO_RESERVE);
    mSortedZeroElements.reserve(SIZE_TO_RESERVE);
    mSortedPositiveElements.reserve(SIZE_TO_RESERVE);

    std::sort(inputCopy.begin(), inputCopy.end());
    auto firstZeroIt = std::lower_bound(inputCopy.begin(), inputCopy.end(), 0);
    auto lastZeroIt = std::upper_bound(inputCopy.begin(), inputCopy.end(), 0);
    mSortedNegativeElements.assign(inputCopy.begin(), firstZeroIt);
    mSortedZeroElements.assign(firstZeroIt, lastZeroIt);
    mSortedPositiveElements.assign(lastZeroIt, inputCopy.end());
}

void Set::PrintSubset()
{
    std::cout << "{ ";
    for (size_t idx = 0; idx < mSizeOfNegativeSubsetIndices; ++idx) {
        std::cout << mSortedNegativeElements[mNegativeSubsetIndices[idx]] << ", ";
    }

    for (size_t idx = 0; idx < mSizeOfPositiveSubsetIndices; ++idx) {
        if ((idx + 1) != mSizeOfPositiveSubsetIndices) {
            std::cout << mSortedPositiveElements[mPositiveSubsetIndices[idx]] << ", ";
        }
        else {
            std::cout << mSortedPositiveElements[mPositiveSubsetIndices[idx]];
        }
    }
    std::cout << " }" << std::endl;
}

// ********************************************************************************
// TEST
// ********************************************************************************
#include "Set.h"
void TestPrintSubSetOfSummOfAllElementsEqualZero_Heuristic()
{
    {
        const SET testSet = { 1, 3, 5, 7};
        assert(Set(testSet).PrintSubSetOfSummOfAllElementsEqualZero_Heuristic() == 0);
    }

    {
        const SET testSet = { -1, -3, -5, -7 };
        assert(Set(testSet).PrintSubSetOfSummOfAllElementsEqualZero_Heuristic() == 0);
    }

    {
        const SET testSet = { -7, -8, -9, -10, 1, 2, 3 };
        assert(Set(testSet).PrintSubSetOfSummOfAllElementsEqualZero_Heuristic() == 0);
    }

    {
        const SET testSet = { 7, 8, 9, 10, -1, -2, -3 };
        assert(Set(testSet).PrintSubSetOfSummOfAllElementsEqualZero_Heuristic() == 0);
    }

    {
        const SET testSet = { 0, 0, 0, 0};
        assert(Set(testSet).PrintSubSetOfSummOfAllElementsEqualZero_Heuristic() == 15);
    }

    {
        const SET testSet = { 5, 3, 1, 8, -8, -4 };
        assert(Set(testSet).PrintSubSetOfSummOfAllElementsEqualZero_Heuristic() == 4);
    }

    {
        const SET testSet = { 0, 0, 0, 0 };
        assert(Set(testSet).PrintSubSetOfSummOfAllElementsEqualZero_Heuristic() == 15);
    }

    {
        const SET testSet = { -1, 1 };
        assert(Set(testSet).PrintSubSetOfSummOfAllElementsEqualZero_Heuristic() == 1);
    }

    {
        const SET testSet = { -1, 1, -3, 3 };
        assert(Set(testSet).PrintSubSetOfSummOfAllElementsEqualZero_Heuristic() == 3);
    }

    {
        const SET testSet = { -1, 1, -3, 3, -4, 4 };
        assert(Set(testSet).PrintSubSetOfSummOfAllElementsEqualZero_Heuristic() == 9);
    }

    {
        const SET testSet = { -1, 1, -2, 2, -3, 3, -4, 4 };
        assert(Set(testSet).PrintSubSetOfSummOfAllElementsEqualZero_Heuristic() == 25);
    }

    {
        const SET testSet = { -1, 1, -2, 2, -3, 3, -4, 4, -5, 5 };
        assert(Set(testSet).PrintSubSetOfSummOfAllElementsEqualZero_Heuristic() == 75);
    }
}
// ********************************************************************************