Monday 9 June 2014

C++ - Evaluation Order

1. Intruduction
Evaluation order is referring to the evaluation sequence of function arguments or operator arguments. It should not be confused with the left-to-right/right-to-left associativity of operators. And it should not be confused with operator precedence either.

Example 1: concepts
//**********************************************************
double Root(double);
double Square(double);
double Cubic(doubel);

void Foo() {
    double x = Root(2)*Square(4)*Cubic(5); // expression 1
    double y = Root(2)*Square(4)+Cubic(5); // expression 2
    // .......
}
//**********************************************************

Evaluation order: the evaluation sequence of functions in the expression - which function Root(2), Square(4) or Cubic(5) is evaluated first, second and last, as shown in expression 1 and 2 in Example 1.
Associativity of operators: multiply the left two first or the right two first in expression 1 in Example 1.
Operator precedence: do the multiplication first then addition in expression 2 in Example 1.
For more details about these concepts please refer to [1], [2] and [3].

An evaluation include two parts, value evaluation and side effect. [2] provides a very good definition about these two concepts. In a simple word value evaluation is referring to what the function/operator implementation should bring to you, for example returning a value after executing the function/operator. And side effect refers to whether the state of the program/thread is changed between before and after the function/operator is called. For instance, modify the global value, manipulate the IO, modify the pass-in arguments and so on.

2 Summary of rules
For all the rules please read the Chapter 5 in C++ standard [1]. Besides [2] also provides very good summary. Here I would like reiterate what I think is important or bug-prone in my daily job.
    - The evaluation order of function arguments is unspecified. The order of calling the function this time could be different from calling the function the next time.
    - The evaluation order of operator arguments is unspecified too. The same as function arguments.
    - Exception for logic operator "&&" and "||". The evaluation order is specified, in the left-to-right order.
    - Operator "?:", the first expression is evaluated and its side effect takes into force before the second and third.
    - Use initializer-list, {}-list, to initialize the object to guarantee the evaluation order, in the left-to-right order element by element.

3. Examples/Bugs in my previous projects
What is the fuss about evaluation order is the side effect. If the evaluation order is not defined and each evaluation brings side effect, then the overall value evaluation and side-effect could be different after the accumulation of value evaluation and side effect of the whole sequence of evaluations. Most of time the programmers will pay enough attention to side effect on global values and IOs when designing the software structure, because they are more noticeable in the software design process. Normally some guard and sync are put in place among them. However, in my experience, the bugs happen mostly when no enough attention paid to side effect on pass-in arguments.

Undefined evaluation order on function/operator calling
C++03 standard says that the evaluation order of arguments of functions/operators are not specified. It means that the vendor of compiler decides it. And C++11 has no changes on this clause.

// Example 2: evaluation order on argument of functions/operators
//********************************************************************************
int IncrementByTwo(int& x) {
    x += 2;
return x;
}

int IncrementByFour(int& x) {
    x += 4;
return x;
}

int IncrementByEight(int& x) {
x += 8;
return x;
}

int SumXYZ(int x, int y, int z) {
    std :: Couto << "x =" << x << std :: endl;
std::cout << "y = " << y << std::endl;
std::cout << "z = " << z << std::endl;
return x + y + z;
}

void Foo() {
    int eoInitVal = { 0 };
std::cout << IncrementByTwo(eoInitVal) << " "
                 << IncrementByFour(eoInitVal) << " "
                 << IncrementByEight(eoInitVal) << " "
                 << std::endl;
eoInitVal = { 0 };
SumXYZ(IncrementByTwo(eoInitVal),
                    IncrementByFour(eoInitVal),
                    IncrementByEight(eoInitVal));
}

// the output of Foo()
/*
14 12 8
x = 14
y = 12
z = 8
*/
//********************************************************************************

Problem happens when the arguments of function/operator are made up of the return value of other function/operator calls that have side effect on the same variables/objects/IOs. One of examples is the "<<"/">>" operator in std streams. As shown in Example 2 the output of std::cout in Foo() is not what we expect, "2, 6, 14". In VC12 (Microsoft Visual Studio Express 2013), the evaluation order of arguments of functions/operators is in the backwards order, from the last argument to the first.

// Example 3: fix of evaluation order on arguments of functions/operator
//********************************************************************************
void Foo() {
    int eoInitVal = { 0 };
    int two = IncrementByTwo(eoInitVal);
    int six = IncrementByFour(eoInitVal);
    int fourteen = IncrementByEight(eoInitVal);
std :: Couto << two << "" << six << "" << fourteen << "" << std :: endl;
eoInitVal = { 0 };
    two = IncrementByTwo(eoInitVal);
    six = IncrementByFour(eoInitVal);
    fourteen = IncrementByEight(eoInitVal);
SumXYZ(two, six, fourteen);
}
//********************************************************************************

Really there are not too many choices. One of them is to write separated expressions to guarantee the evaluation order to have the side effect that you prefer. (Keep in mind that separated expressions guarantee the evaluation order. The expression is guaranteed to be evaluated first and its side effect takes place before evaluating the expressions after itself. Simply the evaluation order is expression by expression.

Be aware of different code path in logical expression
C++ standard guarantee that the functions/operators in logical expression are evaluated in the left-to-right order and a certain optimization is done. For "&&" if the evaluation of the left side is false, then the compiler skips the evaluation of the right side, because it already has the result of the entire logic expression no matter what the right expression is. For "||" if the evaluation of the left side is true, then the compiler skips the evaluation of the right side.

This kind of optimization causes different code path and potentially buries hidden bugs, if the logical expression is made up of function/operator calls and these calls have side effects on the resources/states-of-program (global/automatic variables/object, IOs and etc.).

//Example 4: different code path in logical expression
//********************************************************************************
bool foo(int& x) {
    // modify x
}
bool bar(int& x) {
    // modify x
}
int Square(int x) {
    return x*x;
}

int x{1};
if (foo(x) && bar(x) {
    // ......
    Square(x);
}

// using "x"

if (foo(x) || bar(x)) {
    // ......
    Square(x);
}
// using "x"
//********************************************************************************

In Example 4 depending on the value evaluation of foo(x), there are the different code paths and potential hidden bugs in "&&" case,
    - If foo(x) true, then bar(x) is not called and the argument passed into Square(x) is only modified by f(x).
    - If foo(x) false and bar(x) is evaluated and returns true, then the argument passed into Square(x) is modified by both functions.
It is not difficult to find out the different code paths for the "||" case. If in the true/false branch of "if(...)" the value of "x" is used, then there might be potential bugs.

Wisely use logic operator to avoid performance penalty
As I introduced above, whether the right side of logical expression is evaluated or not depends on the result of the evaluation on the left side. Weighting on the expensiveness of the evaluation of functions in logical expression, carefully considering the evaluation order (the same as the order of functions/operators appearing in logical expressions in this case) prevents the performance penalty.This could play a big role in big data applications.

// Example 5: performance on logical expression
//********************************************************************************
// Let's consider both foo(int) and bar(int) has no side effect
// work like a functor
bool foo(int x); // Expensiveness 9
bool bar(int& x); // Expensiveness 1

for (......) { // big data
    if (foo(x) && bar(x)) {
        // .....
    }
}
//********************************************************************************

In Example 5 let's assume that
    - foo() has expensiveness of 9 to run
    - bar() has expensiveness of 1.to run
    - big data with 1 million (not big)
    - 50% of data evaluate as true on foo() and 50% false
    - 50% of data evaluate as true on bar() and 50% false
Let's examine how costly it is with different orders,
    - cost of "if (foo(x) && bar(x))": 1 million of evaluation of foo() + half million of evaluation of bar(), so total cost = 1 million * 9 + 0.5 million * 1 = 9.5 million
    - cost of "if (bar(x) && foo(x))": 1 million of evaluation of bar() + half million of evaluation of foo(), so total cost = 1 million * 1 + 0.5 million * 9 = 5.5 million

(The same calculation can be done in "||" case.)
See what the math is telling you. Take into account of how expensive the functions are and also getting to know what your business are (statistical knowledge on the result of evaluation on functions/operator). This sort of scenarios has happened to me in my previous projects. For instance given a large list of variables/equations find out the ASSIGNED variables, algebraic variables, differential variables with varying order and so on. There are often multiple conditions that need to verify. In a few cases correctly ordering which function goes first saves me from huge performance penalty. At the same time consider together if these functions called in logical expression are polymorphic functions. (See the performance penalty on polymorphic function, Design Pattern for Scientific Computing - Curiously recurring template pattern VS. dynamic polymorphism.)

Use initializer-list, {}-list, to initialize objects as often as possible
In C++11 initializer-list is introduced to initialize variables/objects. And this should be considered as good practice in C++ coding style. Please refer to my other two blogs for more details on C++11 - Initializer-list (Part I) and C++11 - Initializer-list (Part II).

Initializer-list guarantees that the evaluation order in {}-list is the same order of each element appearing in the {}-list (elements are separated by ","). And value evaluation and side effect take effect before the evaluations of its following elements. And this property holds recursively if all its sub-elements are initialized via {}-list.

//Example 6:
//********************************************************************************
struct Foo {
    Foo(int, int, int);
};

int eoInitVal{ 0 };
Foo f1{ IncrementByTwo(eoInitVal),
             IncrementByFour(eoInitVal),
             IncrementByEight(eoInitVal) }; // Foo f1{2, 6, 14} in VC12
Foo f2(IncrementByTwo(eoInitVal),
            IncrementByFour(eoInitVal),
            IncrementByEight(eoInitVal)); // Foo f2(14, 12, 8) in VC12
//********************************************************************************

"Foo f1" is initialized via {}-list. All its arguments are evaluated from the left-to-right sequence and the value evaluation and side effect take effect before coming to evaluate the next. And it is initialized as we expected by passing {2, 6, 14}. However "Foo f2" is created by creation constructor. It has the same evaluation order as a function call. So the evaluation order is unspecified. In VC12 (Microsoft Visual Studio Express 2013) "Foo f2" is created with the argument list of (14, 12, 8), because VC12 evaluates the function arguments in the backwards order.

Bibliography:
[1] C++11 Standard
[2] http://en.cppreference.com/w/cpp/language/eval_order
[3] http://en.cppreference.com/w/cpp/language/operator_precedence

No comments:

Post a Comment