Monday, 16 June 2014

C++11 Features - Automatic Type Deduction

Automatic type deduction is a new feature in C++11. The compiler deduces the type of a variable from its initializer. The keyword is auto, which earlier standards used as a storage class specifier; C++11 removes that old meaning and gives the keyword this new one.

1. Problems in C++03
Before C++11, C++ required every type to be spelled out. Every variable or object had to be declared with exactly the right type before it was used. Types are checked at compile time, and for polymorphic objects the dynamic type also matters at run time. Any mismatch is either flagged by the compiler or punished at run time (polymorphism). This strict requirement often causes trouble and inconvenience in C++03.

Too much typing on the return type of STL containers
This is nothing new to anyone, especially for the member functions of STL containers that return an iterator type. Such nested, trait-like types have to be spelled out in full from the explicit instantiation of the container.

// Example 1: some functions of STL containers
//********************************************************************************
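std::vector<std::vector<std::string> > myDoubleVec; // C++03 needs the space in "> >"
std::vector<std::vector<std::string> >::const_iterator vecBegin = myDoubleVec.begin();

class UserID;   // assumed to be defined elsewhere (with operator<)
class UserInfo; // assumed to be defined elsewhere
UserID userId;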
std::map<UserID, UserInfo> myDB;
// ......
std::map<UserID, UserInfo>::iterator pos = myDB.find(userId);
//********************************************************************************

Unfortunately, in C++03 this typing is needed every time you access an element of an STL container. It should not be necessary to write out a long type by hand, because the compiler can easily deduce it from the template instantiation. Of course, some container instantiations can be given a typedef to save typing, but that only pays off for those with file or class scope. It is not a good idea to add a typedef for every instantiation, especially for those needed only to declare a local (automatic) variable.
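For comparison, here is a minimal sketch of how the declarations from Example 1 shrink once auto is available. UserID and UserInfo are hypothetical stand-ins (plain typedefs) so that the snippet compiles on its own, and FindUser and CountStrings are helper names I made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for the UserID/UserInfo classes of Example 1.
typedef int UserID;
typedef std::string UserInfo;

// Look up a user; returns an empty UserInfo if the ID is unknown.
UserInfo FindUser(const std::map<UserID, UserInfo>& db, UserID id) {
    // C++03: std::map<UserID, UserInfo>::const_iterator pos = db.find(id);
    auto pos = db.find(id);  // same type, deduced by the compiler
    return pos != db.end() ? pos->second : UserInfo();
}

// Count all strings stored in a vector of vectors.
std::size_t CountStrings(const std::vector<std::vector<std::string>>& rows) {
    std::size_t total = 0;
    // C++03: std::vector<std::vector<std::string> >::const_iterator row = ...
    for (auto row = rows.begin(); row != rows.end(); ++row) {
        total += row->size();
    }
    return total;
}
```

The deduced types are exactly the ones that had to be typed by hand before, so nothing changes at run time.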

Used in generic/template programming
The problems here are the difficulty of naming the types of intermediate variables and of return values. I will not spend more time on this, as Bjarne Stroustrup provides two really good examples in his C++11 FAQ website. I think this is the key motivation behind the introduction of this feature in C++11.
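To make the problem concrete, here is a small sketch of my own (it is in the spirit of the FAQ examples, not a copy of them). The function name Multiply is hypothetical. The type of a * b depends on both template arguments, so in C++03 there is no simple way to name it.

```cpp
#include <cassert>
#include <type_traits>

// The result type of (a * b) depends on both T and U.
// decltype names it for the return type, auto for the intermediate variable.
template <class T, class U>
auto Multiply(const T& a, const U& b) -> decltype(a * b) {
    auto product = a * b;  // intermediate type deduced from the expression
    return product;
}

// int * double yields double, int * int yields int.
static_assert(std::is_same<decltype(Multiply(2, 1.5)), double>::value,
              "int * double should deduce double");
static_assert(std::is_same<decltype(Multiply(3, 4)), int>::value,
              "int * int should deduce int");
```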

2. Tips of using auto
C++11 introduces auto to fix the excessive typing issue. auto can be used with variables, function return types, STL containers, std::initializer_list, lambda expressions and so on. It also works together with another new feature, decltype; their combination fixes the return type issue in template programming. For more on that issue, see the suffix return type syntax.
Here are a few tips for using auto in daily work.

Reference type
Object types and pointer types work fine. Reference types are a bit special: auto on its own drops the reference, so you have to write "auto &" to declare a reference.

// Example 2: 
//********************************************************************************
auto x = 1; // OK
const auto &y = 2;  // OK - const reference binds to a temporary (pure rvalue) whose lifetime is extended
auto &z = 2; // Error - non-const lvalue reference cannot bind to an rvalue
auto *a = &x; // OK

int Foo();
int& Bar();
int* Cat();
auto b = Foo(); // OK - auto as int
const auto &c = Foo(); // OK - const reference binds to the temporary return value
auto &d = Foo(); // Error - non-const lvalue reference cannot bind to an rvalue
auto e = Bar(); // OK - auto as int
auto& f = Bar(); // OK - auto as int&
const auto& g = Bar(); // OK - const int&
auto h = Cat(); // OK - auto as int*
auto *i = Cat(); // OK - auto as int*
auto *j = Bar(); // Error - Bar() returns a reference, not a pointer
//********************************************************************************

Also keep in mind that only a const lvalue reference can bind to a temporary (a pure rvalue). A non-const lvalue reference is forbidden from binding to a temporary, and also from binding to a const object. These are the "auto &" lines marked "Error" in Example 2.
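The deduction rules from Example 2 can be checked by the compiler itself. The sketch below assumes trivial bodies for Foo(), Bar() and Cat() and a global gValue, none of which are part of the original example. The static_assert lines fail to compile if a deduced type is not what the comment claims.

```cpp
#include <cassert>
#include <type_traits>

int gValue = 42;                 // hypothetical global behind Bar() and Cat()
int  Foo() { return gValue; }    // returns by value
int& Bar() { return gValue; }    // returns by reference
int* Cat() { return &gValue; }   // returns a pointer

// Returns true if writing to a plain auto copy leaves gValue alone
// while writing through an auto& alias changes it.
bool ReferenceAliasesOriginal() {
    gValue = 42;
    auto  e = Bar();         // int: the reference is dropped, e is a copy
    auto& f = Bar();         // int&: f aliases gValue
    const auto& c = Foo();   // const int&: bound to the temporary
    auto  h = Cat();         // int*
    static_assert(std::is_same<decltype(e), int>::value, "auto drops the reference");
    static_assert(std::is_same<decltype(f), int&>::value, "auto& keeps the reference");
    static_assert(std::is_same<decltype(c), const int&>::value, "const auto& is const int&");
    static_assert(std::is_same<decltype(h), int*>::value, "auto deduces the pointer type");

    e = 1;                                         // changes only the copy
    bool copyIsIndependent = (gValue == 42) && (c == 42) && (*h == 42);
    f = 7;                                         // changes gValue itself
    return copyIsIndependent && gValue == 7;
}
```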

Object slicing in automatic type deduction
Object slicing happens when an object of a derived class, or a reference to one, is used to create an object of the base class. It was discussed briefly in this blog Catching exception. Passing a pointer or reference as an argument in a function call prevents object slicing. But when a function returns a reference type, whether object slicing happens depends on the type of the variable declared to receive the return value.

// Example 3: object slicing
//********************************************************************************
class Base {
public:
    virtual const std::string GetName() const {
        return "Base";
    }
};

class Derived : public Base {
public:
    virtual const std::string GetName() const {
        return "Derived";
    }
};

static Derived gDerived;

Base GetBase() {
    return gDerived;
}

Base& GetBaseRef() {
    return gDerived;
}

void Test() {
    auto b = GetBase(); // object slicing - auto as Base
    auto b2 = GetBaseRef(); // object slicing - auto as Base
    auto &bRef = GetBaseRef(); // polymorphic object: Derived - reference type
}
//********************************************************************************

In Test() of Example 3 it is no surprise that object slicing happens with "auto b" but not with "auto &bRef". That "auto b2" is a sliced object may be a bit of a surprise. It should not be, given the previous sub-section and Example 2: plain auto drops the reference, so b2 is a Base object copied from the Derived object.
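The slicing can be observed at run time by calling GetName() on each variable. This is a condensed sketch of Example 3 with two helper functions, NameViaCopy and NameViaReference, that I added for illustration.

```cpp
#include <cassert>
#include <string>

class Base {
public:
    virtual ~Base() {}
    virtual std::string GetName() const { return "Base"; }
};

class Derived : public Base {
public:
    virtual std::string GetName() const { return "Derived"; }
};

static Derived gDerived;
Base& GetBaseRef() { return gDerived; }

// auto deduces Base: the Derived part is sliced away by the copy.
std::string NameViaCopy() {
    auto b2 = GetBaseRef();
    return b2.GetName();
}

// auto& deduces Base&: the reference still refers to gDerived.
std::string NameViaReference() {
    auto& bRef = GetBaseRef();
    return bRef.GetName();
}
```

NameViaCopy() returns "Base" and NameViaReference() returns "Derived".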

The deduced variable types must be the same in multi-variable declaration
If multiple variables are declared in one statement, the type deduced for auto from each initializer must be the same. Otherwise the compiler reports an error.

// Example 4:
//********************************************************************************
auto a = 1, b = 2; // OK
auto x = 1, y = 2.0; // Error in VC12:
                     // error C3538: in a declarator-list 'auto' must always deduce to the same type
auto m = a, &n = a, *p = &b; // OK
//********************************************************************************

Keep in mind that an auto declaration is fine if it stays a legal declaration after auto is replaced with one concrete type that matches every initializer exactly. In the first statement of Example 4 auto can be replaced by "int". In the last statement auto is also "int" for every declarator, giving an int, an int& and an int*. It is also worth pointing out that automatic type deduction checks more strictly. For instance "int x = 1, y = 2.0;" is not an error, although the compiler may warn about the narrowing conversion. However "auto x = 1, y = 2.0;" is an error.
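Here is a short sketch of the last statement of Example 4 in action. The function name SumViaAliases is hypothetical. auto is int for all three declarators, and the & and * belong to the individual declarators.

```cpp
#include <cassert>
#include <type_traits>

int SumViaAliases() {
    auto a = 1, b = 2;            // both initializers deduce int
    auto m = a, &n = a, *p = &b;  // auto is int each time: int, int&, int*
    static_assert(std::is_same<decltype(m), int>::value, "m is int");
    static_assert(std::is_same<decltype(n), int&>::value, "n is int&");
    static_assert(std::is_same<decltype(p), int*>::value, "p is int*");

    n = 10;   // writes through the reference: a becomes 10, m stays 1
    *p = 20;  // writes through the pointer: b becomes 20

    // auto x = 1, y = 2.0;  // would not compile: int versus double
    return m + a + b;        // 1 + 10 + 20
}
```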

Initialization
As shown in C++11 - Initializer-list (Part I) and C++11 - Initializer-list (Part II), both a {}-list and "()" can be used to initialize objects. auto works with both. But keep in mind that auto may deduce a different type from a {}-list than from "()".

// Example 5: initialization
//************************************************
auto x1(1);   // Case 1
auto x2 = 1;  // Case 2
auto x3{1};   // Case 3
//************************************************

In Example 5:
    - auto as int in Case 1
    - auto as int in Case 2
    - auto as std::initializer_list<int> in Case 3
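So bear in mind that initialization via a {}-list can give you a different type and can select an initializer-list constructor. Note that Case 3 describes the original C++11 rule. Compilers that implement the later revision of this rule (N3922) deduce int for "auto x{1}" and keep std::initializer_list<int> only for the "auto x = {1}" form.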
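A small sketch that lets the compiler confirm the deduced types. I use the "= {...}" form for the braced case because, as far as I know, it deduces std::initializer_list<int> under both the original and the revised rule. SumOfInitializers is a made-up name.

```cpp
#include <cassert>
#include <initializer_list>
#include <type_traits>

int SumOfInitializers() {
    auto x1(1);           // int
    auto x2 = 1;          // int
    auto x3 = {1, 2, 3};  // std::initializer_list<int>
    static_assert(std::is_same<decltype(x1), int>::value, "() gives int");
    static_assert(std::is_same<decltype(x2), int>::value, "= gives int");
    static_assert(std::is_same<decltype(x3), std::initializer_list<int>>::value,
                  "= {...} gives std::initializer_list<int>");

    int total = x1 + x2;            // 2
    for (auto v : x3) total += v;   // plus 1 + 2 + 3
    return total;
}
```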

Used in range for loop
The for loop has been enhanced in C++11. The extended capability is called "range-based for", as shown in Example 6.

// Example 6
//********************************************************************************
std::vector<int> vec{1, 2, 3, 4};
for (auto x : vec) {
    // x is a copy of each element in turn
}
//********************************************************************************

Range-based for can be used with
    - C style array
    - C++11 std::initializer_list<T>
    - C++ STL containers
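The three kinds of range listed above can be sketched in one function (SumAll is a hypothetical name). It also shows the difference between "auto", which copies each element, and "auto &", which lets the loop modify the elements in place.

```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

int SumAll() {
    int total = 0;

    int raw[] = {1, 2, 3};                  // C style array
    for (auto v : raw) total += v;          // adds 6

    for (auto v : {10, 20}) total += v;     // std::initializer_list<int>, adds 30

    std::vector<int> vec{100, 200};         // STL container
    for (auto& v : vec) v += 1;             // auto& modifies the elements in place
    for (const auto& v : vec) total += v;   // adds 101 + 201

    return total;
}
```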

Return type of template functions
The combination of auto and decltype can be used to declare the return type of a template function. The compiler deduces the actual return type at compile time, when the template function or class is instantiated.

// Example 7: from C++11 standard
//********************************************************************************
struct A {
    char g();
    template<class T> auto f(T t) -> decltype(t + g()) {
        return t + g();
    }
};
template auto A::f(int t) -> decltype(t + g());
//********************************************************************************

In Example 7, if T is instantiated as int, then the return type will be decltype(int + char), which is int. Just be aware that decltype(x) is different from decltype((x)). For a variable x, the former gives its declared type, while the latter treats (x) as an lvalue expression and gives a reference type.
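Both points can be verified with static_assert. The sketch uses a free function template Add instead of the member function of Example 7, only to keep it short. Add and DecltypeForms are names I made up.

```cpp
#include <cassert>
#include <type_traits>

// Trailing return type: the result type follows from the operand types.
template <class T, class U>
auto Add(T t, U u) -> decltype(t + u) {
    return t + u;
}

// int + char is promoted to int.
static_assert(std::is_same<decltype(Add(1, 'a')), int>::value,
              "int + char yields int");

// Returns true after checking the two decltype forms at compile time.
bool DecltypeForms() {
    int n = 0;
    static_assert(std::is_same<decltype(n), int>::value,
                  "decltype(n) is the declared type");
    static_assert(std::is_same<decltype((n)), int&>::value,
                  "decltype((n)) is a reference type");
    (void)n;  // n is only needed for the compile-time checks
    return true;
}
```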

Used with lambda-expression
The lambda expression (lambda function) is another key feature introduced in C++11. I can certainly recall having to implement a class or struct every time I needed a predicate function or functor, no matter how trivial it was. When the predicate is only used locally and there is no good reason to put it at file scope, the extra typing and the decision about where to put it are real hassles. With the power of lambda expressions these issues are gone.

// Example 8: auto and lambda expression
// ********************************************************************************
void Foo() {
    // ......
    auto increment3 = [](int x) {return x + 3;};
    int x{0};
    int y = increment3(x) ; // y = 3
}
//********************************************************************************
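The predicate case mentioned above might look like the following sketch. CountAbove is a hypothetical helper; the lambda replaces the small functor class that C++03 would need, and auto is the only practical way to name its type.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Count the values strictly greater than threshold.
int CountAbove(const std::vector<int>& values, int threshold) {
    // Local predicate; threshold is captured by value.
    auto isAbove = [threshold](int v) { return v > threshold; };
    return static_cast<int>(
        std::count_if(values.begin(), values.end(), isAbove));
}
```

For example, CountAbove applied to {1, 5, 8, 3} with a threshold of 4 counts the two values 5 and 8.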

As the lambda expression is a key feature, I would like to discuss it in more detail. I will cover this topic in another blog post and add a link here.

3. Summary
How do you judge whether a declaration with auto is legal? Here is the rule of thumb:
    - Replace auto with one concrete type, for instance "int", that matches every initializer exactly. If the declaration is legal after the replacement, then the declaration with auto is legal too.

The Google C++ Style Guide describes the pros and cons of automatic type deduction and provides reasonable suggestions on when and where to use auto. In my opinion, I would avoid auto as much as possible and declare types explicitly, in order to tell my colleagues exactly what I am trying to do. Explicit declarations make the code easier to read and less ambiguous. I make exceptions for auto in only two scenarios:
    - Holding values returned by STL container functions (such as iterators), to save typing
    - Generic programming (templates), together with decltype

Bibliography:
[1] C++03 standard
[2] C++11 standard
[3] http://www.stroustrup.com/C++11FAQ.html
[4] http://en.wikipedia.org/wiki/C%2B%2B11
[5] [N3257=11-0027] Jonathan Wakely and Bjarne Stroustrup: Range-based for statements and ADL
[6] http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml
[7] [N2930=09-0120] Douglas Gregor and Beman Dawes: Range-based for loop wording
[8] [N1478=03-0061] Jaakko Jarvi, Bjarne Stroustrup, Douglas Gregor, and Jeremy Siek: Decltype and auto
