Archive for June 2016

Range-based for loop vs std::transform

Some time ago, while doing some code review in work I came across the problem whether the loop in the code I was reviewing should be expressed with std::for_each, std::transform or range-based for loop. The initial code looked pretty much like this:

1
2
3
4
5
6
7
8
std::vector<Type> destination;
destination.reserve(source.size());
 
std::for_each(std::begin(source), std::end(source), 
              [&destination](auto const& p_value)
{
    destination.push_back(SomeStruct{p_value.a, p_value.b, p_value.c});
});

The first things that come to my mind when I see such code are “Why are you using std::for_each? Shouldn’t you use range-based for loop?” Or maybe there should be a std::transform?” Let’s consider those three things:

std::for_each

This is an algorithm from standard library that represents standard for loop that goes over a range defined by begin and end iterator and runs given functor or function. An example of usage is presented above. It utilizes lambda expression that invokes push_back on a std::vector to add elements.

range-based for

Given algorithm can be easily converted to this new form of loop (introduced in C++11).

1
2
3
4
5
6
7
std::vector<Type> destination;
destination.reserve(source.size());
 
for(auto const& p_value : source) 
{
    destination.push_back(SomeStruct{p_value.a, p_value.b, p_value.c});
};

This is the same algorithm but in much clearer and less noisy form.

std::transform

Another considered algorithm is std::transform that is supposed to be used (as the name says) to transform one structure to another. Let’s convert given code to utilize this algorithm.

1
2
3
4
5
6
7
8
std::vector<Type> destination;
destination.reserve(source.size());
 
std::transform(std::begin(source), std::end(source), std::back_inserter(destination),
              [](auto const& p_value)
{
    return SomeStruct{p_value.a, p_value.b, p_value.c};
});

It looks pretty much like std::for_each. The difference is in the name, the third argument and that the lambda returns new value instead of inserting in into the new container.

To be honest, when I have to choose between those versions I would choose std::transform because of its explicitness. This algorithm just says what is done. It gets one array and transforms it into another. std::for_each and range-based-for are just simple loops with no more special meaning than “run this function over this array”. From the last two if I had to choose one, then I would prefer range-based-for since its very clean form of loop over an array (no additional noise in form of lambda or std::begin/std::end).

But…

Since we are considering C++11 there’s one more thing to consider when inserting elements into std::vectoremplace_back. If we have a class with constructor and emplace_back then situation changes, because emplace_back allows us to create objects right inside the vector’s memory. In that case using std::transform doesn’t allow such operation because it requires copy (inside std::back_inserter).

Let’s take look at the code with all scenarios:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
#include <iostream>
#include <algorithm>
#include <vector>
 
struct TestStruct
{
    TestStruct(int a, std::string b)
        : m_a(a)
        , m_b(b)
    {
        std::cout << ">> Constructor: a=" << m_a << ", b=" << m_b << std::endl;
    }
 
    TestStruct(TestStruct const& src)
        : m_a(src.m_a)
        , m_b(src.m_b)
    {
        std::cout << ">> Copy Constructor: a=" << m_a << ", b=" << m_b << std::endl;
    }
 
    TestStruct(TestStruct && src)
        : m_a(src.m_a)
        , m_b(std::move(src.m_b))
    {
        std::cout << ">> Move Constructor: a=" << m_a << ", b=" << m_b << std::endl;
    }
 
    ~TestStruct()
    {
        std::cout << ">> Destructor: a=" << m_a << ", b=" << m_b << std::endl;
    }
 
private:
    int m_a;
    std::string m_b;
};
 
struct InputStruct
{
    int a;
    std::string b;
};
 
using namespace std;
 
int main(int argc, char *argv[])
{
    InputStruct input[] = {
        { 1, "one" },
        { 2, "two" },
    };
 
    std::cout << "std::for_each:" << std::endl;
 
    std::vector<TestStruct> va;
    va.reserve(std::end(input) - std::begin(input));
    std::for_each(std::begin(input), std::end(input),
                  [&va](auto const& p_in)
    {
        va.push_back(TestStruct{p_in.a, p_in.b});
    });
 
    std::cout << "\nrange-based-for:" << std::endl;
 
    std::vector<TestStruct> vb;
    va.reserve(std::end(input) - std::begin(input));
    for(auto const& p_in : input)
    {
        vb.push_back(TestStruct{p_in.a, p_in.b});
    }
 
    std::cout << "\nstd::transform:" << std::endl;
    std::vector<TestStruct> vc;
    vc.reserve(std::end(input) - std::begin(input));
    std::transform(std::begin(input), std::end(input), std::back_inserter(vc),
                  [](auto const& p_in)
    {
        return TestStruct{p_in.a, p_in.b};
    });
 
    std::cout << "\nrange-based-for + emplace_back:" << std::endl;
 
    std::vector<TestStruct> vd;
    vd.reserve(std::end(input) - std::begin(input));
    for(auto const& p_in : input)
    {
        vd.emplace_back(p_in.a, p_in.b);
    }
 
    std::cout << "\n\nDestruction of all items:" << std::endl;
}

The output is as follows:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
std::for_each:
>> Constructor: a=1, b=one
>> Move Constructor: a=1, b=one
>> Destructor: a=1, b=
>> Constructor: a=2, b=two
>> Move Constructor: a=2, b=two
>> Destructor: a=2, b=
 
range-based-for:
>> Constructor: a=1, b=one
>> Move Constructor: a=1, b=one
>> Destructor: a=1, b=
>> Constructor: a=2, b=two
>> Move Constructor: a=2, b=two
>> Destructor: a=2, b=
 
std::transform:
>> Constructor: a=1, b=one
>> Move Constructor: a=1, b=one
>> Destructor: a=1, b=
>> Constructor: a=2, b=two
>> Move Constructor: a=2, b=two
>> Destructor: a=2, b=
 
range-based-for + emplace_back:
>> Constructor: a=1, b=one
>> Constructor: a=2, b=two
 
 
Destruction of all items:
>> Destructor: a=1, b=one
>> Destructor: a=2, b=two
>> Destructor: a=1, b=one
>> Destructor: a=2, b=two
>> Destructor: a=1, b=one
>> Destructor: a=2, b=two
>> Destructor: a=1, b=one
>> Destructor: a=2, b=two

The output from the app reveals everything. The emplace_back version produces the smallest output, just as suspected, and all other loops does exactly the same thing.

What about POD?

Of course the previous version works smoothly only with classes that defines the constructor which can be used within emplace_back. When considering POD structs that doesn’t have one, the trick won’t work and won’t even compile. When adding POD object into vector it requires explicit use of the struct name when creating new objects that goes to the container.

In such case, in my opinion, the std::transform is the best way to express the intent.