“Prefer Algorithms to Hand-Written Loops”

The title of this post refers to an oft-repeated C++ dictum — one that I’ve resisted for most of my C++ career. I tried doing things that way, several times, but in the end I always went back to hand-written loops. That changed recently; read on to see why, and more importantly, how.

In C, you used arrays and index-loops, like this (pardon me if some C++ features sneak into my C code, I’ve been using C++ far longer than I ever used plain C):

int v[5]={ 18, 42, 63, 4, 6 }, sum=0;
for (int i=0; i<5; ++i) sum+=v[i];

Nice and straightforward, very readable (if you know C, at least).

But in C++, you’re supposed to use one of the STL’s container classes, and avoid arrays like the plague. There are enough good reasons for that that I don’t quibble about it, so let’s rewrite the above using an std::vector container:

int v_init[5]={ 18, 42, 63, 4, 6 }, sum=0;
std::vector v(v_init, v_init+5);
for (int i=0; i<5; ++i) sum+=v[i];

It’s just as straightforward and readable, but a little ugly because you can’t initialize a vector to a list of constants directly. The result is that we still have to have the array (it’s just not being used for the work), but that’s only necessary here because this is a contrived example. In most code, you don’t need to initialize the vector to a set of constant values, so the problem doesn’t come up.

But the above code isn’t considered a very efficient or proper use of the C++ vector container. Instead of using subscripts in the loop, you’re supposed to use iterators. That makes perfect sense once you consider something like std::set, which can’t be accessed by index, so let’s try it:

int v_init[5]={ 18, 42, 63, 4, 6 }, sum=0;
std::vector v(v_init, v_init+5);
for (std::vector::iterator i=v.begin(); i!=v.end(); ++i) sum+=*i;

Hm… aesthetically displeasing, as well as more typing. A typedef can take care of a lot of that though:

int v_init[5]={ 18, 42, 63, 4, 6 }, sum=0;
typedef std::vector VType;
VType v(v_init, v_init+5);
for (VType::iterator i=v.begin(); i!=v.end(); ++i) sum+=*i;

Not too bad. You’ve got the advantages of the STL’s tried-and-true containers, and not much extra typing. You could easily replace the std::vector with a similar class, such as std::list, and the code will still work. And you now have easy access to a number of STL algorithms, already written and debugged for you.

But that isn’t considered good C++ either, because it’s using a hand-written loop. Let’s replace that with the accepted C++ method of doing the same thing:

int v_init[5]={ 18, 42, 63, 4, 6 }, sum=0;
typedef std::vector VType;
VType v(v_init, v_init+5);
sum=std::accumulate(v.begin(), v.end(), sum);

Now it could go into a C++ textbook. And most of the STL examples that you see are exactly like that, textbook examples that are very simple and straightforward. It makes the STL way of doing things very attractive… until you try it on real-world problems, and discover that you almost always need something a lot more complex:

static std::vector filterForFirstLetter(const std::vector<:string>& source,
    unsigned int keyText)
{
    unsigned int lower=_tolower(keyText), upper=_toupper(keyText);
    std::vector r;
    for (std::vector<:string>::const_iterator i=source.begin(),
        ie=source.end(); i!=ie; ++i)
    {
        unsigned int firstLetter=(*i)[0];
        if (firstLetter==lower || firstLetter==upper) r.push_back(i-source.begin());
    };
    return r;
};

(Apologies again for the formatting, my blog software doesn’t show the leading spaces or tabs.)

(UPDATED: I figured out how to change the formatting — a combination of the code-markup plugin to handle escaping code characters, and the “pre” tag to preserve the formatting. The code in this post should look fine now.)

How exactly would you convert that loop to an algorithm call? It can be done, but it pretty much demands a “functor” — a specially-written helper class with an operator() member function that will make it act like a function. That would at least triple the line-count, as well as making the code a lot less readable… that’s not my idea of an improvement.

Okay, so at least occasionally a hand-written loop is a lot more desirable from a readability perspective. It seemed to me that it was almost always more desirable for many real-world problems, even if it required more typing, because I absolutely refuse to write a functor for each and every niggling little problem I want to solve with a loop over a container. But on Saturday, mentally exhausted after researching a difficult new feature for Project X, I decided to delve into STL algorithms again, to try to discover exactly what I was missing. (Why? Because it’s fun. 🙂 ).

I’ve used the boost::bind class before, to create function adapters for various purposes. It helps immensely when you’ve already got a function that you want to apply to every item in a container, but which doesn’t have the proper signature for the algorithm you want to apply to it — maybe it takes additional parameters, or parameters in a different order. But when you only want to do something simple, and don’t want to create a function or functor to do it?

That’s when I remembered the boost::lambda class. I’d run across it a few times before, but didn’t really see a reason for it then. This proved to be the reason.

Using boost::lambda, I could create these small functions in place, and do nearly anything I needed very simply. This…

template 
class toSharedPtrFunctor {
    public:
    toSharedPtrFunctor(std::vector<:shared_ptr> > &target):
        mTarget(target) { };
    void operator()(T* p) { mTarget.push_back(boost::shared_ptr(p)); };

    private:
    std::vector<:shared_ptr> > &mTarget;
};

template 
static std::vector<:shared_ptr> > toSharedPtr(const std::vector& source) {
    std::vector<:shared_ptr> > target;
    toSharedPtrFunctor f(target);
    std::for_each(source.begin(), source.end(), f);
    return target;
};

…could be expressed much more simply as this…

template 
std::vector<:shared_ptr> > toSharedPtr(const std::vector& source) {
    typedef boost::shared_ptr rtype;
    std::vector r;
    std::transform(source.begin(), source.end(), std::back_inserter(r),
        boost::lambda::constructor());
    return r;
};

(It had required the functor previously, because GCC4 gave an apparently-spurious error when I tried creating the boost::shared_ptr directly.)

The last problem I had was with boost::bind. For some reason, the Boost library puts the bind function and the placeholders (_1, _2, and _3) into the current namespace, rather than the boost namespace. That saves you the need to add a using directive to get easy access to them, but if you want to use the boost::lambda library’s bind functions too, the compiler gets confused and you have to explicitly specify which bind function or placeholder you want to use. Having to write boost::lambda::_1 instead of just _1 every time severely impacts the readability of your code. Fortunately, boost::lambda‘s bind function is pretty much a drop-in replacement for boost::bind, so it’s easy enough to get rid of boost::bind, and the problem’s solved.

That’s one more arrow in my code-writing quiver. 🙂