C++: The Worst of Both Worlds

C++ is certainly one of the most complex programming languages around. A particularly gnarly aspect of C++ is its templates: as a late addition to the language, they had to fit within the existing syntax. On top of that, with the STL approach, the C++ language designers discovered that templates were far more powerful than initially thought. In other words, they accidentally created a very pure functional language. Unfortunately, this accident of birth also means that its functional syntax is barely usable. Even so, given its power and the need for advanced C++ techniques, the templating part of C++ keeps amassing ever more features and ever more power.

Spiria's Dev Team

2021-02-03 11:41


<div><p>In contrast, Python is often described as one of the easiest programming languages to learn and read. It is also very dynamic, allowing any data to be passed to any function very easily. Better yet, functions can be defined and re-defined at run-time. When you call a function, you are never sure what code will <i>really</i> be executed.</p><p>So, why not combine the two? Combine the complexity and obscure syntax of C++ templates with Python’s dynamic typing and dynamic extensibility? If this kind of revolting brew interests you, then I’ve got something for you!</p><h2>Now, Seriously…</h2><p>The real story is that I wanted to have overloaded function resolution at run-time instead of compile-time.</p><p>I wanted to create a dynamic function-dispatch system that could call a function with any data. I wanted this system to be dynamically extensible to allow new overloads of the function for new data types at any time. I also wanted to be able to call the functions with either concrete data or a bunch of <code>std::any</code>. Finally, I wanted all this to be reasonably efficient.</p><p>To achieve all these goals, I turned to templates. Not just simple ones though, but rather the more complex variadic templates.</p><h2>Variadic Template Syntax</h2><p>What’s a variadic template? Normal templates require a fixed number of types as arguments. That’s fine if you know in advance how many types you’ll need. Variadic templates, on the other hand, accept a variable number of types as arguments. It can be zero, one, two, or any number of types.</p><p>Besides receiving these types, templates must also be able to use them. As you may already know, templates are compile-time beasts. They need to work without modifying any data. Therefore, to manipulate a variable number of types, C++ needed a new syntax. 
The new syntax for both receiving the types and using them was created around the ellipsis: <code>...</code>.</p><p>The basic idea is that whenever the ellipsis is used, the C++ compiler knows it has to repeat the surrounding piece of code as many times as needed for each type. For example, the types of the variadic template are received with the ellipsis. In the following example, the <code>VARIA</code> template argument represents any number of types.</p><pre><code>template <class... VARIA>
struct example
{
    // template implementation.
};</code></pre><p>Later on, in the template implementation, the variadic type arguments can be used in the code with the ellipsis. For example, the variadic template above could have a function that receives arguments of the corresponding types and passes these values into a call to another function, like this:</p><pre><code>// Receive a variable number of arguments…
void foo(VARIA... function_arguments)
{
    // … and pass them on to another function.
    other_bar_function(function_arguments...);
}</code></pre><p>These examples only scratch the surface of what is possible with variadic templates, but they are sufficient for the purposes of this article.</p><h2>Dynamic Dispatch Design</h2><p>Before delving into the design of our dynamic dispatch, we need to outline our requirements more precisely. I said the dynamic dispatch should mimic the compile-time function overload of C++. What does that mean exactly? 
Well, here are the features of an ideal design:</p><ul> <li>The function itself is declared at compile-time and is referred to by its name, like a normal function.</li> <li>The number of arguments of a given function can vary.</li> <li>Each overload of a function can return a different type of value.</li> <li>Each such function can be overloaded for any type.</li> <li>New function overloads can be added dynamically, at run-time, for any type.</li></ul><p>While these requirements would be sufficient for our purposes, there were a few additional use cases I wanted to cover. The first was to support function arguments that always have the same type. For example, a text-streaming function would always receive a <code>std::ostream</code> argument. The second was to be able to select a function implementation without having to pass a value to the function. This would allow specifying the return type of the function or implementing functions that take no argument at all. I’ll be showing you an example of each of these cases later on.</p><p>To support these use cases, we added two features to the list:</p><ul> <li>Not all arguments have to play a part in the type-based selection of the function.</li> <li>Some additional types <i>can</i> play a part in the type-based selection of the function without being an argument.</li></ul><p>The result should look like a compile-time function overload. For example, here is what a call to the dynamic-dispatch <code>to_text</code> function looks like:</p><pre><code>std::wstring result = to_text(7);
// result == L"7"

std::any seven(7);
std::wstring result_from_any = to_text(seven);
// result_from_any == L"7"</code></pre><p>This apparent simplicity is supported by a lot of complex code behind the scenes.</p><h2>Smooth Operator</h2><p>Before demonstrating the implementation of the dynamic dispatch, we’ll show what it looks like from the viewpoint of the programmer creating a new operation. 
How do we create a new function?</p><p>To create a new operation called <code>foo</code>, declare a class to represent it. For the purposes of our example, we named it <code>foo_op_t</code>, derived from <code>op_t</code>. The <code>foo_op_t</code> class identifies the operation. It can be entirely empty. Afterward, we can write the <code>foo</code> function, the real entry-point for the operation. That is the function that the user of the <code>foo</code> operation will call. This function only needs to call <code>call<>::op()</code> (for concrete values) or <code>call_any<>::op()</code> (for <code>std::any</code> values). Both are inherited by <code>foo_op_t</code> from <code>op_t</code> and take care of the dynamic dispatch:</p><pre><code>struct foo_op_t : op_t<foo_op_t> { /* empty! */ };

// Entry point for std::any values.
inline std::any foo(const std::any& arg_a, const std::any& arg_b)
{
    return foo_op_t::call_any<>::op(arg_a, arg_b);
}

// Entry point for concrete values. RET comes first so that callers
// can name it while A and B are deduced, e.g. foo<float>(1, 2.0).
template<class RET, class A, class B>
inline RET foo(const A& arg_a, const B& arg_b)
{
    std::any result = foo_op_t::call<>::op(arg_a, arg_b);
    // Note: we could test if the std::any really contains
    // a RET, instead of blindly trusting it.
    return std::any_cast<RET>(result);
}</code></pre><p>Note that the base class of the new operation takes the operation itself as a template parameter. This is a well-known trick in template programming. In fact, it is so well-known that it even has a name: the curiously recurring template pattern (CRTP). In our case, this trick is used so that the <code>op_t</code> can refer to the specific operation being used.</p><p>Now, we can create overloads of the <code>foo</code> operation. This is done by calling <code>make<>::op</code> with a function that implements the overload. To create an overload that takes types <code>A</code> and <code>B</code> and returns the type <code>RET</code>, we call <code>make<>::op<RET, A, B></code>. This registers the overload in the <code>foo_op_t</code> class. 
As an example, let’s implement our <code>foo</code> operation for the types <code>int</code> and <code>double</code> and make it return a <code>float</code>:</p><pre><code>// Some code in your program that implements the operation.
float foo_for_int_and_double(int i, double d)
{
    return float(i + d);
}

// Registration!
foo_op_t::make<>::op<float, int, double>(foo_for_int_and_double);</code></pre><p>Of course, we could make the code shorter by writing the implementation right there in the call to <code>make<>::op</code>, with a lambda:</p><pre><code>foo_op_t::make<>::op<float, int, double>(
    [](int i, double d) -> float
    {
        return float(i + d);
    });</code></pre><p>In case you were wondering why <code>call<></code> and <code>make<></code> are written with empty angle brackets, it’s because they are themselves variadic templates. The optional template arguments are the extra selection types used to choose a more specific overload based on types that are not passed as an argument to the <code>foo</code> operation. We will explain this in greater detail later.</p><p>Now, we are finally ready to get into the meat of the subject: implementing the dynamic function dispatch.</p><h2>Enter Selector</h2><p>The first problem to tackle is how each overload is identified within a function family. The obvious solution is to identify it by its types, that is, by its argument types plus its extra selection types. C++ provides <code>std::type_info</code> and <code>std::type_index</code> to identify a type. What we need is a <code>tuple</code> of <code>type_index</code> values. We achieve that with a pair of templates: the type converter and the selector.</p><p>The type converter maps any type to <code>std::type_index</code>. It is a very idiomatic trick in template programming, where each step in an algorithm is implemented in a type so that it can be executed at compile-time. 
Below is the converter, converting any type <code>A</code> into a <code>type_index</code> or <code>std::any</code>:</p><pre><code>template <class A>
struct type_converter_t
{
    using type_index = std::type_index;
    using any = std::any;
};</code></pre><p>The full type selector can then be written as a variadic template by applying the converter to all types given as arguments and declaring a <code>tuple</code> type named <code>selector_t</code> with the result. It uses both the function’s argument types, <code>N_ARY</code>, and the extra selection types, <code>EXTRA_SELECTORS</code>, to create the full selector.</p><pre><code>template <class... EXTRA_SELECTORS>
struct op_selector_t
{
    template <class... N_ARY>
    struct n_ary_t
    {
        // The selector_t type is a tuple of type_index.
        using selector_t = std::tuple<
            typename type_converter_t<EXTRA_SELECTORS>::type_index...,
            typename type_converter_t<N_ARY>::type_index...>;
    };
};</code></pre><p>Note how the ellipsis is applied to the line:</p><pre><code>typename type_converter_t<EXTRA_SELECTORS>::type_index...</code></pre><p>How the C++ language applies the ellipsis is part of the black magic of variadic templates. The rule of thumb is that the whole pattern to the left of the ellipsis is repeated once per type in the pack, which here yields one <code>type_index</code> per extra selector. Even so, you will sometimes have to go by trial and error to see what works and what doesn’t.</p><p>Now we have a selector type, but how do we use it? To do this, we provide a few functions. The goal is to have a function that creates a selector pre-filled with concrete types. Naturally, we call our function <code>make</code>:</p><pre><code>template <class... EXTRA_SELECTORS>
struct op_selector_t
{
    template <class... N_ARY>
    struct n_ary_t
    {
        // Builds the selector from the enclosing type packs,
        // so it needs neither arguments nor template parameters.
        static selector_t make()
        {
            return selector_t(
                std::type_index(typeid(EXTRA_SELECTORS))...,
                std::type_index(typeid(N_ARY))...);
        }
    };
};</code></pre><p>Since we want to support calls with <code>std::any</code>, we need to provide a <code>make_any</code> function with <code>std::any</code> as input. 
(For optimization purposes, a version with the extra selector already converted to <code>type_index</code> is provided and named <code>make_extra_any</code>, but it is not shown here.)</p><pre><code>static selector_t make_any(const typename type_converter_t<N_ARY>::any&... args)
{
    return selector_t(
        std::type_index(typeid(EXTRA_SELECTORS))...,
        std::type_index(args.type())...);
}</code></pre><h2>Diving into Delivery</h2><p>We can finally dive into the mechanical details of the registration and calling of the operations. The operation base class is declared as a template taking the operation itself and a list of optional unchanging extra arguments, <code>EXTRA_ARGS</code>, which have fixed types. (Remember our earlier streaming operation example, which always received a <code>std::ostream</code>.)</p><pre><code>template <class OP, class... EXTRA_ARGS>
struct op_t
{
    // Internal details will come next...
};</code></pre><p>Let’s first show a few types that are used repeatedly: the selector class, <code>op_sel_t</code>, the selector tuple, <code>selector_t</code> and the internal function signature of the operation, <code>op_func_t</code>.</p><pre><code>using op_sel_t = typename op_selector_t<EXTRA_SELECTORS...>::template n_ary_t<N_ARY...>;
using selector_t = typename op_sel_t::selector_t;
using op_func_t = std::function<std::any(EXTRA_ARGS ..., typename type_converter_t<N_ARY>::any...)>;</code></pre><p>This illustrates some of the inherent complexity of template programming. Many of its parts would normally be totally unnecessary, but are nevertheless required due to the internal workings of templates. For example, <code>typename</code> is necessary to tell the compiler that what follows really is a type. This happens when a template refers to elements of another template. The C++ syntax is too ambiguous to let the compiler infer that we are using a type. 
Another very peculiar aspect is the extra <code>template</code> keyword right before accessing <code>n_ary_t</code>: it is needed to let the compiler know that it really is a template.</p><p>We’re now ready to explain the whole system, which is put together with just a few functions:</p><ul> <li>A public way to call the operation: <code>call<>::op</code></li> <li>A public way to make a new overload: <code>make<>::op</code></li> <li>A private way to look up the registered overloads: <code>get_ops</code></li></ul><p>We will tackle each in reverse order, from the lowest implementation details up to the final operation: calling an overload.</p><h2>Keeper of Wonders</h2><p>The lowest implementation detail is the function that holds the available, pre-registered overloads. There is a very important reason why <code>get_ops</code> needs to exist: the overloads need to be kept in a container, while the operation base class is a template. We cannot keep all overloads for all operations together. Fortunately, the C++ language specifies that a static variable contained in a function in a template is specific to each instantiation of the template. This lets us hide the registration location within. The <code>get_ops</code> function safely holds our list of overloads:</p><pre><code>template <class SELECTOR, class OP_FUNC>
static std::map<SELECTOR, OP_FUNC>& get_ops()
{
    static std::map<SELECTOR, OP_FUNC> ops;
    return ops;
}</code></pre><p>The fact that it is templated over <code>SELECTOR</code> and <code>OP_FUNC</code> allows the operation to be overloaded for any number of arguments.</p><h2>Making Up Your Op</h2><p>The <code>make<>::op</code> function is a template that takes a concrete overload for concrete types, wraps it into the internal function signature and registers it. The wrapping takes care of converting the <code>std::any</code> arguments into concrete types. 
This is safe, since the concrete overload for these concrete types is only ever called when the types match. This is where the optional extra selection types may be given as the <code>EXTRA_SELECTORS</code> template arguments.</p><pre><code>template <class... EXTRA_SELECTORS>
struct make
{
    template <class RET, class... N_ARY>
    static void op(
        std::function<RET(EXTRA_ARGS... extra_args, N_ARY... args)> a_func)
    {
        // Wrapper kept as a lambda mapping the internal
        // function signature to the concrete function signature.
        op_func_t op(
            [a_func](
                EXTRA_ARGS... extra_args,
                const typename type_converter_t<N_ARY>::any&... args) -> std::any
            {
                // Conversion to concrete argument types.
                return std::any(a_func(extra_args..., *std::any_cast<N_ARY>(&args)...));
            });

        // Registration.
        auto& ops = get_ops<selector_t, op_func_t>();
        ops[op_sel_t::make()] = op;
    }
};</code></pre><h2>Call Me Up, Call My Op</h2><p>We finally reach the function used to dispatch a call. There are three versions of the function. They differ in whether the arguments have already been converted to <code>std::any</code> or <code>std::type_index</code>. The <code>call<>::op</code> function needs to do a few things:</p><ul> <li>Create a selector from the types of its arguments, plus the optional extra selectors.</li> <li>Retrieve the list of available overloads.</li> <li>Look up the function overload using the selector.</li> <li>Return an empty value if no overload matches the arguments.</li> <li>Call the function if an overload matches the arguments.</li></ul><pre><code>template <class... EXTRA_SELECTORS>
struct call
{
    template <class... N_ARY>
    static std::any op(EXTRA_ARGS... extra_args, N_ARY... args)
    {
        // The available overloads.
        const auto& ops = get_ops<selector_t, op_func_t>();

        // Try to find a matching overload.
        const auto pos = ops.find(op_sel_t::make());

        // Return an empty result if no overload matches.
        if (pos == ops.end())
            return std::any();

        // Call the matching overload.
        return pos->second(extra_args..., args...);
    }
};</code></pre><h2>Wrapping Up</h2><p>This completes the description of the dynamic dispatch design and its implementation. The source code repo contains multiple examples of operations with a complete suite of tests.</p><p>The examples of operations are:</p><ul> <li><code>compare</code>, a binary operation to compare two values.</li> <li><code>convert</code>, a unary operation to convert a value to another type. This is an example of an operation with an extra selector argument, the final type of the conversion.</li> <li><code>is_compatible</code>, a nullary operation that takes two extra selection types to verify if one can be converted to the other.</li> <li><code>size</code>, a unary operation that returns the number of elements in a container, or zero if no overload was found.</li> <li><code>stream</code>, a unary operation to write a value to a text stream. This is an example of an operation with an extra unchanging argument, the destination <code>std::ostream</code>.</li> <li><code>to_text</code>, a unary operation that converts a value to text.</li></ul><p>The whole code base is found in the <code>any_op</code> library that is part of my <a href="https://github.com/pierrebai/dak_utility">dak_utility repo</a>.</p></div>
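
To see the pieces working together, here is a minimal, self-contained sketch of the same idea. It is not the any_op library: mini_op_t, conv_t, add_op_t and demo are hypothetical names made up for this illustration, and the sketch leaves out the extra arguments and extra selectors described above. It assumes a C++17 compiler. One deliberate difference from the article's code: registration takes the callable as a deduced template parameter rather than as a std::function, so that a lambda can be passed next to the explicitly listed types.

```cpp
#include <any>
#include <cassert>
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <tuple>
#include <typeindex>
#include <typeinfo>

// Maps any type to std::type_index or std::any, so a pack can be expanded.
template <class A>
struct conv_t
{
    using index = std::type_index;
    using any = std::any;
};

// Hypothetical, simplified stand-in for op_t: dispatch on argument types only.
template <class OP>
struct mini_op_t
{
    template <class... N_ARY>
    using selector_t = std::tuple<typename conv_t<N_ARY>::index...>;

    template <class... N_ARY>
    using op_func_t = std::function<std::any(const typename conv_t<N_ARY>::any&...)>;

    // One overload table per operation and per arity, in a function-local static.
    template <class SELECTOR, class OP_FUNC>
    static std::map<SELECTOR, OP_FUNC>& get_ops()
    {
        static std::map<SELECTOR, OP_FUNC> ops;
        return ops;
    }

    // Register an overload: RET and N_ARY are given explicitly, FUNC is deduced.
    template <class RET, class... N_ARY, class FUNC>
    static void make(FUNC func)
    {
        op_func_t<N_ARY...> wrapper(
            [func](const typename conv_t<N_ARY>::any&... args) -> std::any
            {
                // Only reached when the selector matched these exact types.
                return std::any(static_cast<RET>(func(*std::any_cast<N_ARY>(&args)...)));
            });
        auto& ops = get_ops<selector_t<N_ARY...>, op_func_t<N_ARY...>>();
        ops[selector_t<N_ARY...>(std::type_index(typeid(N_ARY))...)] = wrapper;
    }

    // Call with concrete values; returns an empty std::any if nothing matches.
    template <class... N_ARY>
    static std::any call(const N_ARY&... args)
    {
        using sel = selector_t<N_ARY...>;
        const auto& ops = get_ops<sel, op_func_t<N_ARY...>>();
        const auto pos = ops.find(sel(std::type_index(typeid(N_ARY))...));
        if (pos == ops.end())
            return std::any();
        return pos->second(std::any(args)...);
    }

    // Call with type-erased values; every ANY is expected to be std::any.
    template <class... ANY>
    static std::any call_any(const ANY&... args)
    {
        using sel = selector_t<ANY...>;
        const auto& ops = get_ops<sel, op_func_t<ANY...>>();
        const auto pos = ops.find(sel(std::type_index(args.type())...));
        if (pos == ops.end())
            return std::any();
        return pos->second(args...);
    }
};

// The operation itself is an empty CRTP class, as in the article.
struct add_op_t : mini_op_t<add_op_t> {};

inline std::string demo()
{
    // Run-time registration of an overload for (int, double).
    add_op_t::make<double, int, double>([](int i, double d) { return i + d; });

    const std::any concrete = add_op_t::call(2, 1.5);
    const std::any erased = add_op_t::call_any(std::any(2), std::any(1.5));
    const std::any missing = add_op_t::call(2, 3); // no (int, int) overload
    assert(!missing.has_value());

    std::ostringstream out;
    out << std::any_cast<double>(concrete) << ' ' << std::any_cast<double>(erased);
    return out.str();
}
```

Calling demo() registers one overload for (int, double), dispatches to it once with concrete values and once with std::any values, and checks that a call with (int, int) finds no overload and comes back as an empty std::any. Because the selector and function types depend only on the number of arguments, all two-argument overloads of add_op_t share a single map.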
